Directly "Feed" Results into an IF STATEMENT

I am working with the R programming language.

I created some data:


#PART 1
#create data

library(dplyr)
library(caret)
set.seed(123)



salary <- rnorm(1000,5,5)
height <- rnorm(1000,2,2)
my_data = data.frame(salary, height)


#PART 2
#create train and test data


train<-sample_frac(my_data, 0.7)
sid<-as.numeric(rownames(train)) # because rownames() returns character
test<-my_data[-sid,]

#PART 3
salary_quantiles = data.frame( train %>% summarise (quant_1 = quantile(salary, 0.33),
quant_2 = quantile(salary, 0.66),
quant_3 = quantile(salary, 0.99)))



> salary_quantiles
   quant_1  quant_2  quant_3
1 3.005188 6.952076 16.98823

Question: I am trying to write an IF STATEMENT that uses these quantiles (3.005188, 6.952076, 16.98823) as its thresholds. For now I have typed them in manually:

#PART 4
train$salary_type = as.factor(ifelse(train$salary < 3.005188, "A", ifelse( train$salary  > 3.005188 & train$salary < 6.952076, "B", "C")))

Does anyone know if there is a way to do this without writing these numbers explicitly? For example:

train$salary_type = as.factor(ifelse(train$salary < salary_quantiles$quant_1 , "A", ifelse( train$salary > salary_quantiles$quant_1 & train$salary < salary_quantiles$quant_2, "B", "C")))

Is this possible to do in R?

Thanks!

Perhaps this?

salary_quantiles[[1]]


train$salary_type = as.factor(ifelse(train$salary < salary_quantiles[[1]], "A",
                                     ifelse( train$salary  > salary_quantiles[[1]] & train$salary < salary_quantiles[[2]], "B", "C")))

You could also use case_when() instead of ifelse in the second bit of code.
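
In case it helps, here is a minimal sketch of the case_when() version. It is self-contained, so it builds a small stand-in `train` data frame rather than reusing the one from the question, and the name `q` is just an illustrative choice. One thing to be aware of: because the conditions are checked in order, a salary exactly equal to the first quantile lands in "B" here, whereas the nested ifelse() above sends it to "C".

```r
library(dplyr)

set.seed(123)
# Stand-in for the training data from the question (assumption for this sketch)
train <- data.frame(salary = rnorm(700, 5, 5))

# Thresholds are computed from the data rather than typed in by hand
q <- quantile(train$salary, probs = c(0.33, 0.66))

train <- train %>%
  mutate(salary_type = factor(case_when(
    salary < q[[1]] ~ "A",   # below the 33% quantile
    salary < q[[2]] ~ "B",   # between the 33% and 66% quantiles
    TRUE            ~ "C"    # everything else
  )))

table(train$salary_type)
```

Since case_when() stops at the first condition that is TRUE, the second line does not need to repeat the lower bound.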


Thank you for your answer! I tried similar logic on the following kind of problem. Suppose you have this data set:

head(test)
       salary     height
701  1.358904  1.6148796
702 -2.702212  1.0604070
703  1.534527 -4.0957218
704  5.594247  5.7373110
705 -1.823547  5.5808484
706  7.949913 -0.2021635


test$salary_type = as.factor(ifelse(test$salary < salary_quantiles$quant_1 , "A", ifelse( test$salary  >  salary_quantiles$quant_1  & test$salary < salary_quantiles$quant_2, "B", "C")))

But then this does not work:

 test$height_pred = as.factor(ifelse(test$salary_type == "A", height_quanitles[[1]], ifelse(test$salary_type == "B", height_quanitles[[2]], height_quanitles[[3]])))

Error in .subset2(x, i, exact = exact) : subscript out of bounds

Do you know why this returns an error?

Thanks!

  • Missing a ) at the end.
  • quantiles is spelt incorrectly (height_quanitles).
  • test$salary_type does not exist.
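
For what it's worth, "subscript out of bounds" from `.subset2` is the message R gives when `[[ ]]` asks a data frame for a column position it does not have, so it may also be worth checking how many entries the height quantile object actually has. The thread never shows how it was built, so the sketch below is only an assumption: it creates a hypothetical `height_quantiles` with three entries from stand-in data and then fills `height_pred` from it.

```r
set.seed(123)
# Stand-in train/test data (assumption; not the split used in the question)
train <- data.frame(salary = rnorm(700, 5, 5), height = rnorm(700, 2, 2))
test  <- data.frame(salary = rnorm(300, 5, 5), height = rnorm(300, 2, 2))

salary_quantiles <- quantile(train$salary, probs = c(0.33, 0.66))

# Hypothetical: three height quantiles, so that [[1]], [[2]] and [[3]] all exist
height_quantiles <- quantile(train$height, probs = c(0.33, 0.66, 0.99))

# Create salary_type on the test set first; the next step depends on it
test$salary_type <- factor(
  ifelse(test$salary < salary_quantiles[[1]], "A",
  ifelse(test$salary < salary_quantiles[[2]], "B", "C")))

# Left numeric here; wrap in as.factor() if a factor is wanted
test$height_pred <-
  ifelse(test$salary_type == "A", height_quantiles[[1]],
  ifelse(test$salary_type == "B", height_quantiles[[2]],
                                  height_quantiles[[3]]))

head(test)
```

Running `length(height_quantiles)` (or `ncol()` if it is a data frame) before indexing is a quick way to confirm that position 3 exists.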