I am working with the R programming language.
I created some data:
    #PART 1
    #create data
    library(dplyr)
    library(caret)

    set.seed(123)

    salary <- rnorm(1000,5,5)
    height <- rnorm(1000,2,2)
    my_data = data.frame(salary, height)

    #PART 2
    #create train and test data
    train<-sample_frac(my_data, 0.7)
    sid<-as.numeric(rownames(train)) # because rownames() returns character
    test<-my_data[-sid,]

    #PART 3
    salary_quantiles = data.frame( train %>% summarise (quant_1 = quantile(salary, 0.33),
                                                        quant_2 = quantile(salary, 0.66),
                                                        quant_3 = quantile(salary, 0.99)))

    > salary_quantiles
       quant_1  quant_2  quant_3
    1 3.005188 6.952076 16.98823
Question: Now I am trying to write an if statement that takes the quantiles (3.005188, 6.952076, 16.98823) and uses them as the cut-off values. I did this manually by typing the numbers in:
    #PART 4
    train$salary_type = as.factor(ifelse(train$salary < 3.005188, "A",
                                  ifelse(train$salary > 3.005188 & train$salary < 6.952076, "B", "C")))
Does anyone know if there is a way to do this without writing these numbers explicitly? For example:
    train$salary_type = as.factor(ifelse(train$salary < salary_quantiles$quant_1, "A",
                                  ifelse(train$salary > salary_quantiles$quant_1 & train$salary < salary_quantiles$quant_2, "B", "C")))
Is this possible to do in R?
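For illustration, here is a minimal sketch of one way the thresholds could be taken from the data rather than typed in, using base R's `quantile()` and `cut()`. It makes some assumptions: `train` here is a simplified stand-in built from the simulated salaries (not the 70% sample above), and `breaks` is a name introduced only for this example. Note also that `cut()` uses right-closed intervals by default, so a value exactly equal to a quantile falls into the lower bin, which differs slightly from the nested `ifelse()` above.

```r
# Stand-in data (assumption: simplified version of the training set above)
set.seed(123)
salary <- rnorm(1000, 5, 5)
train <- data.frame(salary)

# Compute the 33% and 66% cut-offs from the data instead of hard-coding them
breaks <- quantile(train$salary, probs = c(0.33, 0.66))

# cut() assigns each salary to a bin; -Inf and Inf leave the outer bins open-ended
train$salary_type <- cut(
  train$salary,
  breaks = c(-Inf, breaks, Inf),
  labels = c("A", "B", "C")
)

# Count how many rows landed in each category
table(train$salary_type)
```

The result is already a factor, so no `as.factor()` wrapper is needed in this variant.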