Recoding multiple columns in one shot


#1

Hi,

I want to recode multiple fields (e.g. gender, marital status, education) to numeric codes, e.g. male = 1, female = 2, and so on.

convert <-function(x,y,z)
{
if(x$y==z)
{x$y=as.factor(1)}

else{
x$y=as.factor(2)
}

}
train$Gender<-convert(train,Gender,"Male")

But this throws the error "argument is of length zero", which I believe is because the Gender column is character. Can someone help out?

Thanks in advance.


#2

The function has a few issues:

  • It needs to return something at the end (probably x). See ?return.
  • Unfortunately, x$y does not work when y is passed in as an argument, because $ looks for a column literally named y. Instead, x[[y]] <- ... should work, where y is the column name as a quoted string.
  • if and else are for control flow (determining which pieces of code get run), while you want this to apply in a vectorized way to each element of the column.
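
Putting those three points together, a minimal base R sketch might look like the following. The argument names (df, col, first_level) and the toy train data are made up for illustration, and ifelse() is just one vectorized option:

```r
# Hypothetical rewrite of convert(); `col` is a column name passed as a string.
convert <- function(df, col, first_level) {
  # df[[col]] looks the column up by name; df$col would look for a column called "col".
  # ifelse() is vectorized, so every element of the column is tested.
  df[[col]] <- factor(ifelse(df[[col]] == first_level, 1, 2))
  df  # return the modified data frame
}

# Toy data standing in for the poster's `train`
train <- data.frame(
  Gender  = c("Male", "Female", "Male"),
  Married = c("Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)

# Assign the whole data frame back, not just one column
train <- convert(train, "Gender", "Male")
train <- convert(train, "Married", "Yes")
```

Note that the result is assigned to train rather than train$Gender, since the function returns the whole data frame.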

I’d suggest approaching this with a lookup table and a left join. The tidyverse way is shown here:
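
As a minimal sketch of the lookup-table idea, assuming dplyr is installed and using toy data in place of train (the lookup tables and the *_code column names are made up for illustration):

```r
library(dplyr)

# Toy data standing in for the poster's `train`
train <- data.frame(
  Gender  = c("Male", "Female", "Male"),
  Married = c("Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)

# One small lookup table per column: original value -> numeric code
gender_lookup <- data.frame(
  Gender      = c("Male", "Female"),
  Gender_code = c(1L, 2L),
  stringsAsFactors = FALSE
)
married_lookup <- data.frame(
  Married      = c("Yes", "No"),
  Married_code = c(1L, 2L),
  stringsAsFactors = FALSE
)

# left_join() keeps every row of train and attaches the matching code;
# values missing from the lookup table get NA.
train_coded <- train %>%
  left_join(gender_lookup, by = "Gender") %>%
  left_join(married_lookup, by = "Married")
```

One advantage of this design is that the mapping lives in data rather than in code, so adding a new category only means adding a row to the lookup table.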

This also looks related, using case_when as a substitute for if/else:
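
A rough sketch of the case_when approach, again assuming dplyr and toy data (the Married column and its values are hypothetical):

```r
library(dplyr)

# Toy data standing in for the poster's `train`
train <- data.frame(
  Gender  = c("Male", "Female", "Male"),
  Married = c("Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)

# case_when() is evaluated element by element, so it behaves like a
# vectorized if/else chain; values that match no condition become NA.
train_coded <- train %>%
  mutate(
    Gender = case_when(
      Gender == "Male"   ~ 1,
      Gender == "Female" ~ 2
    ),
    Married = case_when(
      Married == "Yes" ~ 1,
      Married == "No"  ~ 2
    )
  )
```

Several columns can be recoded inside a single mutate() call, which covers the "in one shot" part of the question.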