Among a subset of columns, mutate only those which are characters

Hi,

I have a large data frame and I want to modify only the variables in a certain subset. Of the variables in that subset, I want to convert to factor only those of type character. All the columns not in the subset must remain untouched. Here is an example:

library(dplyr)
library(tibble)
library(magrittr)

# these lines create the example dataframe
new_mtcars <- rownames_to_column(mtcars)
new_mtcars %<>% mutate_at(vars(cyl, disp, hp, vs, am, gear, carb), as.character)

Now I have a data frame with columns of type character or numeric. Suppose this is the subset of variables which can be modified:

variables_subset <- c("drat", "wt", "qsec", "vs", "am", "gear", "carb")

Of these variables, only those of type character must be converted to factor. I cannot use select, because it would drop all the columns not in variables_subset, and I want to keep them. So I wrote:

new_mtcars[, variables_subset ] %<>% mutate_if(is.character, is.factor)

But the result is not what I was looking for!

Observations: 32
Variables: 12
$ rowname <chr> "Mazda RX4", "Mazda RX4 Wag", "Datsun 710", "Hornet 4 Drive...
$ mpg     <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2,...
$ cyl     <chr> "6", "6", "4", "6", "8", "6", "8", "4", "4", "6", "6", "8",...
$ disp    <chr> "160", "160", "108", "258", "360", "225", "360", "146.7", "...
$ hp      <chr> "110", "110", "93", "110", "175", "105", "245", "62", "95",...
$ drat    <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92,...
$ wt      <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.1...
$ qsec    <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22....
$ vs      <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FAL...
$ am      <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FAL...
$ gear    <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FAL...
$ carb    <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FAL...

Instead of type factor, the variables are of type logical. What am I doing wrong?

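You are using is.factor, which only tests whether a column is a factor. It returns FALSE here, because each selected column is a character vector, and mutate_if then fills the column with that FALSE. You want a function that converts, such as factor or as.factor, instead.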
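To see the difference in isolation, here is a minimal sketch on a toy character vector (`x` and `x_fct` are made-up names for illustration, not columns from the example above). It only shows that `is.factor` is a test returning a single logical, while `as.factor` performs the conversion.

```r
# Toy character vector, purely illustrative
x <- c("4", "6", "8", "6")

is.factor(x)            # a test: FALSE, because x is a character vector
x_fct <- as.factor(x)   # a conversion: returns a factor built from x
is.factor(x_fct)        # TRUE
levels(x_fct)           # "4" "6" "8"
```

Inside mutate_if, that single FALSE is recycled over every row of each character column, which matches the lgl columns in the output above.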

Ouch! So obvious... I don't even know why I used is.factor. I probably shouldn't code on a Friday evening :) Thanks!

PS: OK, now I understand what I was trying to do:

new_mtcars[, variables_subset ] %<>% mutate_if(is.character, as.factor)

but for some reason I went on autopilot and used is.factor instead of as.factor!
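For anyone reading this later: the same "subset of columns, but only the character ones" idea can probably also be written without subsetting the data frame first. This is a sketch, assuming dplyr 1.0.0 or later (for `across()` and `where()`); `converted` is just a name chosen here for the result.

```r
library(dplyr)   # assumption: dplyr >= 1.0.0, which provides across() and where()
library(tibble)

# Rebuild the example data frame: some columns become character
new_mtcars <- rownames_to_column(mtcars) %>%
  mutate(across(c(cyl, disp, hp, vs, am, gear, carb), as.character))

variables_subset <- c("drat", "wt", "qsec", "vs", "am", "gear", "carb")

# Convert only columns that are BOTH in the subset AND of type character;
# every other column is carried through unchanged
converted <- new_mtcars %>%
  mutate(across(all_of(variables_subset) & where(is.character), as.factor))

# Inspect the classes of the columns in the subset
sapply(converted[variables_subset], class)
```

Here `all_of()` selects the columns named in the character vector and `&` intersects that selection with the character columns, so cyl, disp and hp (character, but outside the subset) stay as they are.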
