Convert a factor into columns (data.frame)

I am looking for a way to turn each level of a factor into its own column, filled with the values 0 and 1.

Example:

df <- data.frame(gender = c("m", "m", "m", "m", "f")); df
  gender
1      m
2      m
3      m
4      m
5      f

should result in:

df_neu<-data.frame(m=c(1,1,1,1,0),f=c(0,0,0,0,1));df_neu
  m f
1 1 0
2 1 0
3 1 0
4 1 0
5 0 1

Does anyone have a solution to automate this? If possible, without a loop.

Thanks!

library(tidyverse)

df <- data.frame(gender=c("m","m","m","m","f"))

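df <- df %>% 
  # one 0/1 indicator column per level
  mutate(m = ifelse(gender == "m", 1, 0),
         f = ifelse(gender == "f", 1, 0)) %>% 
  # drop the original column
  select(-gender) %>% 
  # convert the indicators to factors (across() replaces the superseded mutate_all())
  mutate(across(everything(), as.factor))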

glimpse(df)
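
The `ifelse()` calls above name each level by hand. As a rough sketch of how the same idea could be automated in base R without an explicit loop, `outer()` can compare every row against every level in one step. The names `gender`, `lv`, `ind` and `df_new` are only illustrative, and this variant returns integer columns rather than factors:

```r
df <- data.frame(gender = c("m", "m", "m", "m", "f"))

# Levels in order of first appearance; as.character() guards against factor input.
gender <- as.character(df$gender)
lv <- unique(gender)

# outer() compares every row with every level at once: an n x k logical matrix.
ind <- outer(gender, lv, "==")

# Store TRUE/FALSE as 1/0 and label the columns with the level names.
storage.mode(ind) <- "integer"
colnames(ind) <- lv

df_new <- as.data.frame(ind)
df_new
```

Because the levels are taken from the data, the same code should work unchanged if more levels appear.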

Here's a way that works with more than 2 genders without changing anything:

library(tidyverse)
df<-data.frame(gender=c("m","m","m","m","f", "d"))

df %>% 
  mutate(
    value = 1,
    id = row_number()
  ) %>% 
  pivot_wider(names_from = 'gender', values_from = 'value', values_fill = 0) %>% 
  select(-id)
#> # A tibble: 6 × 3
#>       m     f     d
#>   <dbl> <dbl> <dbl>
#> 1     1     0     0
#> 2     1     0     0
#> 3     1     0     0
#> 4     1     0     0
#> 5     0     1     0
#> 6     0     0     1

Created on 2022-10-23 with reprex v2.0.2

There are also several ways with other packages. You might search for dummy encoding or one-hot-encoding.
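
As one example of dummy encoding that needs no extra package, here is a minimal sketch with base R's `model.matrix()`. It assumes the levels should appear in order of first occurrence (the default would be alphabetical); `dummies` and `df_new` are just illustrative names:

```r
df <- data.frame(gender = c("m", "m", "m", "m", "f", "d"))

# Fix the level order to order of first appearance (default would be alphabetical).
df$gender <- factor(df$gender, levels = unique(df$gender))

# "- 1" drops the intercept so every level gets its own 0/1 column.
dummies <- model.matrix(~ gender - 1, data = df)

# model.matrix() prefixes the column names with the variable name; replace them
# with the plain level names.
colnames(dummies) <- levels(df$gender)

df_new <- as.data.frame(dummies)
df_new
```

Without the `- 1`, the first level would be absorbed into an intercept column, which is what a regression model usually wants but not what was asked for here.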

So rstats.tips's answer is what you are looking for.
