bababa
1
I am looking for a way to expand every level of a factor into its own new column holding the values 0 and 1.
Example:
df <- data.frame(gender = c("m", "m", "m", "m", "f")); df
  gender
1      m
2      m
3      m
4      m
5      f
should result in:
df_neu <- data.frame(m = c(1, 1, 1, 1, 0), f = c(0, 0, 0, 0, 1)); df_neu
  m f
1 1 0
2 1 0
3 1 0
4 1 0
5 0 1
Does anyone have a solution to automate this? If possible, without a loop.
Thanks!
Flm
2
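library(tidyverse)
df <- data.frame(gender = c("m", "m", "m", "m", "f"))
# Build one 0/1 indicator column per gender level, then drop the source column
df <- df %>%
  mutate(m = ifelse(gender == "m", 1, 0),
         f = ifelse(gender == "f", 1, 0)) %>%
  select(-gender) %>%
  # across() replaces the superseded mutate_all(); the indicators end up stored as factors
  mutate(across(everything(), as.factor))
glimpse(df)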
Here's a way that works with more than 2 genders without changing anything:
library(tidyverse)
df <- data.frame(gender = c("m", "m", "m", "m", "f", "d"))
df %>%
  mutate(
    value = 1,
    id = row_number()
  ) %>%
  pivot_wider(names_from = 'gender', values_from = 'value', values_fill = 0) %>%
  select(-id)
#> # A tibble: 6 × 3
#>       m     f     d
#>   <dbl> <dbl> <dbl>
#> 1     1     0     0
#> 2     1     0     0
#> 3     1     0     0
#> 4     1     0     0
#> 5     0     1     0
#> 6     0     0     1
Created on 2022-10-23 with reprex v2.0.2
There are also several ways to do this with other packages. Searching for "dummy encoding" or "one-hot encoding" will turn them up.
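As one illustration of a non-tidyverse route, base R's model.matrix() can produce the same kind of 0/1 columns. This is only a minimal sketch, not something from the answers above. Fixing the level order with factor(..., levels = unique(...)) and stripping the "gender" prefix from the column names are my own assumptions about the desired output.

```r
# Same example data as in the pivot_wider() answer
df <- data.frame(gender = c("m", "m", "m", "m", "f", "d"))

# Assumption: keep the levels in order of first appearance (m, f, d)
df$gender <- factor(df$gender, levels = unique(df$gender))

# "~ gender - 1" drops the intercept, so every level gets its own 0/1 column
dummies <- model.matrix(~ gender - 1, data = df)

# model.matrix() names the columns "genderm", "genderf", ...; use the bare levels instead
colnames(dummies) <- levels(df$gender)

df_neu <- as.data.frame(dummies)
df_neu
```

This works for any number of levels without naming them by hand, and it needs no explicit loop.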
Flm
5
So rstats.tips's answer is what you are looking for.