Separate one column into two

I have a column in my data frame that I want to split into two, based on the values in that column. It looks like this:
GENDER
M
M
M
F
F
M
M
F
F
F
F
F
M
M
F
M
M
M
F
F
F
F
F
M
M
M
M
M
M
M
M
M
M
M
M
F
F
M

I want to create two new columns (MALE and FEMALE) based on this main column.
I'm really a beginner at this, so if anyone can help me I'll be very grateful.

Here is one way to do that.

DF <- data.frame(BothSexes = sample(c("F", "M"), 10, replace = TRUE))
DF
#>    BothSexes
#> 1          F
#> 2          M
#> 3          M
#> 4          F
#> 5          M
#> 6          F
#> 7          F
#> 8          M
#> 9          M
#> 10         M
DF$Female <- ifelse(DF$BothSexes == "F", 1, 0)
DF$Male <- ifelse(DF$BothSexes == "M", 1, 0)
DF
#>    BothSexes Female Male
#> 1          F      1    0
#> 2          M      0    1
#> 3          M      0    1
#> 4          F      1    0
#> 5          M      0    1
#> 6          F      1    0
#> 7          F      1    0
#> 8          M      0    1
#> 9          M      0    1
#> 10         M      0    1

Created on 2020-03-07 by the reprex package (v0.3.0)

Hi,

Welcome to the RStudio community!

Here is one way of doing this:

#Generate some data
myData = data.frame(GENDER = sample(c("M", "F"), 20, replace = TRUE))

#Create the male and female columns
myData$MALE = myData$GENDER == "M"
myData$FEMALE = myData$GENDER == "F"

head(myData)
  GENDER  MALE FEMALE
1      M  TRUE  FALSE
2      F FALSE   TRUE
3      F FALSE   TRUE
4      F FALSE   TRUE
5      M  TRUE  FALSE
6      M  TRUE  FALSE

Hope this helps,
PJ

EDIT: I see @FJCC and I posted at the same time :slight_smile:
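
As a small follow-on sketch (not part of either reply): logical columns like the ones above convert directly to the 0/1 coding used in the `ifelse()` answer, and `sum()` counts the `TRUE` values. The data here is a hand-typed sample rather than a random one, and the `MALE_01` and `FEMALE_01` column names are made up for illustration.

```r
# Hand-typed sample so the result does not depend on random sampling
myData <- data.frame(GENDER = c("M", "M", "F", "M", "F"))

# Logical indicator columns, as in the reply above
myData$MALE <- myData$GENDER == "M"
myData$FEMALE <- myData$GENDER == "F"

# as.integer() turns TRUE/FALSE into 1/0 (hypothetical column names)
myData$MALE_01 <- as.integer(myData$MALE)
myData$FEMALE_01 <- as.integer(myData$FEMALE)

# sum() counts the TRUE values in a logical vector
n_male <- sum(myData$MALE)
n_female <- sum(myData$FEMALE)
```

Whether to keep logicals or 0/1 integers depends on what comes next: logicals are convenient for filtering, while 0/1 columns are sometimes expected by modelling or export steps.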

Hi @artfer93, could you post your data frame, and maybe say a little more about how you want to split it up? The easiest way to post it is to apply the function dput() to your data frame, and then paste the output here, between a pair of triple backticks (```), like this:

```
[paste output of dput() here]
```

That might help folks understand your situation better.
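
For illustration, here is a minimal sketch of what dput() does, using a small made-up data frame in place of the real one. The `df` and `df_copy` names are assumptions for this example only.

```r
# Small made-up data frame standing in for the real one
df <- data.frame(GENDER = c("M", "F", "F"), stringsAsFactors = FALSE)

# dput() prints R code that recreates the object
dput(df)

# Round trip: capture the printed code as text and evaluate it again,
# which is what a reader does when pasting the output into their session
code <- paste(capture.output(dput(df)), collapse = "\n")
df_copy <- eval(parse(text = code))
```

Because the output is plain R code, anyone can paste it into their own session and work with the same data.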

Hi @artfer93,

# Load library
library("tidyverse")

# Create some data
d <- tibble(id = str_c("id_",
                       seq(from = 1, to = 100)),
            sex = sample(x = c("female", "male"),
                         size = 100,
                         replace = TRUE),
            var_1 = rnorm(100))

Yielding:

> d
# A tibble: 100 x 3
   id    sex     var_1
   <chr> <chr>   <dbl>
 1 id_1  male    1.50 
 2 id_2  male   -0.514
 3 id_3  male   -0.878
 4 id_4  male    1.30 
 5 id_5  female -0.212
 6 id_6  male   -0.878
 7 id_7  female -0.291
 8 id_8  male   -1.13 
 9 id_9  female  0.786
10 id_10 female -1.19 
# … with 90 more rows

Now, split sex into two columns:

d_sex_split <- d %>%
  pivot_wider(id_cols = id,
              names_from = sex,
              values_from = var_1)

Yielding:

> d_sex_split
# A tibble: 100 x 3
   id      male female
   <chr>  <dbl>  <dbl>
 1 id_1   1.50  NA    
 2 id_2  -0.514 NA    
 3 id_3  -0.878 NA    
 4 id_4   1.30  NA    
 5 id_5  NA     -0.212
 6 id_6  -0.878 NA    
 7 id_7  NA     -0.291
 8 id_8  -1.13  NA    
 9 id_9  NA      0.786
10 id_10 NA     -1.19 
# … with 90 more rows

Hope it helps :slightly_smiling_face:
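
For comparison, the same wide layout can be sketched in base R without the tidyverse. This is only an illustration on a small made-up data frame (the values are hypothetical), not a replacement for `pivot_wider()`: each new column keeps `var_1` where the row matches that sex and is `NA` otherwise.

```r
# Small made-up data frame (hypothetical values)
d <- data.frame(
  id = c("id_1", "id_2", "id_3"),
  sex = c("male", "female", "male"),
  var_1 = c(1.5, -0.2, 0.8),
  stringsAsFactors = FALSE
)

# One column per sex: keep var_1 where the row matches, NA otherwise
d$male <- ifelse(d$sex == "male", d$var_1, NA)
d$female <- ifelse(d$sex == "female", d$var_1, NA)

# Show the same three columns as the pivot_wider() result
d[, c("id", "male", "female")]
```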

Thank you so much, this is hard in the beginning and I really want to learn.

You're welcome @artfer93. If you really want to learn and you're willing to put in the effort, I would HIGHLY recommend that you head on over to R4DS and work through it cover to cover :+1:

Also, if my answer is the right one for you, please mark it as the solution.

For completeness, here's the data.table way.
The syntax is more obscure, but data.table is often much faster than base R or the tidyverse, especially on large data sets.

library(data.table)

#Generate some data
myData = data.table(GENDER = sample(c("M", "F"), 20, replace = TRUE))

#Create the male and female columns
myData[,c('MALE','FEMALE') := .(GENDER == "M", GENDER == "F")]
