Data wrangling and arranging

Hello!

New to the community, and data wrangling in general!

I have a large data set with student information that I need to arrange into a specific upload format. I need one row per student, and to collapse all the repeating rows into columns of sequential information.

This is an example of the data I have:

student <- c('a', 'a', 'b', 'c', 'c', 'd')
contact <- c( 'mother', 'father', 'mother', 'grandmother', 'grandfather', 'aunt')
address <- c( '115', '115', '75th', 'Pacific', 'Atlantic', 'georgetown')
example <- tibble(student, contact, address)

However, I need it to look like this:

student1 <- c('a', 'b', 'c', 'd')
contact1 <- c('mother','mother','grandmother', 'aunt')
address1 <- c('115', '75th', 'pacific', 'georgetown')
contact2 <- c('father', 'na', 'grandfather', 'na')
address2 <- c('115', 'na', 'Altantic', 'na')
example2 <- tibble(student1, contact1, address1, contact2, address2)

Any help is appreciated!

This is called 'pivoting'. The pivoting functions are in the tidyr package.

library(tidyr) # for pivot_wider
library(dplyr)

student <- c('a', 'a', 'b', 'c', 'c', 'd')
contact <- c( 'mother', 'father', 'mother', 'grandmother', 'grandfather', 'aunt')
address <- c( '115', '115', '75th', 'Pacific', 'Atlantic', 'georgetown')
example <- tibble(student, contact, address)

example2 <- example %>%
  group_by(student) %>% 
  mutate(
    newcol = 1:n() # new column numbers
  ) %>% 
  pivot_wider(names_from = "newcol", values_from = c("contact", "address"))

example2
#> # A tibble: 4 x 5
#> # Groups:   student [4]
#>   student contact_1   contact_2   address_1  address_2
#>   <chr>   <chr>       <chr>       <chr>      <chr>    
#> 1 a       mother      father      115        115      
#> 2 b       mother      <NA>        75th       <NA>     
#> 3 c       grandmother grandfather Pacific    Atlantic 
#> 4 d       aunt        <NA>        georgetown <NA>

Created on 2021-02-05 by the reprex package (v1.0.0)

3 Likes

In agreement with @woodward, you an definitely do this by reshaping your data.
The alternative to the tidyverse would be the base R package reshape2, using dcast (and melt for pivot_longer)

1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.