restructuring the data

Hi everyone.

Below is a sample dataset where a group of respondents were surveyed several times across different time periods.

ID Current_Age Response_2015 Response_2010 Response_2007 Response_2005
001 75 Yes No Yes Yes
002 38 No No No Yes
003 29 Yes Yes Yes
004 45 Yes No No
005 47 No Yes No

I need to convert the data in long format. How do I do it especially the age column when I have to make sure it is the right age of the respondent at the time they responded in the survey? The new dataset should look like the one below.

ID Age Year Response
001 75 2015 Yes
002 38 2015 No
003 29 2015 Yes
004 45 2015 Yes
005 47 2015 No
001 70 2010 No
002 33 2010 No
003 24 2010 Yes
004 40 2010
005 42 2010 Yes
001 67 2007 Yes
002 30 2007 No
003 21 2007 Yes
004 37 2007 No
005 39 2007
001 65 2005 Yes
002 28 2005 Yes
003 19 2005
004 35 2005 No
005 37 2005 No

Appreciate any help.

Yoyong

This is one way to do it

library(tidyverse)

# Sample data in a copy/paste friendly format, replace this with your own data frame
sample_df <- data.frame(
  stringsAsFactors = FALSE,
                ID = c("001", "002", "003", "004", "005"),
       Current_Age = c(75, 38, 29, 45, 47),
     Response_2015 = c("Yes", "No", "Yes", "Yes", "No"),
     Response_2010 = c("No", "No", "Yes", NA, "Yes"),
     Response_2007 = c("Yes", "No", "Yes", "No", NA),
     Response_2005 = c("Yes", "Yes", NA, "No", "No")
)

# Relevant code
sample_df %>% 
    pivot_longer(cols = starts_with("Response"),
                 names_to = "Year",
                 names_pattern = "Response_(.+)",
                 names_transform = list(Year = as.integer),
                 values_to = "Response") %>% 
    group_by(ID) %>% 
    mutate(Response_Age = Current_Age - (max(Year)-Year)) %>% 
    arrange(desc(Year), ID)
#> # A tibble: 20 × 5
#> # Groups:   ID [5]
#>    ID    Current_Age  Year Response Response_Age
#>    <chr>       <dbl> <int> <chr>           <dbl>
#>  1 001            75  2015 Yes                75
#>  2 002            38  2015 No                 38
#>  3 003            29  2015 Yes                29
#>  4 004            45  2015 Yes                45
#>  5 005            47  2015 No                 47
#>  6 001            75  2010 No                 70
#>  7 002            38  2010 No                 33
#>  8 003            29  2010 Yes                24
#>  9 004            45  2010 <NA>               40
#> 10 005            47  2010 Yes                42
#> 11 001            75  2007 Yes                67
#> 12 002            38  2007 No                 30
#> 13 003            29  2007 Yes                21
#> 14 004            45  2007 No                 37
#> 15 005            47  2007 <NA>               39
#> 16 001            75  2005 Yes                65
#> 17 002            38  2005 Yes                28
#> 18 003            29  2005 <NA>               19
#> 19 004            45  2005 No                 35
#> 20 005            47  2005 No                 37

Created on 2021-10-27 by the reprex package (v2.0.1)

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.