Pivot_longer paired columns

Hello,

I have time series data (unequal intervals) for several samples and am trying to use pivot_longer to create a cleaned data set for plotting/analysis. My data is paired by columns (simplified example below):

>d=data.frame(list(RT=c(9.803,9.955,10.131,10.168,10.226),H48=c(3.57,20.35,8.15,1.39,5.42),RT.1=c(9.951,10.226,10.401,10.431,10.481),H49=c(34.06,8.81,4.57,5.78,14.96)))

> d
      RT   H48   RT.1   H49
1  9.803  3.57  9.951 34.06
2  9.955 20.35 10.226  8.81
3 10.131  8.15 10.401  4.57
4 10.168  1.39 10.431  5.78
5 10.226  5.42 10.481 14.96

The columns beginning with "RT" are times for the datapoints in the adjacent column to the right. I want to create a longer data set that looks like:

      RT   H48   H49
1  9.803  3.57    NA
2  9.951    NA 34.06
3  9.955 20.35    NA
4 10.131  8.15    NA
5 10.168  1.39    NA
6 10.226  5.42  8.81
7 10.401    NA  4.57
8 10.431    NA  5.78
9 10.481    NA 14.96

However, when I try to use pivot_longer, I can't figure out how to specify that my columns are paired. I instead get:

>pivot_longer(d,starts_with("RT"),values_to="RT")

# A tibble: 10 x 4
     H48   H49 name     RT
   <dbl> <dbl> <chr> <dbl>
 1  3.57 34.1  RT     9.80
 2  3.57 34.1  RT.1   9.95
 3 20.4   8.81 RT     9.96
 4 20.4   8.81 RT.1  10.2 
 5  8.15  4.57 RT    10.1 
 6  8.15  4.57 RT.1  10.4 
 7  1.39  5.78 RT    10.2 
 8  1.39  5.78 RT.1  10.4 
 9  5.42 15.0  RT    10.2 
10  5.42 15.0  RT.1  10.5 

Is this possible or am I maybe using the wrong function to get my desired result?

I would say you don't need tidyr functions here. You can just run:

d_unpaired <- bind_rows(
    select(d, RT, H48),
    select(d, RT = RT.1, H49)
) %>%
    arrange(RT)

If it is important to have records for 10.226 united you can also add the following:

d_unpaired <- bind_rows(
    select(d, RT, H48),
    select(d, RT = RT.1, H49)
) %>%
    arrange(RT) %>%
    group_by(RT) %>%
    summarise_all(function(x) na.omit(x)[1]) %>%
    ungroup()

The result would be:

> d_unpaired
# A tibble: 9 x 3
     RT   H48   H49
* <dbl> <dbl> <dbl>
1  9.80  3.57 NA   
2  9.95 NA    34.1 
3  9.96 20.4  NA   
4 10.1   8.15 NA   
5 10.2   1.39 NA   
6 10.2   5.42  8.81
7 10.4  NA     4.57
8 10.4  NA     5.78
9 10.5  NA    15.0 
1 Like

This worked for me, thank you! Can you explain what the summarise_all line is doing? When I remove it I can't find a difference but may not be looking at the results closely enough.

Without it you'll have separate records for the same time:

6  10.226  5.42    NA
7  10.226    NA  8.81

So to have records for 10.226 united we group_by RT and taking non missing values for all other columns.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.