Combining rows with the same value into one column

Hi everyone,

I'm fairly new to R, but so far it's proved a wonderful tool. I've had great luck finding the answers on how to do various tasks online, but now I'm stuck. I've read various instructionals for hours and have gotten no closer to solving my problem. I feel like this is an easy question, but I just can't get it; so I apologize.

Here is what I need to do. Imagine I have data with patients who have had their blood pressure taken. I have a long format dataset for this. So for example.

Name Date Systolic Diastolic
John 1/1/10 132 92
Amy 1/2/10 120 80
John 2/2/10 135 92
Amy 2/2/10 82 118
Nick 2/14/10 108 72
Amy 3/1/10 122 80
John 3/3/10 128 88

Here is what I need:

Name Date1 Systolic1 Diastolic1 Date2 Systolic2 Diastolic2 Date3 Systolic3 Diastolic3
John 1/1/10 132 92 2/2/10 135 92 3/3/10 128 88
Amy 1/2/10 120 80 2/2/10 82 118 3/1/10 122 80
Nick 2/14/10 108 72

I've tried all sorts of things with pivot_wider and group_by, but I've gotten nowhere.

I appreciate any help. Thanks SO much,

Luke

Hi @beginnersluke,

It is strongly recommended to keep your data as "long" for most tasks. It makes it much easier to plot, clean, etc.

However, since I don't know how you plan to use the data, here is a solution anyway:

library(tidyverse)
library(lubridate)

data <- 
tribble(
  ~Name, ~Date, ~Systolic, ~Diastolic,
  'John',   '1/1/10',   132,    92,
  'Amy',    '1/2/10',   120,    80,
  'John',   '2/2/10',   135,    92,
  'Amy',    '2/2/10',   82, 118,
  'Nick',   '2/14/10',  108,    72,
  'Amy',    '3/1/10',   122,    80,
  'John',   '3/3/10',   128,    88
)

data %>% 
  mutate(Date = mdy(Date)) %>% 
  arrange(Name, Date) %>% 
  group_by(Name) %>% 
  mutate(Visit = 1:n()) %>% 
  ungroup() %>% 
  pivot_wider(
    names_from = Visit, 
    values_from = c(Date, Systolic, Diastolic),
    names_sep = ''
  )
#> # A tibble: 3 x 10
#>   Name  Date1      Date2      Date3      Systolic1 Systolic2 Systolic3
#>   <chr> <date>     <date>     <date>         <dbl>     <dbl>     <dbl>
#> 1 Amy   2010-01-02 2010-02-02 2010-03-01       120        82       122
#> 2 John  2010-01-01 2010-02-02 2010-03-03       132       135       128
#> 3 Nick  2010-02-14 NA         NA               108        NA        NA
#> # … with 3 more variables: Diastolic1 <dbl>, Diastolic2 <dbl>, Diastolic3 <dbl>

Created on 2020-02-12 by the reprex package (v0.3.0)

3 Likes

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.