pivot_longer() on multiple column sets/pairs

Reprex below.

Hoping to see a cleaner solution with pivot_longer() than that illustrated below.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
#> Warning: package 'tidyr' was built under R version 3.5.3

# input
df_raw <- tibble::tribble(
  ~off_loc, ~pt_loc,     ~pt_lat,    ~off_lat,     ~pt_long,    ~off_long,
  "A",     "G", 100.0754822,  121.271083,  4.472089953,    -7.188632,
  "B",     "H",   75.191326, 75.93845266,  -144.387785, -143.2288569,
  "C",     "I", 122.6513448,  135.043791, -40.45611048,    21.242563,
  "D",     "J", 124.1355333,  134.511284, -46.07156181,    40.937417,
  "E",     "K", 124.1355333,  134.484374, -46.07156181,     40.78472,
  "F",     "L", 124.0102891,  137.962195, -46.01594293,    22.905889
)

# desired output
df_long <- tibble::tribble(
  ~loc,        ~lat,        ~long,
  "A",  121.271083,    -7.188632,
  "B", 75.93845266, -143.2288569,
  "C",  135.043791,    21.242563,
  "D",  134.511284,    40.937417,
  "E",  134.484374,     40.78472,
  "F",  137.962195,    22.905889,
  "G", 100.0754822,  4.472089953,
  "H",   75.191326,  -144.387785,
  "I", 122.6513448, -40.45611048,
  "J", 124.1355333, -46.07156181,
  "K", 124.1355333, -46.07156181,
  "L", 124.0102891, -46.01594293
)

# current solution
df_raw %>% 
  pivot_longer(cols = c(contains('lat'), contains('long'))) %>% 
  pivot_longer(cols = contains('loc'), names_to = 'loc', values_to = 'loc_val') %>% 
  filter(str_remove(name, '_(lat|long)') == str_remove(loc, '_(loc)')) %>% 
  mutate(name = str_remove(name, '(off|pt)_')) %>% 
  spread(name, value) %>% 
  select(-loc)

Created on 2019-11-04 by the reprex package (v0.2.1)

7 Likes

Can you explain why you want to use pivot_longer()? bind_rows() offers a sensible solution:

library(tidyr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

df_raw <- tibble::tribble(
  ~off_loc, ~pt_loc,     ~pt_lat,    ~off_lat,     ~pt_long,    ~off_long,
  "A",     "G", 100.0754822,  121.271083,  4.472089953,    -7.188632,
  "B",     "H",   75.191326, 75.93845266,  -144.387785, -143.2288569,
  "C",     "I", 122.6513448,  135.043791, -40.45611048,    21.242563,
  "D",     "J", 124.1355333,  134.511284, -46.07156181,    40.937417,
  "E",     "K", 124.1355333,  134.484374, -46.07156181,     40.78472,
  "F",     "L", 124.0102891,  137.962195, -46.01594293,    22.905889
)

col_names <- c("loc", "lat", "long")
off <- select(df_raw, starts_with("off")) %>% setNames(col_names)
pt <- select(df_raw, starts_with("pt")) %>% setNames(col_names)
bind_rows(off, pt)
#> # A tibble: 12 x 3
#>    loc     lat    long
#>    <chr> <dbl>   <dbl>
#>  1 A     121.    -7.19
#>  2 B      75.9 -143.  
#>  3 C     135.    21.2 
#>  4 D     135.    40.9 
#>  5 E     134.    40.8 
#>  6 F     138.    22.9 
#>  7 G     100.     4.47
#>  8 H      75.2 -144.  
#>  9 I     123.   -40.5 
#> 10 J     124.   -46.1 
#> 11 K     124.   -46.1 
#> 12 L     124.   -46.0

Created on 2019-11-04 by the reprex package (v0.3.0)

You're also mixing pivot_longer() with spread(). From {tidyr} v1.0.0, the preferred function for 'widening' tables is pivot_wider().

1 Like

Thanks for reply @matt.

Indeed, bind_rows was my first go-to, but having to select() individually is troublesome :sweat_smile:.

Guess we'd have to sacrifice elegance for pace sometimes.

If your question's been answered (even by you!), would you mind choosing a solution? It helps other people see which questions still need help, or find solutions if they have similar problems. Here’s how to do it:

We're working on adding boolean operators to select(), which should make this less painful! The tidyselect package is the backend for tidyr::select(), and you can see a bit more about that below:

I was excited to see a reply from you @cderv, as I find you're verryyy good at solving these puzzles.

I'm still looking for a tidier pivot_longer() solution, hence the decision to not close this yet.

1 Like

@mara: thanks for this; I saw this in a tweet recently and am excited :slight_smile:

However, I still don't see how this can help with the task at hand :sweat_smile:

Oh, I must have read too quickly, I thought when you said:

you were concerned about using multiple select statements (which I now see is inherent to the previous solution).

Indeed.

I was hoping pivot_longer() can take a list as its cols, names_to and values_to arguments, such that user can map what goes where, instead of having 2 pivot_longer() as per original post :grin:

You need the names_to and names_pattern arguments, and to understand the .value sentinel. It took me some experimentation to figure this out, but the bottom two examples in ?pivot_longer() helped.

library(tidyr)
library(dplyr)

df_raw <- tibble::tribble(
  ~off_loc, ~pt_loc,     ~pt_lat,    ~off_lat,     ~pt_long,    ~off_long,
  "A",     "G", 100.0754822,  121.271083,  4.472089953,    -7.188632,
  "B",     "H",   75.191326, 75.93845266,  -144.387785, -143.2288569,
  "C",     "I", 122.6513448,  135.043791, -40.45611048,    21.242563,
  "D",     "J", 124.1355333,  134.511284, -46.07156181,    40.937417,
  "E",     "K", 124.1355333,  134.484374, -46.07156181,     40.78472,
  "F",     "L", 124.0102891,  137.962195, -46.01594293,    22.905889
)

df_raw %>%
  pivot_longer(everything(),
  names_to = c("set", ".value"),
  names_pattern = "(.+)_(.+)"
  )
# # A tibble: 12 x 4
#    set   loc     lat    long
#    <chr> <chr> <dbl>   <dbl>
#  1 off   A     121.    -7.19
#  2 pt    G     100.     4.47
#  3 off   B      75.9 -143.
#  4 pt    H      75.2 -144.
#  5 off   C     135.    21.2
#  6 pt    I     123.   -40.5
#  7 off   D     135.    40.9
#  8 pt    J     124.   -46.1
#  9 off   E     134.    40.8
# 10 pt    K     124.   -46.1
# 11 off   F     138.    22.9
# 12 pt    L     124.   -46.0
19 Likes

Wowww :exploding_head:

The .value quirk is something I'm still trying to wrap my head around; you just enlightened me :nerd_face:

Thanks @nacnudus!

1 Like

@nacnudus beats me to it!

Also, when pivot_longer or pivot_wider seems limited, you can also use the power of manual spec
https://tidyr.tidyverse.org/articles/pivot.html#manual-specs

build_longer_spec and pivot_longer_spec can help deal with pretty complex cases !

6 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.