Unnesting - Error: Can't combine <list> and <character>. from tibble column

I'm having trouble unnesting a tibble column and was hoping to get some help. I've done my best to make a reprex, but it is still quite dense because of the tibble column; please excuse that.

How can I change the nested pap_name column so that it can be unnested?

Edit: Simplified example based on @technocrat's advice

library(tidyverse)
df <-
  structure(list(user_id = c("6bemf", "vb76d"), registration_results = list(
    structure(list(
      pap_name = list(NULL),
      pap_file_url = list(NULL)
    ), row.names = c(
      NA,
      -1L
    ), class = c("tbl_df", "tbl", "data.frame")), structure(list(
      pap_name = "", pap_file_url = ""
    ), row.names = c(NA, -1L), class = c(
      "tbl_df",
      "tbl", "data.frame"
    ))
  )), row.names = c(NA, -2L), class = c(
    "tbl_df",
    "tbl", "data.frame"
  ))
df %>%
  unnest(registration_results)
#> Error: Can't combine `..1$pap_name` <list> and `..2$pap_name` <character>.

Created on 2020-08-18 by the reprex package (v0.3.0)

Combining a <list> column with a <character> column is what the error message is complaining about. Behind that, there seems to be a complaint about applying nest to a function. I suggest simplifying to a test case to help isolate the problem.

I am wondering if I can use purrr to target pap_file_url and change it from list(NULL) to something like list("")?

Maybe. But first start with the tiniest step that will get you an object to explore further.
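
One such tiny step, as a sketch: list the class of every column inside each nested tibble, so the mismatch is visible before calling unnest. The tibble() construction below is a hand-written equivalent of the structure() reprex (an assumption, not the original object), and col_types is just an illustrative name.

```r
library(tidyverse)

# Hand-built equivalent of the reprex data (assumed to match the structure() call)
df <- tibble(
  user_id = c("6bemf", "vb76d"),
  registration_results = list(
    tibble(pap_name = list(NULL), pap_file_url = list(NULL)),
    tibble(pap_name = "", pap_file_url = "")
  )
)

# For each nested tibble, report the first class of each of its columns
col_types <- map(df$registration_results, ~ map_chr(.x, ~ class(.x)[1]))
col_types
```

The first element reports list columns and the second reports character columns, which is the pair unnest refuses to combine.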

I guess you can think of the problem as

tibble(data = c(tibble(x = list(NULL)), 
                 tibble(x = ""))) %>% 
  unnest(data)
#> Error: Can't combine `..1$data` <list> and `..2$data` <character>.

although I'm not sure whether I am losing too much information here. My guess is that maintaining the list structure is going to be important for scaling this up:

tibble(data = c(tibble(x = list(NULL)), 
                tibble(x = list("")))) %>% 
  unnest(data)

This code doesn't error.

So my open questions are:

  • How do I put the second row into a list and not the first?
  • How do I apply that only to the nested x column and not the other columns (as in the original data)?

Something like this is ugly but should work:

map_df(df$registration_results, ~ map_chr(.x, ~ if(is.list(.x)){""}else{.x}))

There are 2 nested maps:

  1. loop on the rows of the nested column
  2. loop on the columns of the tibble in each row of the nested column

For each column, check whether it's a character vector or a list(NULL). This version may be slightly more robust:

map_df(df$registration_results, ~ map_chr(.x, ~ if(identical(.x, list(NULL))){""}else{.x}))
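
The same two loops can be spelled out with named helpers, as a sketch that also reattaches user_id (the one-liner above drops it). The helper names clean_column and clean_row are hypothetical, the tibble() call is a hand-built stand-in for the reprex data, and the sketch assumes every nested tibble has exactly one row so that bind_cols lines up with df.

```r
library(tidyverse)

# Hand-built stand-in for the reprex data (assumption)
df <- tibble(
  user_id = c("6bemf", "vb76d"),
  registration_results = list(
    tibble(pap_name = list(NULL), pap_file_url = list(NULL)),
    tibble(pap_name = "", pap_file_url = "")
  )
)

# Inner loop body: turn a list(NULL) column into "", leave anything else alone
clean_column <- function(col) {
  if (identical(col, list(NULL))) "" else col
}

# Outer loop body: clean every column of one nested tibble
clean_row <- function(tbl) {
  as_tibble(map(tbl, clean_column))
}

cleaned <- bind_rows(map(df$registration_results, clean_row))

# Assumes one row per nested tibble, so rows stay aligned with user_id
result <- bind_cols(select(df, user_id), cleaned)
result
```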

The first, I think, could be obtained by modifying what I have above. For the second, I guess you can use mutate with across to treat only the list-columns.
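
A minimal sketch of the mutate-with-across idea, assuming dplyr 1.0.0 or later and that every element of a list-column is either NULL or a length-one value. fix_list_cols is a hypothetical helper name, and the tibble() call is a hand-built stand-in for the reprex data.

```r
library(tidyverse)

# Hand-built stand-in for the reprex data (assumption)
df <- tibble(
  user_id = c("6bemf", "vb76d"),
  registration_results = list(
    tibble(pap_name = list(NULL), pap_file_url = list(NULL)),
    tibble(pap_name = "", pap_file_url = "")
  )
)

# Convert only the list-columns of one nested tibble to character,
# mapping NULL elements to "" (assumes non-NULL elements have length one)
fix_list_cols <- function(tbl) {
  mutate(tbl, across(
    where(is.list),
    function(col) map_chr(col, function(el) if (is.null(el)) "" else as.character(el))
  ))
}

result <- df %>%
  mutate(registration_results = map(registration_results, fix_list_cols)) %>%
  unnest(registration_results)
result
```

Because the fix is applied inside the nested column, user_id stays attached, and columns that are already character are left untouched.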
