How to get rid of duplicated columns when combining lists of column specs? (readr)

Hi there,

I use readr::cols() to explicitly define column specifications in readr::read_csv(). As my project progresses, new columns appear in new data files. Most of the time read_csv() guesses the types of the new columns correctly, but not always. Therefore, I want to add column types for the new columns to the existing list.

I use spec() to get the list of specifications and manually correct the column types where needed. Now I want to append the new columns to the initial list. In the documentation, I found this solution:

library(readr)

t1 <- cols(
  col1 = col_double(),
  col2 = col_double()
)

t2 <- cols(
  col3 = col_double(),
  col4 = col_double(),
  col2 = col_double()
)

t3 <- t1
t3$cols <- c(t1$cols, t2$cols)
t3
#> cols(
#>   col1 = col_double(),
#>   col2 = col_double(),
#>   col3 = col_double(),
#>   col4 = col_double(),
#>   col2 = col_double()
#> )

The problem is that "col2" is now duplicated. Is there a simple (non-manual) way to get rid of the duplicated columns?

I'm not sure how to do this with the cols type directly, but if you can put this information into plain lists instead, you can deduplicate by name:


library(readr)

t1 <- list(
  col1 = col_double(),
  col2 = col_double()
)

t2 <- list(
  col3 = col_double(),
  col4 = col_double(),
  col2 = col_double()
)

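# Combine both lists; "col2" now appears twice.
t3 <- c(t1, t2)
# Indexing by name returns the first match, so each name is kept once.
t3 <- t3[unique(names(t3))]
# Turn the deduplicated list back into a column specification.
t3 <- do.call(cols, t3)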
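
To check that a spec built this way behaves as expected, it can be passed to read_csv(). This is only a rough sketch: the inline sample data is made up for illustration, and wrapping it in I() assumes readr 2.0 or later.

```r
library(readr)

# Same column specs as above, combined into one list ("col2" appears twice).
specs <- c(
  list(col1 = col_double(), col2 = col_double()),
  list(col3 = col_double(), col4 = col_double(), col2 = col_double())
)

# Keep one entry per name, then build the column specification.
spec <- do.call(cols, specs[unique(names(specs))])

# Made-up sample data; I() marks the string as literal data (readr >= 2.0).
df <- read_csv(I("col1,col2,col3,col4\n1,2,3,4\n"), col_types = spec)

names(df)
#> [1] "col1" "col2" "col3" "col4"
```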

Thank you, @arthur.t

Solution:

library(readr)

t1 <- cols(
  col1 = col_double(),
  col2 = col_double()
)

t2 <- cols(
  col3 = col_double(),
  col4 = col_double(),
  col2 = col_double()
)

# Combine once, then keep the first spec for each column name.
combined <- c(t1$cols, t2$cols)
t3 <- t1
t3$cols <- combined[unique(names(combined))]
t3
#> cols(
#>   col1 = col_double(),
#>   col2 = col_double(),
#>   col3 = col_double(),
#>   col4 = col_double()
#> )
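
If this comes up repeatedly, the same idea could be wrapped in a small helper. The sketch below is only an illustration: merge_col_specs() and its keep argument are hypothetical names, not part of readr. It relies, like the solution above, on a cols() object storing its column types in the $cols element. With keep = "last", a type from the newer spec overrides the older one, which may be useful when a column's type was corrected by hand.

```r
library(readr)

# Hypothetical helper (not part of readr): merge two cols() objects,
# keeping a single entry per column name.
merge_col_specs <- function(base, extra, keep = c("first", "last")) {
  keep <- match.arg(keep)
  combined <- c(base$cols, extra$cols)
  # duplicated() flags every repeat of a name; with fromLast = TRUE
  # the last occurrence survives instead of the first.
  dup <- duplicated(names(combined), fromLast = (keep == "last"))
  base$cols <- combined[!dup]
  base
}

old <- cols(col1 = col_double(), col2 = col_double())
new <- cols(col3 = col_double(), col2 = col_character())

# Default: the existing definition of col2 wins.
names(merge_col_specs(old, new)$cols)
#> [1] "col1" "col2" "col3"

# keep = "last": the newer definition of col2 (character) wins.
names(merge_col_specs(old, new, keep = "last")$cols)
#> [1] "col1" "col3" "col2"
```

The order of the entries differs between the two modes, but since the specs are named, read_csv() matches them to columns by name rather than by position.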