Using list argument with `fct_collapse`

I'm trying to collapse a long list of levels for a factor variable. For neatness's sake, I'd rather create a named list outside of my munging pipe, but I can't get this to work with fct_collapse, even though nothing in the code suggests that it shouldn't.

library(tidyverse)
library(forcats)
data(gss_cat)

party_change <- lst(
  missing = c("No answer", "Don't know"),
  other = "Other party",
  rep = c("Strong republican", "Not str republican"),
  ind = c("Ind,near rep", "Independent", "Ind,near dem"),
  dem = c("Not str democrat", "Strong democrat")
)

partyid2 <- fct_collapse(gss_cat$partyid, party_change)

fct_count(partyid2)
# A tibble: 1 x 2
  f         n
  <fct> <int>
1 ""    21483

Any suggestions for how I might be able to get this to work? TIA!

Hi @queermath,

Somehow tibble::lst() doesn't seem to work in this case. I tried list() instead and it worked as expected. Hope it helps.

What version of forcats are you using? Perhaps try installing the dev version?

devtools::install_github("tidyverse/forcats")
suppressPackageStartupMessages(library(tidyverse))

fct_count(gss_cat$partyid)
#> # A tibble: 10 x 2
#>    f                      n
#>    <fct>              <int>
#>  1 No answer            154
#>  2 Don't know             1
#>  3 Other party          393
#>  4 Strong republican   2314
#>  5 Not str republican  3032
#>  6 Ind,near rep        1791
#>  7 Independent         4119
#>  8 Ind,near dem        2499
#>  9 Not str democrat    3690
#> 10 Strong democrat     3490

partyid2 <- fct_collapse(gss_cat$partyid,
                         missing = c("No answer", "Don't know"),
                         other = "Other party",
                         rep = c("Strong republican", "Not str republican"),
                         ind = c("Ind,near rep", "Independent", "Ind,near dem"),
                         dem = c("Not str democrat", "Strong democrat")
)
fct_count(partyid2)
#> # A tibble: 5 x 2
#>   f           n
#>   <fct>   <int>
#> 1 missing   155
#> 2 other     393
#> 3 rep      5346
#> 4 ind      8409
#> 5 dem      7180

Created on 2018-03-26 by the reprex package (v0.2.0).


Definitely up-to-date with forcats--even tried the dev version--but your reprex doesn't recreate my problem: I'm trying to separately declare a named list of factors/replacements and use that as an argument in fct_collapse instead of calling each replacement explicitly.

Also tried using list instead of lst as per @floresf, but I'm getting the same result (namely, all levels get replaced with an empty string, "").

Whoops, just realised you're trying to go in the opposite direction with this!


Oh, oh! Sorry, I misunderstood. I'm still not 100% clear on what you're aiming for here (sorry, it's been a long day). But, you can use fct_c() with !!! to splice a list of factors you want to combine.

suppressPackageStartupMessages(library(tidyverse))
library(forcats)

party_change <- list(
  missing = factor(c("No answer", "Don't know")),
  other = factor("Other party"),
  rep = factor(c("Strong republican", "Not str republican")),
  ind = factor(c("Ind,near rep", "Independent", "Ind,near dem")),
  dem = factor(c("Not str democrat", "Strong democrat"))
)

fct_c(!!!party_change)
#>  [1] No answer          Don't know         Other party       
#>  [4] Strong republican  Not str republican Ind,near rep      
#>  [7] Independent        Ind,near dem       Not str democrat  
#> [10] Strong democrat   
#> 10 Levels: Don't know No answer Other party ... Strong democrat

Created on 2018-03-26 by the reprex package (v0.2.0).

Got it working with a 1:1 named list and dplyr::recode()

suppressPackageStartupMessages(library(tidyverse))
library(forcats)

party_change <- list("No answer" = "missing", "Don't know" = "missing", 
                     "Other party" = "other", "Strong republican" = "rep", 
                     "Not str republican" = "rep", "Ind,near rep" = "ind", 
                     "Independent" = "ind", "Ind,near dem" = "ind", 
                     "Not str democrat" = "dem", "Strong democrat" = "dem")

partyid2 <- dplyr::recode(gss_cat$partyid, !!!party_change)

fct_count(partyid2)
#> # A tibble: 5 x 2
#>   f           n
#>   <fct>   <int>
#> 1 missing   155
#> 2 other     393
#> 3 rep      5346
#> 4 ind      8409
#> 5 dem      7180

Created on 2018-03-26 by the reprex package (v0.2.0).


Thanks for keeping at it with this!! I imagine the 1:1 list might work with fct_recode as well (though I haven't tried it). I was trying to avoid a 1:1 mapping since I was recoding about 100 levels into about 17, and at that point I might as well just make a find/replace table. At any rate, I ended up just writing the named arguments in the body of the fct_collapse call. Not pretty but it works :woman_shrugging:
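For anyone who does want the 1:1 mapping without typing it out by hand, here is a minimal base R sketch that expands the grouped list into a lookup vector. The name `lookup` is just an illustrative choice, and `party_change` is the grouped list from the original question.

```r
# Grouped spec: each name is a new level, each value holds the old levels
party_change <- list(
  missing = c("No answer", "Don't know"),
  other = "Other party",
  rep = c("Strong republican", "Not str republican"),
  ind = c("Ind,near rep", "Independent", "Ind,near dem"),
  dem = c("Not str democrat", "Strong democrat")
)

# Expand to a 1:1 lookup: names are old levels, values are new levels.
# rep() repeats each new level once per old level in its group.
lookup <- setNames(
  rep(names(party_change), lengths(party_change)),
  unlist(party_change, use.names = FALSE)
)

lookup[["Ind,near dem"]]
```

Since the names of `lookup` are the old levels and the values are the new ones, it should be usable the same way as the hand-written list above, e.g. `dplyr::recode(gss_cat$partyid, !!!lookup)`.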


Great solution, @mara! :clap:

I was wrong on my previous answer. I guess it's a long day for me after all :slight_smile: Sorry about that!

Just for reference, fct_collapse() expects a series of named character vectors, passed through the ... parameter of the function. A named list handed over as a single argument therefore isn't accepted as-is, and some manual conversion is needed, at least in the current version of forcats.
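One way to do that conversion might be to splice the list into ... with !!!, the same operator used with fct_c() and recode() earlier in the thread. This is only a sketch, and it assumes the installed forcats handles ... with tidy dots; if I remember correctly the 0.3.0 release notes describe this, but I haven't checked every version.

```r
library(forcats)

# Grouped spec declared outside the pipeline, as in the original question
party_change <- list(
  missing = c("No answer", "Don't know"),
  other = "Other party",
  rep = c("Strong republican", "Not str republican"),
  ind = c("Ind,near rep", "Independent", "Ind,near dem"),
  dem = c("Not str democrat", "Strong democrat")
)

# !!! splices the list so each element arrives as its own named argument,
# instead of the whole list arriving as one unnamed argument
# (assumes tidy-dots support in the installed forcats)
partyid2 <- fct_collapse(gss_cat$partyid, !!!party_change)

levels(partyid2)
```

If splicing isn't supported in your version, the call should fail or misbehave in the same way as the original attempt, in which case the other workarounds in this thread still apply.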

I had cross-posted this in the R4DS slack chat and got this solution, which works & doesn't involve a 1:1 list!

 args = list(gss_cat$partyid,
   missing = c("No answer", "Don't know"),
   other = "Other party",
   rep = c("Strong republican", "Not str republican"),
   ind = c("Ind,near rep", "Independent", "Ind,near dem"),
   dem = c("Not str democrat", "Strong democrat"))
partyid2 = do.call(fct_collapse, args)
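The same do.call() idea should also work with a list that was declared separately, which is closer to what the original question asked for. A minimal sketch: prepend the factor to the existing named list and pass the result as the argument list (`args` here is just a local name).

```r
library(forcats)

# Grouped spec declared once, outside the munging pipe
party_change <- list(
  missing = c("No answer", "Don't know"),
  other = "Other party",
  rep = c("Strong republican", "Not str republican"),
  ind = c("Ind,near rep", "Independent", "Ind,near dem"),
  dem = c("Not str democrat", "Strong democrat")
)

# The factor goes first as an unnamed element, so do.call() passes it
# positionally; the remaining elements keep their names and land in ...
args <- c(list(gss_cat$partyid), party_change)
partyid2 <- do.call(fct_collapse, args)

levels(partyid2)
```

This keeps the grouped (not 1:1) structure and doesn't rely on !!! splicing, since do.call() itself turns each list element into a separate argument.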