Using list argument with `fct_collapse`

forcats

#1

I'm trying to collapse a long list of levels for a factor variable and, for neatness's sake, would rather create the named list outside of my munging pipe. I can't seem to get this to work with fct_collapse, though, even though there's nothing in the code that would suggest that it shouldn't.

library(tidyverse)
library(forcats)
data(gss_cat)

party_change <- lst(
  missing = c("No answer", "Don't know"),
  other = "Other party",
  rep = c("Strong republican", "Not str republican"),
  ind = c("Ind,near rep", "Independent", "Ind,near dem"),
  dem = c("Not str democrat", "Strong democrat")
)

partyid2 <- fct_collapse(gss_cat$partyid, party_change)

fct_count(partyid2)
# A tibble: 1 x 2
  f         n
  <fct> <int>
1 ""    21483

Any suggestions for how I might be able to get this to work? TIA!


#2

Hi @queermath,

Somehow tibble::lst() doesn't seem to work in this case. I tried list() instead and it worked as expected. Hope it helps.


#3

What version of forcats are you using? Perhaps try installing the dev version?

devtools::install_github("tidyverse/forcats")
suppressPackageStartupMessages(library(tidyverse))

fct_count(gss_cat$partyid)
#> # A tibble: 10 x 2
#>    f                      n
#>    <fct>              <int>
#>  1 No answer            154
#>  2 Don't know             1
#>  3 Other party          393
#>  4 Strong republican   2314
#>  5 Not str republican  3032
#>  6 Ind,near rep        1791
#>  7 Independent         4119
#>  8 Ind,near dem        2499
#>  9 Not str democrat    3690
#> 10 Strong democrat     3490

partyid2 <- fct_collapse(gss_cat$partyid,
                         missing = c("No answer", "Don't know"),
                         other = "Other party",
                         rep = c("Strong republican", "Not str republican"),
                         ind = c("Ind,near rep", "Independent", "Ind,near dem"),
                         dem = c("Not str democrat", "Strong democrat")
)
fct_count(partyid2)
#> # A tibble: 5 x 2
#>   f           n
#>   <fct>   <int>
#> 1 missing   155
#> 2 other     393
#> 3 rep      5346
#> 4 ind      8409
#> 5 dem      7180

Created on 2018-03-26 by the reprex package (v0.2.0).


#4

Definitely up-to-date with forcats--even tried the dev version--but your reprex doesn't recreate my problem: I'm trying to separately declare a named list of factors/replacements and use that as an argument in fct_collapse instead of calling each replacement explicitly.

Also tried using list instead of lst as per @floresf, but I'm getting the same result (namely, all levels get replaced with an empty string, "").


#5

Whoops, just realised you're trying to go in the opposite direction with this!

Oh, oh! Sorry, I misunderstood. I'm still not 100% clear on what you're aiming for here (sorry, it's been a long day). But, you can use fct_c() with !!! to splice a list of factors you want to combine.

suppressPackageStartupMessages(library(tidyverse))
library(forcats)

party_change <- list(
  missing = factor(c("No answer", "Don't know")),
  other = factor("Other party"),
  rep = factor(c("Strong republican", "Not str republican")),
  ind = factor(c("Ind,near rep", "Independent", "Ind,near dem")),
  dem = factor(c("Not str democrat", "Strong democrat"))
)

fct_c(!!!party_change)
#>  [1] No answer          Don't know         Other party       
#>  [4] Strong republican  Not str republican Ind,near rep      
#>  [7] Independent        Ind,near dem       Not str democrat  
#> [10] Strong democrat   
#> 10 Levels: Don't know No answer Other party ... Strong democrat

Created on 2018-03-26 by the reprex package (v0.2.0).


#6

Got it working with a 1:1 named list and dplyr::recode():

suppressPackageStartupMessages(library(tidyverse))
library(forcats)

party_change <- list("No answer" = "missing", "Don't know" = "missing", 
                     "Other party" = "other", "Strong republican" = "rep", 
                     "Not str republican" = "rep", "Ind,near rep" = "ind", 
                     "Independent" = "ind", "Ind,near dem" = "ind", 
                     "Not str democrat" = "dem", "Strong democrat" = "dem")

partyid2 <- dplyr::recode(gss_cat$partyid, !!!party_change)

fct_count(partyid2)
#> # A tibble: 5 x 2
#>   f           n
#>   <fct>   <int>
#> 1 missing   155
#> 2 other     393
#> 3 rep      5346
#> 4 ind      8409
#> 5 dem      7180

Created on 2018-03-26 by the reprex package (v0.2.0).


#7

Thanks for keeping at it with this!! I imagine the 1:1 list might work with fct_recode as well (though I haven't tried it). I was trying to avoid a 1:1 mapping, since I had about 100 levels to recode into about 17, and at that point I might as well just make a find/replace table. At any rate, I ended up just passing the named arguments in the body of the fct_collapse call. Not pretty but it works :woman_shrugging:
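For reference, here is a rough sketch of what the fct_recode route might look like with the 1:1 mapping. It assumes fct_recode() accepts a named character vector spliced with !!!, with names as the new levels and values as the old ones, so the recode()-style mapping has to be flipped first. The name `flipped` is just for illustration.

```r
library(forcats)

# 1:1 mapping in dplyr::recode() orientation: old level = new level
party_change <- c(
  "No answer" = "missing", "Don't know" = "missing",
  "Other party" = "other", "Strong republican" = "rep",
  "Not str republican" = "rep", "Ind,near rep" = "ind",
  "Independent" = "ind", "Ind,near dem" = "ind",
  "Not str democrat" = "dem", "Strong democrat" = "dem"
)

# fct_recode() wants new = "old", so swap names and values
# (`flipped` is a hypothetical helper name)
flipped <- setNames(names(party_change), unname(party_change))

# Splice the named vector so each pair becomes its own argument
partyid2 <- fct_recode(gss_cat$partyid, !!!flipped)
fct_count(partyid2)
```

This still needs one entry per old level, so it doesn't get around the 1:1 bookkeeping.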


#8

Great solution, @mara! :clap:

I was wrong in my previous answer. I guess it's been a long day for me too :slight_smile: Sorry about that!

Just for reference, fct_collapse() expects a series of named character vectors, as you can see from the ... parameter of the function. A named list passed as a single argument is therefore not appropriate, and some manual conversion is needed, at least in the current version of forcats.
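One possible conversion, as a minimal sketch: if the installed forcats collects ... with tidy dots (I believe recent releases do, but treat that as an assumption and check your version), the named list can be spliced into the call with !!!, so each element arrives as its own named argument.

```r
library(forcats)

# Grouped mapping declared outside the pipeline:
# new level name -> old levels to merge
party_change <- list(
  missing = c("No answer", "Don't know"),
  other = "Other party",
  rep = c("Strong republican", "Not str republican"),
  ind = c("Ind,near rep", "Independent", "Ind,near dem"),
  dem = c("Not str democrat", "Strong democrat")
)

# !!! unpacks the list into separate named arguments
# (assumes fct_collapse() supports tidy dots in your forcats version)
partyid2 <- fct_collapse(gss_cat$partyid, !!!party_change)
fct_count(partyid2)
```

If splicing is supported, the counts should match those from spelling out the named arguments by hand.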


#9

I had cross-posted this in the R4DS slack chat and got this solution, which works & doesn't involve a 1:1 list!

args <- list(gss_cat$partyid,
  missing = c("No answer", "Don't know"),
  other = "Other party",
  rep = c("Strong republican", "Not str republican"),
  ind = c("Ind,near rep", "Independent", "Ind,near dem"),
  dem = c("Not str democrat", "Strong democrat"))
partyid2 <- do.call(fct_collapse, args)
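The same do.call() idea should also work when the named list is declared on its own, which is closer to what I was originally after. A minimal sketch, using only base R's c() and do.call() to prepend the factor as the first, unnamed argument:

```r
library(forcats)

# Named list declared separately from the call
party_change <- list(
  missing = c("No answer", "Don't know"),
  other = "Other party",
  rep = c("Strong republican", "Not str republican"),
  ind = c("Ind,near rep", "Independent", "Ind,near dem"),
  dem = c("Not str democrat", "Strong democrat")
)

# Put the factor first (unnamed, so it is matched positionally),
# followed by the named replacement vectors
args <- c(list(gss_cat$partyid), party_change)
partyid2 <- do.call(fct_collapse, args)
fct_count(partyid2)
```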