Add label to a tibble data using a stata database

Hello,
I'm appending some tables I created using tidyverse.
I'm using bind_rows because the number of columns isn't the same.
However, when I try to recode values from NA to "Total", I can't make It.
I'm using a stata file with labels, so I want to add just one level at the moment I bind the tables.

library(haven)
library(tidyverse)
yyz=read_dta("http://www.stata-press.com/data/r8/auto.dta")


tab1=yyz %>% group_by(foreign,headroom) %>% summarise(price=mean(price,na.rm=T))
tab2=yyz %>% group_by(headroom) %>% summarise(price=mean(price,na.rm=T))

output=bind_rows(tab1,tab2) %>% print(n=100)

output %>% mutate(foreign=fct_explicit_na(foreign,na_level = "Total")) %>% print(n=100)

This is what I try to reproduce


    foreign headroom price
       <dbl+lbl>    <dbl> <dbl>
 1  0 [Domestic]      1.5 5270.
 2  0 [Domestic]      2   4554.
 3  0 [Domestic]      2.5 6354.
 4  0 [Domestic]      3   6222.
 5  0 [Domestic]      3.5 7441.
 6  0 [Domestic]      4   6458.
 7  0 [Domestic]      4.5 5018 
 8  0 [Domestic]      5   4060 
 9  1 [Foreign]       1.5 6229 
10  1 [Foreign]       2   5719.
11  1 [Foreign]       2.5 6686.
12  1 [Foreign]       3   5538.
13  1 [Foreign]       3.5 8492.
14  3 [Total]         1.5 5510.
15  3 [Total]         2   4823.
16  3 [Total]         2.5 6592.
17  3 [Total]         3   5906.
18  3 [Total]         3.5 7581.
19  3 [Total]         4   6458.
20  3 [Total]         4.5 5018 
21  3 [Total]         5   4060 


But the if_else from dplyr doesn't coerce data types(foreign), so It gives an error.
IFELSE coerce the foreign, but transform the dbl+lbl into dlb losing all the labels.
The fct_explicit_na shows:



output %>% mutate(foreign=fct_explicit_na(foreign,na_level = "Total")) %>% print(n=100)
Error: Problem with `mutate()` column `foreign`.
i `foreign = fct_explicit_na(foreign, na_level = "Total")`.
x `f` must be a factor (or character vector).
i The error occurred in group 1: foreign = 0.
Run `rlang::last_error()` to see where the error occurred.

Is it possible what I'm trying to do?
Thanks for your time and interest, folks
Have a nice day

I am not sure what you want to achieve. Maybe you get some ideas from this code:

library(haven)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
yyz=read_dta("http://www.stata-press.com/data/r8/auto.dta")

#class(yyz)
#attributes(yyz)
#attributes(yyz$make)

tab1=yyz %>% group_by(foreign,headroom) %>% 
  summarise(price=mean(price,na.rm=T)) %>% ungroup()
#> `summarise()` has grouped output by 'foreign'. You can override using the `.groups` argument.
tab2=yyz %>% group_by(headroom) %>% summarise(price=mean(price,na.rm=T))

labs <- attr(tab1$foreign,"labels" )
labs_df <- tibble::tibble(foreign= labs,foreign_c=names(labs))

attributes(tab1$foreign) <- NULL

output <- bind_rows( tab1 ,tab2 ) %>% 
  left_join(labs_df,by=c(foreign='foreign'),copy=T) %>%
  mutate(foreign = if_else(is.na(foreign),'Total',foreign_c)) %>%
  select(-foreign_c) %>%
  print()
#> # A tibble: 21 x 3
#>    foreign  headroom price
#>    <chr>       <dbl> <dbl>
#>  1 Domestic      1.5 5270.
#>  2 Domestic      2   4554.
#>  3 Domestic      2.5 6354.
#>  4 Domestic      3   6222.
#>  5 Domestic      3.5 7441.
#>  6 Domestic      4   6458.
#>  7 Domestic      4.5 5018 
#>  8 Domestic      5   4060 
#>  9 Foreign       1.5 6229 
#> 10 Foreign       2   5719.
#> # ... with 11 more rows
Created on 2021-07-29 by the reprex package (v2.0.0)

Thanks, but I think your code only transform foreign to character.
It's not adding one level to the factors using the loaded from the stata file.

Hello @jfca283,

you are quite right, but besides that it also not doing a lot of other stuff :grinning:

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.