R data frame relevant columns operations


Hi,
If I have a data frame like these, and I want to use ECredit-E, ICredit-I, what's the best way to do so?
(I know i can mutate them one by one)
Thank you.

You seem to already know this, but yes mutate() could be your friend here

What do you want to do exactly that this does not cover ?

1 Like

Instead of this mutate(E=E-E0,I=I-I0,O=O-O0,T=T-T0,W=W-W0)
Do we have a more efficient way to do that?
I was thinking about mutate_at(vars(contains("0")),~.-vars(contains(substr(.,1,1)))
I know it's wrong, cuz the substr(.,1,1) should operate at the column name but the dot refers to the value.

You'd be better off if you got your data into a tidy format first. Here's an approach using just the first two rows of your dataset and the E and I variable pairs.

library(tidyverse)

tbl <- tribble(~Year, ~E, ~I, ~ECredit, ~ICredit,
              2020, 350433449, 55809325, 378241084, 56983482,
              2021, 338746095, 53934351, 366116650, 55173825)

print(tbl)
#> # A tibble: 2 x 5
#>    Year         E        I   ECredit  ICredit
#>   <dbl>     <dbl>    <dbl>     <dbl>    <dbl>
#> 1  2020 350433449 55809325 378241084 56983482
#> 2  2021 338746095 53934351 366116650 55173825

Tidying the data to make difference calculation easier:

output_long <- tbl %>% 
  pivot_longer(-Year) %>% 
  separate(name, into = c("Prefix", "Suffix"), sep = 1) %>% 
  # add a suffix so that the pivoted column name is not NULL
  mutate(Suffix = if_else(Suffix == "", "Original", Suffix)) %>% 
  pivot_wider(names_from = Suffix, values_from = value) %>% 
  mutate(Difference = Credit - Original) %>% 
  
print(output_long)
#> # A tibble: 4 x 5
#>    Year Prefix  Original    Credit Difference
#>   <dbl> <chr>      <dbl>     <dbl>      <dbl>
#> 1  2020 E      350433449 378241084   27807635
#> 2  2020 I       55809325  56983482    1174157
#> 3  2021 E      338746095 366116650   27370555
#> 4  2021 I       53934351  55173825    1239474

Now to get the original structure back:

output_wide <- output_long %>% 
  pivot_longer(c(-Year, -Prefix)) %>% 
  unite("var", Prefix, name, sep = "") %>% 
  # remove the previously added suffix
  mutate(var = str_replace(var, "Original", "")) %>% 
  pivot_wider(names_from = var)

print(output_wide)
#> # A tibble: 2 x 7
#>    Year         E   ECredit EDifference        I  ICredit IDifference
#>   <dbl>     <dbl>     <dbl>       <dbl>    <dbl>    <dbl>       <dbl>
#> 1  2020 350433449 378241084    27807635 55809325 56983482     1174157
#> 2  2021 338746095 366116650    27370555 53934351 55173825     1239474

Needless to say, this approach only makes sense if you have lots of pairs of variables. If it's just a handful, multiple mutate statements will likely be much easier.

Created on 2020-02-24 by the reprex package (v0.3.0)

3 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.