How to work on dataframes with different lengths

#1

I have two datasets of different lengths. How can I compute the differences between them over the time period they have in common? The sample data below shows what I mean.

DF1
year value
1981 350
1982 910
1983 500
1984 312
1986 460
1987 510

DF2
year value
1983 311
1984 550
1985 270
1986 480
1987 499
1988 560
1989 570
1990 601

I want a new dataframe that shows the difference between the two for the years they have in common. I tried DF1$value - DF2$value, but it says that "‘-’ only defined for equally-sized data frames". Also, the years in DF1 are not consecutive (1984 is followed by 1986). The new dataframe could either show NA for 1985 or drop that year, as long as it has values for the other common years. It should also have a column named "year". How can I do this? Thanks for your help.


#2

Hi, what you want to do is done with joins. Here is an example that should put you on the right track:

suppressPackageStartupMessages(library(tidyverse))

tibble1 <- tibble(year = as.character(1981:1989), value = rnorm(9))
tibble2 <- tibble(year = sample(as.character(1981:1989), size = 6), value = rnorm(6))

tibble1
#> # A tibble: 9 x 2
#>   year    value
#>   <chr>   <dbl>
#> 1 1981  -0.580 
#> 2 1982  -0.0376
#> 3 1983   0.913 
#> 4 1984  -0.737 
#> 5 1985  -0.387 
#> 6 1986   0.0912
#> 7 1987   3.76  
#> 8 1988   1.10  
#> 9 1989  -0.940
tibble2
#> # A tibble: 6 x 2
#>   year   value
#>   <chr>  <dbl>
#> 1 1987   0.874
#> 2 1989  -0.282
#> 3 1984  -0.100
#> 4 1982   1.01 
#> 5 1983   0.672
#> 6 1986   1.34

tibble1$value - tibble2$value
#> Warning in tibble1$value - tibble2$value: longer object length is not a
#> multiple of shorter object length
#> [1] -1.4531519  0.2439863  1.0131162 -1.7449109 -1.0591744 -1.2475435
#> [7]  2.8834482  1.3785701 -0.8390908

dplyr::full_join(tibble1, tibble2, by = "year") %>%
  dplyr::mutate(diff = value.x - value.y)
#> # A tibble: 9 x 4
#>   year  value.x value.y    diff
#>   <chr>   <dbl>   <dbl>   <dbl>
#> 1 1981  -0.580   NA      NA    
#> 2 1982  -0.0376   1.01   -1.05 
#> 3 1983   0.913    0.672   0.241
#> 4 1984  -0.737   -0.100  -0.637
#> 5 1985  -0.387   NA      NA    
#> 6 1986   0.0912   1.34   -1.25 
#> 7 1987   3.76     0.874   2.88 
#> 8 1988   1.10    NA      NA    
#> 9 1989  -0.940   -0.282  -0.658

Created on 2018-10-23 by the reprex package (v0.2.1)
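
For the sample data in the question, the same idea might look like the sketch below. It assumes DF1 and DF2 are plain data frames with numeric year and value columns, as typed out in the question. The suffixes "_df1" and "_df2" and the result name common are only illustrative choices. Here inner_join() is used instead of full_join(), so only the years present in both data frames are kept:

```r
library(dplyr)

# Sample data copied from the question
DF1 <- data.frame(
  year  = c(1981, 1982, 1983, 1984, 1986, 1987),
  value = c(350, 910, 500, 312, 460, 510)
)
DF2 <- data.frame(
  year  = c(1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990),
  value = c(311, 550, 270, 480, 499, 560, 570, 601)
)

# inner_join() keeps only the years that appear in both data frames;
# suffix renames the two clashing "value" columns
common <- inner_join(DF1, DF2, by = "year", suffix = c("_df1", "_df2")) %>%
  mutate(diff = value_df1 - value_df2) %>%  # DF1 minus DF2 for each year
  select(year, diff)

common
```

This should give one row each for 1983, 1984, 1986 and 1987, with differences of 189, -238, -20 and 11. If you would rather keep every year and see NA where one side is missing (1985, for example), swapping inner_join() for full_join() as in the reprex above should do that.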

In the future, it is always helpful to try to put your question in a reprex (similar to what I did above). There is more info about how to do it here: