Close enough. I'll use the names as shown in your screenshot for the tibbles.
I'll demonstrate the syntax by performing a left_join()
. You can swap it for inner_join()
or any other join depending on your requirements.
library(dplyr, warn.conflicts = FALSE)
hb_lookup_2020 <- tribble(
~HB, ~HBName, ~HBDateEnacted, ~HBDateArchived, ~Country,
"S08000015", "NHS Ayrshire and Arran", 20140401, NA, "S92000003",
"S08000016", "NHS Borders", 20140401, NA, "S92000003",
"S08000017", "NHS Dumfries and Galloway", 20140401, NA, "S92000003",
"S08000018", "NHS Fife", 20140401, 20180201, "S92000003",
"S08000019", "NHS Forth Valley", 20140401, NA, "S92000003",
"S08000020", "NHS Grampian", 20140401, NA, "S92000003"
)
ui_deaths_2020 <- structure(list(
Year = c(2007, 2007, 2007, 2007, 2007, 2007),
HBR = c("S92000003", "S92000003", "S92000003", "S92000003", "S92000003", "S92000003"),
HBRQF = c("d", "d", "d", "d", "d", "d"),
CA = c("S92000003", "S92000003", "S92000003", "S92000003", "S92000003", "S92000003"),
CAQF = c("d", "d", "d", "d", "d", "d"),
AgeGroup = c("All", "All", "All", "All", "All", "All"),
AgeGroupQF = c("d", "d", "d", "d", "d", "d"),
Sex = c("All", "All", "All", "All", "All", "All"),
SexQF = c("d", "d", "d", "d", "d", "d"), InjuryLocation = c("All", "All", "All", "All", "All", "All"),
InjuryLocationQF = c("d", "d", "d", "d", "d", "d"),
InjuryType = c("All", "Land transport accidents", "Poisoning", "Falls", "Struck by, against", "Crushing"),
InjuryTypeQF = c("d", NA, NA, NA, NA, NA),
NumberOfDeaths = c(1289, 294, 63, 658, 5, 1)
),
row.names = c(NA, -6L),
class = c("tbl_df", "tbl", "data.frame")
)
hb_lookup_2020 %>%
left_join(ui_deaths_2020, by = c("HB" = "HBR", "Country" = "CA"))
#> # A tibble: 6 x 17
#> HB HBName HBDateEnacted HBDateArchived Country Year HBRQF CAQF AgeGroup
#> <chr> <chr> <dbl> <dbl> <chr> <dbl> <chr> <chr> <chr>
#> 1 S080~ NHS A~ 20140401 NA S92000~ NA <NA> <NA> <NA>
#> 2 S080~ NHS B~ 20140401 NA S92000~ NA <NA> <NA> <NA>
#> 3 S080~ NHS D~ 20140401 NA S92000~ NA <NA> <NA> <NA>
#> 4 S080~ NHS F~ 20140401 20180201 S92000~ NA <NA> <NA> <NA>
#> 5 S080~ NHS F~ 20140401 NA S92000~ NA <NA> <NA> <NA>
#> 6 S080~ NHS G~ 20140401 NA S92000~ NA <NA> <NA> <NA>
#> # ... with 8 more variables: AgeGroupQF <chr>, Sex <chr>, SexQF <chr>,
#> # InjuryLocation <chr>, InjuryLocationQF <chr>, InjuryType <chr>,
#> # InjuryTypeQF <chr>, NumberOfDeaths <dbl>
Created on 2020-08-07 by the reprex package (v0.3.0)
Note: When you perform a join only the key columns from the left table are retained by default i.e. HB
and Country
from hb_lookup_2020
in this case (which I believe is what you want).
You can follow this up with a rename()
statement if you wish to change the names. For example, if you prefer the names of the keys in the right table, you could do the following.
hb_lookup_2020 %>%
left_join(ui_deaths_2020, by = c("HB" = "HBR", "Country" = "CA")) %>%
rename(HBR = HB, CA = Country)
# A tibble: 6 x 17
HBR HBName HBDateEnacted HBDateArchived CA Year HBRQF CAQF AgeGroup AgeGroupQF Sex SexQF InjuryLocation InjuryLocationQF
<chr> <chr> <dbl> <dbl> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 S080~ NHS A~ 20140401 NA S920~ NA NA NA NA NA NA NA NA NA
2 S080~ NHS B~ 20140401 NA S920~ NA NA NA NA NA NA NA NA NA
3 S080~ NHS D~ 20140401 NA S920~ NA NA NA NA NA NA NA NA NA
4 S080~ NHS F~ 20140401 20180201 S920~ NA NA NA NA NA NA NA NA NA
5 S080~ NHS F~ 20140401 NA S920~ NA NA NA NA NA NA NA NA NA
6 S080~ NHS G~ 20140401 NA S920~ NA NA NA NA NA NA NA NA NA
# ... with 3 more variables: InjuryType <chr>, InjuryTypeQF <chr>, NumberOfDeaths <dbl>
Hope that helps.