countrycode:: merge by NUTS2 name

Hi,

I have a dataset with variables reported at the NUTS 2 (regional) level, and I would like to merge it with the sf data from the eurostat package. The goal is a tidy dataset that combines this data with some other variables from eurostat. A minimal reprex showing the columns of both tables is below.
The problem is that the region names are spelled differently in the two tables: a merge on NUTS_NAME would be possible, but the eurostat names carry accents and the names in dat do not. dat also has a GDLCODE column, but I can't convert it to NUTS_ID.
Do you have any ideas about how I could merge them?

library(tidyverse)
setwd("C:/school/szem_8/TDK-fertility/fertilityEU")


dat <- read_csv("GDL-Sub-national-HDI-data.csv")
#> 
#> -- Column specification --------------------------------------------------------
#> cols(
#>   .default = col_double(),
#>   Country = col_character(),
#>   ISO_Code = col_character(),
#>   Level = col_character(),
#>   GDLCODE = col_character(),
#>   Region = col_character()
#> )
#> i Use `spec()` for the full column specifications.
# source: https://globaldatalab.org/shdi/

dat %>%
  filter(ISO_Code == "HUN") %>% 
  select(1:6) %>% 
  rename(NUTS_NAME = Region) %>% 
  glimpse
#> Rows: 8
#> Columns: 6
#> $ Country   <chr> "Hungary", "Hungary", "Hungary", "Hungary", "Hungary", "H...
#> $ ISO_Code  <chr> "HUN", "HUN", "HUN", "HUN", "HUN", "HUN", "HUN", "HUN"
#> $ Level     <chr> "National", "Subnat", "Subnat", "Subnat", "Subnat", "Subn...
#> $ GDLCODE   <chr> "HUNt", "HUNr107", "HUNr104", "HUNr106", "HUNr105", "HUNr...
#> $ NUTS_NAME <chr> "Total", "Del-Alfold", "Del-Dunantul", "Eszak-Alfold", "E...
#> $ `1990`    <dbl> 0.704, 0.679, 0.678, 0.673, 0.666, 0.690, 0.751, 0.705

eurostat::get_eurostat_geospatial(nuts_level = 2) %>% 
  data.frame() %>% 
  filter(CNTR_CODE == "HU") %>% 
  glimpse
#> sf at resolution 1:60 read from local file
#> Rows: 8
#> Columns: 8
#> $ id        <chr> "HU11", "HU12", "HU21", "HU22", "HU23", "HU31", "HU32", "...
#> $ CNTR_CODE <chr> "HU", "HU", "HU", "HU", "HU", "HU", "HU", "HU"
#> $ NUTS_NAME <chr> "Budapest", "Pest", "Közép-Dunántúl", "Nyugat-Dunántúl", ...
#> $ LEVL_CODE <int> 2, 2, 2, 2, 2, 2, 2, 2
#> $ FID       <chr> "HU11", "HU12", "HU21", "HU22", "HU23", "HU31", "HU32", "...
#> $ NUTS_ID   <chr> "HU11", "HU12", "HU21", "HU22", "HU23", "HU31", "HU32", "...
#> $ geometry  <MULTIPOLYGON [arc_degree]> MULTIPOLYGON (((18.93274 47..., MUL...
#> $ geo       <chr> "HU11", "HU12", "HU21", "HU22", "HU23", "HU31", "HU32", "...

Created on 2021-02-10 by the reprex package (v0.3.0)
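
One direction that might work for the accent mismatch described above is to build an accent-free, lower-case key in both tables and merge on that key. The following is only a sketch on hand-typed toy rows, not the full data. strip_accents() is a hypothetical helper name, the GDL rows and the HU21/HU22 rows are copied from the glimpse() output above, and the HU23 row ("Dél-Dunántúl") is an assumption, because that name is truncated in the output.

```r
# Hypothetical helper: map Hungarian accented letters to plain ASCII and
# lower-case the result, so "Dél-Dunántúl" and "Del-Dunantul" give the same key.
strip_accents <- function(x) {
  accented <- paste0(
    "\u00e1\u00e9\u00ed\u00f3\u00f6\u0151\u00fa\u00fc\u0171",  # lower-case accented vowels
    "\u00c1\u00c9\u00cd\u00d3\u00d6\u0150\u00da\u00dc\u0170"   # upper-case accented vowels
  )
  plain <- "aeiooouuuaeiooouuu"
  tolower(chartr(accented, plain, x))
}

# Toy rows copied from the GDL glimpse() output above
gdl <- data.frame(
  GDLCODE   = c("HUNr107", "HUNr104"),
  NUTS_NAME = c("Del-Alfold", "Del-Dunantul"),
  stringsAsFactors = FALSE
)

# Toy rows for the eurostat side; the HU23 name is an assumption (see above)
nuts <- data.frame(
  NUTS_ID   = c("HU21", "HU22", "HU23"),
  NUTS_NAME = c("Közép-Dunántúl", "Nyugat-Dunántúl", "Dél-Dunántúl"),
  stringsAsFactors = FALSE
)

# Build the same normalised key on both sides
gdl$key  <- strip_accents(gdl$NUTS_NAME)
nuts$key <- strip_accents(nuts$NUTS_NAME)

# Left join: keep every GDL row, attach NUTS_ID where the key matches
joined <- merge(gdl, nuts[, c("NUTS_ID", "key")], by = "key", all.x = TRUE)

# Rows that found no partner still need a manual mapping
unmatched <- joined[is.na(joined$NUTS_ID), ]
```

In this toy example "Del-Dunantul" picks up HU23, while "Del-Alfold" stays unmatched because its eurostat row is not included. On the real data, the rows left in unmatched would show which regions need a small hand-made lookup from GDLCODE to NUTS_ID. If the stringi package is installed, stringi::stri_trans_general(x, "Latin-ASCII") could probably replace the chartr() call and cover other languages as well.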
