Using 2 shapefiles to match data

The Koppen Climate Classification is basically a shapefile that says what is the nature of the climate in a certain grid on the map in the world (e.g. for latitude A and longitude B, the climate is tropical). This can be downloaded from here: Data Catalog (for years 1976-2000).

I also have a dataset dividing Nigeria into 774 regions, called LGAs. This can be downloaded here: Shape file of Nigeria - Shape file for Nigeria - openAFRICA

Each of the 774 LGAs in Nigeria will have many grid points from Koppen. Therefore, every LGA may have one type of climate (if all Koppen grid points in the LGA have the same climate), or more (if different climate grid points in the LGA have different climates).

I need help coming up with a dataset that says for each of the LGAs what the climate is. If an LGA has more than one climate, then choose the climate with the most grid points within the LGA.

Any help is appreciated. Thank you!

Hi @gioriogio,
I made an attempt to answer your question - turned out to be quite a deep rabbit hole!
The shape files for LGAs and Climate regions (GEOCODES) define polygons, so I (think) I have worked out how to see which ones intersect:

library(tidyverse)
library(sf)

# I downloaded the ZIP files and extracted the shape files.
path <- "C:/Users/xxxxx/Downloads/c1976_2000_0/c1976_2000.shp"

data <- read_sf(path)
plot(data)

climate <- as.data.frame(data)
str(climate)

levels(as.factor(climate$GRIDCODE)) # 31 climate regions globally

path2 <- "C:/Users/xxxxx/Downloads/nigeria-lgas/new_lga_nigeria_2003.shp"
data2 <- read_sf(path2)
plot(data2)

lga <- as.data.frame(data2)
str(lga)

st_crs(data)   # Need to re-project this data into WGS84 to match data2
st_crs(data2)

old_crs <- st_crs(data)
desired_crs <- st_crs(data2)

data_1 <- st_transform(data, crs=desired_crs)
st_crs(data_1)

sf_use_s2(FALSE)
data_joined <- st_join(data2, data_1, .predicate=st_intersects)

data_joined %>% 
  group_by(LGA, GRIDCODE) %>% 
  summarise(freq = n()) %>%
  as.data.frame(.) %>% 
  select(-geometry) -> summary_data

summary_data

# Some LGAs intersect with 2 GRIDCODES but most only with 1.
with(summary_data, table(LGA, GRIDCODE))

addmargins(with(summary_data, table(LGA, GRIDCODE)))

# Only 4 climate regions are present in Nigeria - is this correct?
levels(as.factor(data_joined$GRIDCODE))

Disclaimer: you need to check everything this code does!

2 Likes

Thank you so much for this. It makes perfect sense

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.