How do I merge geographical coordinates with spatial data (shapefiles) using the sf package?

Dear R Studio community,

I have just started learning R, and this is my first time using this platform. I am working with two datasets: the first contains georeferenced data on conflict events and deaths in DR Congo; the second is a set of shapefiles for health districts/health zones in DR Congo. I want to merge the conflict data with the shapefiles so that I can assess the impact of conflict on some health indicators at the health zone level.

The problem is that the intersection dataset (conflicts and zones, named conflict.matching) ends up with zero observations. Could someone help me understand why? A screenshot is linked here and my R script is below:

https://www.dropbox.com/s/rqz8kbqw972yuji/Screen%20Shot%202019-07-20%20at%2015.34.02%20Merging%20data.png?dl=0

Importing packages

library(foreign) # to import the dbf file
library(sf) # for spatial data manipulation
library(tibble)

Importing the dbf file (it works)

adm3 <- read.dbf("Zone/COD_adm3.dbf") # it works

Importing conflict data

conflict= read.csv2("conflict_csv.csv", sep = ",") # It works

Transform latitude and longitude from factor to numeric format

conflict$latitude=as.numeric(conflict$latitude)
conflict$longitude=as.numeric(conflict$longitude)

Transforming conflict data into spatial data

conflict.sf <- st_as_sf(conflict, coords = c("longitude", "latitude"))

importing shape files for health zones (COD_adm3.shp)

zone <- read_sf(dsn="Zone/COD_adm3.shp") # It works

Check if the Coordinate Reference Systems are the same

st_crs(conflict.sf) # Coordinate Reference System: NA
st_crs(zone) # Coordinate Reference System EPSG: 4326

Change conflict.sf CRS to the same as for zones (EPSG: 4326)

st_crs(conflict.sf) <- 4326

intersection between conflict and spatial data in order to be able to merge the data

int1 <- sf::st_intersects(conflict.sf, zone) # an sgbp object is created

Rows of the conflict data with no matching zone in the shapefile

no.matching <- unlist(lapply(int1, function(x) length(x)==0))

Keep only the conflicts that fall inside a zone in zone.shp

conflict.matching <- conflict[!no.matching,]

Add shapefile columns to the conflict data

conflict.matching$Name <- as.character(zone$Name[unlist(int1)])
conflict.matching$ISO_Ctry <- as.character(zone$ISO_Ctry[unlist(int1)])
conflict.matching$LvlID <- as.character(zone$LvlID[unlist(int1)])
conflict.matching$LvllD <- as.character(zone$LvllD[unlist(int1)])
conflict.matching$PROVNAME <- as.character(zone$PROVNAME[unlist(int1)])
conflict.matching$Name_API <- as.character(zone$Name_API[unlist(int1)])

sapply(conflict.matching[43:48], table, useNA="always")

Hi, and welcome.

The crux of your problem is trying to combine point data (conflict long/lat) with polygon data (health zones). Creating a new combined geometry object would be one approach, but you'd need to take a deeper dive into the sf documentation than is needed for your purposes.
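For reference, the combined-object route mentioned above could look roughly like the sketch below, which uses sf::st_join to attach zone attributes to each point. The two square zones and four events are made-up toy data standing in for COD_adm3.shp and the conflict file, so treat the names and coordinates as assumptions.

```r
library(sf)

# Hypothetical stand-ins for the health zones: two unit squares side by side
square <- function(x0, y0) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1),
                        c(x0, y0 + 1), c(x0, y0))))
}
zone <- st_sf(Name = c("A", "B"),
              geometry = st_sfc(square(0, 0), square(1, 0), crs = 4326),
              stringsAsFactors = FALSE)

# Hypothetical conflict events; the last one lies outside both zones
conflict <- data.frame(event = 1:4,
                       longitude = c(0.2, 0.7, 1.5, 5.0),
                       latitude  = c(0.5, 0.3, 0.5, 5.0))

# Build the point layer with the same CRS as the zones
conflict.sf <- st_as_sf(conflict, coords = c("longitude", "latitude"), crs = 4326)

# Spatial join: each event gets the Name of the zone it intersects (NA if none)
joined <- st_join(conflict.sf, zone["Name"])
joined$Name
```

Events that fall in no zone are kept with NA in the joined columns, so they can be inspected rather than silently dropped.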

If I understand you correctly, your ultimate unit of analysis is the health zone, and the locations of conflicts within the health zone are not relevant.

For each health zone, extract its bounding box and, for each conflict, determine whether it lies within the box. If the number of conflicts is relevant, use an incrementing integer in your results vector; otherwise a logical vector is enough.
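A minimal sketch of that bounding-box check, again on made-up toy zones and events rather than the real files (the helper names square and count_in_bbox are hypothetical):

```r
library(sf)

# Hypothetical stand-ins for the health zones: two unit squares side by side
square <- function(x0, y0) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1),
                        c(x0, y0 + 1), c(x0, y0))))
}
zone <- st_sf(Name = c("A", "B"),
              geometry = st_sfc(square(0, 0), square(1, 0), crs = 4326),
              stringsAsFactors = FALSE)

# Hypothetical conflict events in decimal degrees; the last one is outside both zones
conflict <- data.frame(longitude = c(0.2, 0.7, 1.5, 5.0),
                       latitude  = c(0.5, 0.3, 0.5, 5.0))

# Count the events whose coordinates fall inside the bounding box of one zone
count_in_bbox <- function(geom, pts) {
  bb <- st_bbox(geom)
  sum(pts$longitude >= bb[["xmin"]] & pts$longitude <= bb[["xmax"]] &
      pts$latitude  >= bb[["ymin"]] & pts$latitude  <= bb[["ymax"]])
}

# One row per zone: an integer count plus a logical flag
result <- data.frame(
  Name = zone$Name,
  n_conflicts = vapply(st_geometry(zone), count_in_bbox, integer(1), pts = conflict)
)
result$any_conflict <- result$n_conflicts > 0
result
```

Keep in mind that a bounding box is a rectangle around the zone, so for irregularly shaped zones it can count events that are actually in a neighbouring zone; st_intersects tests against the polygon itself.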

Your result will be a data frame or tibble with each health zone, its conflict status and your health data of interest. Keep the geometry field if you foresee doing mapping (which is also the time to worry about CRS).
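One more thing that may be worth checking for the zero-match result, as a guess based on the script rather than something confirmed: read.csv2 assumes a decimal comma (dec = ","), so coordinates written with a decimal point are not read as numbers and, with the pre-4.0 default stringsAsFactors = TRUE, become factors. Calling as.numeric on a factor returns its integer level codes, not the printed values, which would put the events at arbitrary locations. The sketch below reproduces this on a tiny made-up CSV.

```r
# Made-up CSV text standing in for conflict_csv.csv
csv_text <- "event,latitude,longitude\n1,-1.68,29.22\n2,-4.32,15.31"

# read.csv2 expects a decimal comma, so "-1.68" is not parsed as a number;
# stringsAsFactors = TRUE mimics the default before R 4.0
bad <- read.csv2(text = csv_text, sep = ",", stringsAsFactors = TRUE)
class(bad$latitude)                         # "factor"

codes  <- as.numeric(bad$latitude)          # integer level codes, not coordinates
values <- as.numeric(as.character(bad$latitude))  # the real coordinates

# read.csv uses dec = "." and reads the columns as numeric directly
good <- read.csv(text = csv_text)
class(good$latitude)                        # "numeric"
```

If this is what happened, reading the file with read.csv (or converting via as.character first) should give coordinates that land inside the zones.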
