Generate epidemiological risk maps for Random Forest

The sf package can read shapefiles and returns an object that works like a data frame, so other variables can be added to it. In your case, those would be a variable for the neighborhood name, one for the estimated probability and, potentially, a categorical variable, such as yes/no or low/high. ggplot2 can then draw a thematic map, also called a choropleth map, to illustrate the data.
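
As a rough sketch of that workflow, the code below uses the North Carolina counties shapefile that ships with sf as a stand-in for your neighborhoods. The `risk_prob` and `risk_class` columns are made up for illustration (random numbers, not model output); in practice you would join your Random Forest predictions to the sf object by neighborhood name.

```r
library(sf)
library(ggplot2)

# Stand-in for a neighborhood shapefile: the demo file bundled with sf
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Hypothetical estimated probability, one value per polygon
# (random numbers here; replace with your model's predictions)
set.seed(42)
nc$risk_prob <- runif(nrow(nc))

# Hypothetical categorical version of the same variable
nc$risk_class <- factor(
  ifelse(nc$risk_prob >= 0.5, "high", "low"),
  levels = c("low", "high")
)

# Choropleth map: polygons filled by the estimated probability
p <- ggplot(nc) +
  geom_sf(aes(fill = risk_prob)) +
  scale_fill_viridis_c(name = "Estimated\nprobability", limits = c(0, 1)) +
  theme_minimal()

print(p)
```

Because `nc` is still a data frame, the usual tools (`$<-`, `merge()`, dplyr verbs) work for adding columns, and the geometry column travels along with them.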

See 5. Plotting Simple Features in the sf vignette for examples. My somewhat outdated post may help with some of the ggplot details.

The two problems that you are most likely to face are

  1. Installing the external library dependencies for sf
  2. Understanding how ggplot treats continuous and discrete scales differently
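
To illustrate the second point, here is a minimal sketch, again using sf's bundled North Carolina file and invented `risk_prob` values. A numeric column needs a continuous fill scale, while a factor needs a discrete one; pairing a numeric column with a discrete scale such as `scale_fill_manual()` stops with a "Continuous value supplied to discrete scale" error. The three-way split and the colors are arbitrary choices for the example.

```r
library(sf)
library(ggplot2)

nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Hypothetical probabilities (random, for illustration only)
set.seed(1)
nc$risk_prob <- runif(nrow(nc))

# Bin the numeric variable into an ordered set of classes
nc$risk_class <- cut(
  nc$risk_prob,
  breaks = c(0, 1/3, 2/3, 1),
  labels = c("low", "medium", "high"),
  include.lowest = TRUE
)

# Numeric column -> continuous scale (color gradient, colorbar legend)
p_continuous <- ggplot(nc) +
  geom_sf(aes(fill = risk_prob)) +
  scale_fill_viridis_c(name = "Probability")

# Factor column -> discrete scale (one color per level, keyed legend)
p_discrete <- ggplot(nc) +
  geom_sf(aes(fill = risk_class)) +
  scale_fill_manual(
    name = "Risk",
    values = c(low = "#fee8c8", medium = "#fdbb84", high = "#e34a33")
  )

print(p_continuous)
print(p_discrete)
```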

See the FAQ: How to do a minimal reproducible example (reprex) for beginners, and come back with a representative sf object that illustrates any plotting problems you run into.
