Generate epidemiological risk maps for Random Forest

Hi everyone, I'm working with a classifier whose predictions give the probability of a certain disease occurring in a certain place. The classifier is a Random Forest, and I get the probabilities as a data frame in the following format:

Yes     No
0.98    0.02
0.2     0.98
0.80    0.10
0.50    0.50

Yes = has the disease; No = does not have the disease

I want to map these probabilities onto their corresponding locations. Each probability is associated with a number that identifies the place, for example:

neighborhood 1 = 1
neighborhood 2 = 2
and so on.

So each neighborhood has an associated probability, and I would like to show it on the map. I have a shapefile of the locations. Any ideas on how I can do this? I'm working in R Markdown and I'm new to the language. Any help is welcome, and thanks in advance!

The sf package can read shapefiles and returns an object that works like a data frame, so other variables can be added to it. In your case, that would be a variable for the neighborhood name, one for the estimated probability and, potentially, a categorical variable, such as yes/no, low/high, etc. ggplot can then produce a map of the data, called a thematic or choropleth map.
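As a rough sketch of that workflow (not a definitive implementation): the example below uses the North Carolina shapefile that ships with sf as a stand-in for your own file, and the `id` column, the `probs` data frame and its random values are all made up for illustration. With your data you would call `st_read()` on your own shapefile and use your real classifier output.

```r
library(sf)
library(ggplot2)

# Stand-in shapefile bundled with sf; replace with st_read("path/to/your.shp")
neighborhoods <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Hypothetical neighborhood ID: number the polygons 1, 2, 3, ...
neighborhoods$id <- seq_len(nrow(neighborhoods))

# Hypothetical classifier output, one row per neighborhood (random values)
set.seed(1)
probs <- data.frame(id = seq_len(nrow(neighborhoods)))
probs$Yes <- runif(nrow(probs))
probs$No <- 1 - probs$Yes

# Attach the "Yes" probability to each polygon by matching on the ID
neighborhoods$prob_yes <- probs$Yes[match(neighborhoods$id, probs$id)]

# Choropleth map: fill each polygon by its estimated probability
p <- ggplot(neighborhoods) +
  geom_sf(aes(fill = prob_yes)) +
  scale_fill_viridis_c(limits = c(0, 1), name = "P(disease)") +
  theme_minimal()
p
```

Matching on an explicit ID, rather than relying on row order, keeps the probabilities aligned with the right polygons even if the two tables are sorted differently.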

See 5. Plotting Simple Features in the sf vignette for examples. My somewhat outdated post may help with some of the ggplot details.

The two problems that you are most likely to face are

  1. Installing the external library dependencies for sf
  2. Understanding how ggplot treats continuous and discrete scales differently
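To illustrate the second point, here is a minimal sketch, again using the shapefile bundled with sf and made-up probabilities; the `prob_yes` and `risk` columns and the cut points are assumptions chosen for the example. A numeric column needs a continuous fill scale, while a factor needs a discrete one.

```r
library(sf)
library(ggplot2)

# Stand-in shapefile bundled with sf, with made-up probabilities
nbhd <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
set.seed(42)
nbhd$prob_yes <- runif(nrow(nbhd))

# Discrete version: bin the probability into three risk classes
# (the break points here are arbitrary)
nbhd$risk <- cut(nbhd$prob_yes,
                 breaks = c(0, 0.33, 0.66, 1),
                 labels = c("low", "medium", "high"),
                 include.lowest = TRUE)

# Numeric variable -> continuous colour gradient
p_cont <- ggplot(nbhd) +
  geom_sf(aes(fill = prob_yes)) +
  scale_fill_viridis_c(name = "P(disease)")

# Factor variable -> one colour per level
p_disc <- ggplot(nbhd) +
  geom_sf(aes(fill = risk)) +
  scale_fill_brewer(palette = "YlOrRd", name = "Risk")
```

Pairing a continuous scale with the factor (or the reverse) typically fails when the plot is drawn, with an error about a discrete value being supplied to a continuous scale, so checking the column type first usually resolves it.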

See the FAQ: How to do a minimal reproducible example (reprex) for beginners, and come back with a representative sf object that illustrates any plotting problems you run into.

Thank you very much for the answer. I will mark it as the solution to my question.
