Plotting shapefiles with attributes using ggplot

Dear All,

I am trying to plot a shapfile (river network or streamlines) by filling the color from values in attribute table (e.g., Duration).
the head of the shapefile is

   readOGR("Duration.shp")
   OGR data source with driver: ESRI Shapefile 
    Source: "c:\Duration.shp", layer: "Duration"
     with 10291 features
      It has 15 fields
      Integer64 fields read as strings:  COMID_1 

head(shapefile)
       long      lat order piece id group      maxDef    Severity Duration   Magnitude
1 -2.979167 14.76667     1     1  0   0.1 0.005086238 4.447949231     3426 0.001203216
2 -2.979167 14.75417     2     1  0   0.1 0.005086238 4.447949231     3426 0.001203216
3 -2.900000 14.82500     1     1  1   1.1 0.025923798 20.47003689     3398 0.006718117
4 -2.937500 14.78750     2     1  1   1.1 0.025923798 20.47003689     3398 0.006718117
5 -2.945833 14.78750     3     1  1   1.1 0.025923798 20.47003689     3398 0.006718117
6 -2.945833 14.77083     4     1  1   1.1 0.025923798 20.47003689     3398 0.006718117

I tested the following using geom_polygon, line, and path but I didnt work.

ggplot() + **geom_polygon**(data = shp_df, aes(x = long, y = lat, group = group, fill =Duration)

Best regards,

Solomon

Hi Solomon,

Welcome.

I recommend trying the sf package, as in my opinion it simplifies a lot of the handling of maps.

For future reference, a reproducible example to work with would help to be certain, but try something like this:

library(ggplot2)
library(sf)

shp <- read_sf('Duration.shp')
ggplot(shp) + geom_sf(aes(colour = Duration))

or, if the rivers are polygons rather than lines:

ggplot(shp) + geom_sf(aes(fill = Duration))

Ron.

1 Like

Hi Ron,

Thanks for the tips. the rivers are lines and for every line there is avalue in the attribute table (e.g. Duration). i tried

shp <- read_sf('Duration.shp')
ggplot(shp) + geom_sf(aes(colour = Duration))

but it is not producing a map rather the values see image attached.

This is where a reproducible example would be helpful.

It looks like the values in the Duration column are being treated as a factor (discrete) rather than as (continuous) numeric values.

Questions:

  1. do you get a map if you just use ggplot(shp) + geom_sf()?
  2. what type of data is in the column Duration? (type shp and return, you should get an indication of the data type below the column name).

Ron.

Dear Ron,

Yes, I think the values were treated as a factor. I changed the values to numeric and it is working.

Thanks,

Solomon

Dear Ron,
the values in the file ( Duration) ranges from 0-3001 and i want to classfy them into 5 using somthing like breaks,
breaks=c(0, 2776, 3061, 3272, 3558, 4370)

Is there a way to add this in the following codes,

shp <- read_sf('Duration.shp')
ggplot(shp) + geom_sf(aes(colour = Duration))

Best regards,

Solomon

1 Like

Sorry not to have replied before, I've been a bit busy.

You could look at ?cut as that will split numbers into categories as you want.

However, you say the values go from 0-3001, but your screenshot from before showed values above 4000 and you are now saying you want a top category above 4000 too. I think you may not have converted from factor to numeric correctly. Can you show the code you've used?

I'd take a guess but I'm on a bus and would probably mess it up.

Ron.

1 Like

Right. No longer on a bus. So here's a guess which might still be messed up:

library(tidyverse)
library(sf)
shp <- read_sf('Duration.shp') %>%
              # Convert to numeric via character
              # (ie label of factor Duration converted to numeric,
              # I suspect that you did as.numeric(Duration) which would have
              # converted the level of Duration to a number)
       mutate(Duration = as.numeric(as.character(Duration)),
              # convert numberic values into a small number of caegories
              fDuration = cut(Duration, breaks = c(0, 2776, 3061, 3272, 3558, 4370)))
ggplot(shp) + geom_sf(aes(colour = fDuration))

nb, I haven't run this code. You should check the outputs to make sure they make sense and correspond to what you expect. Again, check out ?cut to understand what it does and what it can do if your requirements are different from what I have done.

And a reprex really would be helpful, it would help me and others to help you and creating one might help you to find and possibly fix the problems for yourself.

Ron.

Dear Ron,

Thanks for the help and the code. I found the cut option and I used a quantile method to class the data and compare the results from other method. Now, I have 3 Durations from 3 methods combined in shp

duration1= 0-6642

duration2=0-2089

duration3=0-4367

This is what I did

 a<-cut(shp$Duration1, breaks=data.frame(classIntervals(shp$Duration1, n=5,method="quantile")[2])[,1],include.lowest=T)

 f1= ggplot(shp) + geom_sf(aes(colour=a))

b<-cut(shp$Duration2, breaks=data.frame(classIntervals(shp$Duration2, n=5,method="quantile")[2])[,1],include.lowest=T)

f2= ggplot(shp) + geom_sf(aes(colour=b))

c<-cut(shp$Duration3, breaks=data.frame(classIntervals(shp$Duration3, n=5,method="quantile")[2])[,1],include.lowest=T)

f2= ggplot(shp) + geom_sf(aes(colour=c))

Now, I need to create a common legend and plot in one page. i tried grid_arrenge_shared_legend

grid_arrange_shared_legend(f1, f2, f3, ncol = 3, nrow = 1, position = "bottom")

But this is not creating a common legend, only display for the first plot (f1). any idea? thanks again for your help and time.

Best regards,

Solomon

I don't have data to play with, but if I did I'd be looking at using gather to make the make a long version of the data with Duration1, Duration2 and Duration3 (probably having used st_drop_geometry to drop the coordinate info) into a single column, then cut to categorise that whole column in some way before left_join(shp, .) back to shp ready to plot. Then facet_wrap to split into three graphs with one consistent legend when ggplotting.

I'm not clear on whether the quantiles should be aligned (eg same value defining the breaks for each Duration) or the quantiles should be aligned (eg this thing is in quantile A for Duration1, quantile B for Duration2 etc). This depends on what you want to show and talk about.

A reprex would make this easier.

Ron.