County-level plot has gaps between the states

Hi everyone,

I am trying to plot the geographical distribution of wheat production using a county-level data set.

To create the data set, I combined a data set of county-level FIPS codes, which has a unique latitude-longitude for each FIPS code, with a data set of county-level wheat production.

I do not know why, but the map shows gaps between states instead of having them connected (figure below).
[Figure: wheat_production]

library(dplyr)    # %>% and filter()
library(ggplot2)  # ggplot(), geom_polygon()
library(maps)

# Keep the 1989 wheat rows, then draw one polygon per state
g43 <- wheat_corn_insurance1 %>%
  filter(year == 1989, wheat == 1) %>%
  ggplot(aes(x = longitude, y = latitude, group = state, fill = stateproduction / 1000)) +
  geom_polygon(color = NA) +
  scale_fill_gradient(low = "white", high = "red") +
  labs(title = "County-Level Choropleth Map",
       subtitle = "Wheat Production (thousand bu), 1989")

g43

I tried using the urbnmapr package as well, but I received a similar result.

Can someone help me with this?

Does every county grow wheat?

No, it does not. In the ggplot command above, I grouped the data by state, and production is also at the state level.

Well, that’s what it would look like if, for example, Elko County, NV is included. One possible reason is that NA ≠ 0. In fact, for spring wheat, only a handful of counties had more than 5,000 acres planted in 2022.
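
To make the NA-versus-0 point concrete, here is a minimal sketch with made-up numbers. The data frame names `counties` and `wheat`, the FIPS-like codes, and the production values are all hypothetical. A left join leaves any county without a wheat record as NA, which a fill scale treats as missing rather than as zero production.

```r
library(dplyr)

# Hypothetical inputs: every county on the map, and only the counties
# that have a wheat record (codes and values are made up)
counties <- tibble(fips = c("00001", "00002", "00003"))
wheat    <- tibble(fips = c("00002", "00003"),
                   production = c(5200, 8100))  # thousand bu, made up

# Counties with no matching wheat row get NA, not 0
joined <- left_join(counties, wheat, by = "fips")
joined$production

# Convert NA to 0 only if "no record" really means "no production"
filled <- mutate(joined, production = coalesce(production, 0))
filled$production
```

Whether to keep the NA or replace it with 0 is a judgment call about the data: NA says "unknown", 0 says "known to be none", and the two are drawn differently on a choropleth.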

Yes, that's true. But the ggplot command groups the data at the state level and also plots production at the state level. With this setup, shouldn't I still expect to see well-connected states?

Oh, I see now, after enlarging the map: there are no borders, either between counties or at the state level.
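
One likely reason for the gaps, assuming each county row carries a single latitude-longitude point as described in the original post: geom_polygon() draws one polygon per group through the points it is given, in row order. With group = state, each state's shape is traced through its county points rather than along the state border, so neighbouring shapes cannot meet. A minimal sketch of the polygon-based alternative, using the state outlines shipped with {maps} through ggplot2::map_data(); the `production` table and its values are made up for illustration:

```r
library(ggplot2)
library(dplyr)

# State outlines from the maps package: many boundary vertices per state,
# plus a `group` column that identifies each separate polygon piece
states <- map_data("state")

# Hypothetical state-level production (thousand bu); values are made up
production <- tibble(
  region = c("kansas", "north dakota", "montana"),
  stateproduction = c(300000, 250000, 150000)
)

# left_join() keeps the row order of `states`, so vertex order is preserved
states_wheat <- left_join(states, production, by = "region")

p <- ggplot(states_wheat,
            aes(x = long, y = lat, group = group, fill = stateproduction)) +
  geom_polygon(color = "grey40") +
  coord_quickmap() +
  scale_fill_gradient(low = "white", high = "red", na.value = "grey90")
p
```

Because neighbouring states share boundary vertices here, the filled shapes touch, and states without a production value are drawn in the na.value colour instead of disappearing.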

This can be fixed most simply with the {sf} package, which uses a data frame with a column that handles spatial geometry. There’s a bit of a lift to install one of the system library dependencies, but you get very fine-grained control.
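
A minimal sketch of the {sf} workflow, using toy geometries so it runs without downloading anything. Everything here is hypothetical: the `unit_square()` helper, the FIPS-like codes, and the production values. With real data, the squares would be replaced by actual county boundaries (for example, tigris::counties(cb = TRUE) returns an sf object whose GEOID column holds the five-digit county FIPS code).

```r
library(sf)
library(dplyr)
library(ggplot2)

# Hypothetical helper: a closed unit square with lower-left corner (x0, y0)
unit_square <- function(x0, y0) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1), c(x0, y0 + 1), c(x0, y0)
  )))
}

# Toy "counties": three squares that share edges, keyed by made-up codes
toy_counties <- st_sf(
  fips = c("00001", "00002", "00003"),
  geometry = st_sfc(unit_square(0, 0), unit_square(1, 0), unit_square(2, 0))
)

# Made-up production; "00003" has no record and stays NA after the join
toy_wheat <- tibble(fips = c("00001", "00002"), production = c(120, 45))

# Joining attributes onto the sf object keeps the geometry column
toy_map <- left_join(toy_counties, toy_wheat, by = "fips")

p <- ggplot(toy_map, aes(fill = production)) +
  geom_sf(color = "grey40") +
  scale_fill_gradient(low = "white", high = "red", na.value = "grey90")
p
```

The point of the design is that the geometry column carries full boundaries, so adjacent areas meet along shared edges and geom_sf() needs no x, y, or group aesthetics.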

Thank you for your guidance. Can you point me to a link or reference for the solution you suggested? I have worked with the sf package on a project before.

Here's an example. I set the estimate variable being used as fill to zero to emulate your situation. It does not knock out the states, although setting them to NA does.

library(ggplot2)
library(tidycensus)
library(tigris)
us_median_age <- get_acs(
  geography = "state",
  variables = "B01002_001",
  year = 2019,
  survey = "acs1",
  geometry = TRUE,
  resolution = "20m"
) |>
  shift_geometry()

ggplot(us_median_age,aes(fill = estimate)) +
  geom_sf() +
  scale_fill_viridis_c(alpha = 0.50, option="cividis")

# knock out some states

altered <- us_median_age

knocks <- altered$estimate > 38.3 | altered$estimate < 37.0

altered[knocks,"estimate"] <- 0

ggplot(altered,aes(fill = estimate)) +
  geom_sf() +
  scale_fill_viridis_c(alpha = 0.50, option="cividis")

Thank you so much for your help.
