How to get sf tools to collapse boundaries and only have one label for each new area

I have a geographic sf tibble with a variable for area, and about 40 rows. I want to group or split the tibble according to area, unify each group, and then be able to plot the unified areas' borders, with a text label for each unified area. The desired plot will have just 11 combined, larger areas, with a single text label for each.

So far in my attempts, I can either get:
a) a set of unified areas :heart_eyes:, but with multiple text labels :unamused: wherever an area consists of multiple polygons (e.g. islands). See the image below, noting the duplicated labels where there are islands along the coastline:

or b) a set of unduplicated labels :raised_hands: but without successfully resolving the internal boundaries :pensive:

a) comes from code like this:

sf_tibble %>%
  dplyr::group_by(area) %>% 
  dplyr::summarise()

b) comes from:

sf_tibble %>%
  dplyr::group_by(area) %>% 
  dplyr::summarise(across(geometry, ~ sf::st_combine(.)))

If I use st_combine followed by st_union:

sf_tibble %>%
  dplyr::group_by(area) %>% 
  dplyr::summarise(across(geometry, ~ sf::st_combine(.)), .groups = "keep") %>% 
  dplyr::summarise(across(geometry, ~ sf::st_union(.)), .groups = "drop")

I get the same result as b). It seems like the st_union makes no difference - it's no longer successfully unifying the smaller areas into the larger areas.

If I just use st_union:

sf_tibble %>%
  dplyr::group_by(area) %>% 
  dplyr::summarise(across(geometry, ~ sf::st_union(.)), .groups = "drop")

I get the same result as a): unified areas but multiple labels.

Can I get what I want? What am I missing?

Oh flip, it's always the way, you spend hours on something, then as soon as you post it on RStudio Community you solve it yourself 5 minutes later.

All I needed to do was reverse the order of st_union and st_combine and it's great. (I thought I'd already tried that, but clearly not.)

sf_tibble %>%
  dplyr::group_by(area) %>% 
  dplyr::summarise(across(geometry, ~ sf::st_union(.)), .groups = "keep") %>%
  dplyr::summarise(across(geometry, ~ sf::st_combine(.)))

does the job :sunglasses:.

I have a theory that, besides the rubber duck effect, having to come up with a reprex makes your mind focus on the truly important part of the problem.

Glad you sorted it out!

If I were in your shoes I would probably just group by the administrative unit (is it county name?) and pipe to a plain summarise() (and hope for the best; it often works just like that - feels like magic).

Thanks!
When I just grouped and piped to summarise:

sf_tibble %>%
  dplyr::group_by(area) %>% 
  dplyr::summarise()

... I got the multiple labels problem (see my example map a) above).
This was the same outcome as explicitly calling st_union as the function within summarise.
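
For anyone wanting to check what a grouped summarise() actually returns, here is a minimal sketch on made-up data. The `square()` helper, the `toy` object and the `n_parts` column are all hypothetical and not from my real data. It only counts rows and polygon parts per row; it does not show how any particular plotting function places its labels.

```r
library(sf)
library(dplyr)

# Hypothetical helper: an axis-aligned square from its lower-left corner
square <- function(x, y, size = 1) {
  st_polygon(list(rbind(
    c(x, y), c(x + size, y), c(x + size, y + size), c(x, y + size), c(x, y)
  )))
}

# Toy data: "north" has two adjacent districts plus an island, "south" has one
toy <- st_sf(
  area = c("north", "north", "north", "south"),
  geometry = st_sfc(
    square(0, 0), square(1, 0),  # adjacent mainland districts
    square(3, 0, 0.5),           # island in the same area
    square(0, -2)                # a separate area
  )
)

# Grouped summarise() with no arguments unions the geometries within each group
merged <- toy %>%
  group_by(area) %>%
  summarise()

# One row per area; count the polygon parts inside each row's geometry
merged$n_parts <- lengths(st_cast(st_geometry(merged), "MULTIPOLYGON"))
print(nrow(merged))
print(merged$n_parts)
```

If `n_parts` is greater than 1 for a row, that area is a single feature made of several polygons, which is the situation where I was seeing repeated labels.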

I needed to call both st_union and st_combine in order to get the output I wanted.

The fact that I had only been trying them in the "wrong" order indicates that I don't fully understand the nature of these functions. In my mental model, combine was "weaker" than union, so it would be pointless to use it having already done a union. But I was wrong!
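
To make the difference between the two functions concrete, here is a small sketch on two made-up squares that share an edge (the object names are hypothetical). As I understand it, st_combine only gathers geometries into one collection and leaves their boundaries alone, whereas st_union dissolves shared boundaries.

```r
library(sf)

# Two unit squares sharing the edge x = 1 (toy data, not from the real map)
sq1 <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
sq2 <- st_polygon(list(rbind(c(1, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 0))))
geoms <- st_sfc(sq1, sq2)

# st_combine: collects both squares into one geometry, shared edge kept
combined <- st_combine(geoms)

# st_union: merges the squares, shared edge dissolved
unioned <- st_union(geoms)

# Compare the geometry types and the number of polygon parts in each result
print(st_geometry_type(combined))
print(st_geometry_type(unioned))
print(length(st_cast(combined, "POLYGON")))
print(length(st_cast(unioned, "POLYGON")))
```

On this toy input the combined result should still contain two polygon parts, while the unioned result should be a single polygon.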

OK, it is hard to draw firm conclusions without access to your data.
But you have sorted it out, and that is what matters :slight_smile:
