Incorrect number of dimensions when using a for loop with ggplot2

I'm trying to map several different "solutions" from a prioritizr model using ggplot. (To preface: I'm very new to R, so my code and methods may be rudimentary, which is probably also why I can't fully decipher the errors.) Until now I have been copying and pasting the code to make a ggplot map for each solution (named s1, s2, s3, etc.). That was very time-consuming, I think because of the size of the datasets I'm loading. I don't mind if it takes even longer to produce all 10, but I'd rather automate it and let it run all night (in a for loop or a function) than hang around my computer all day checking that each plot is done and saved, and then pressing Ctrl + Enter to run the next one.

Option 1: Putting the solutions into a list and running through the list in a for loop to create each map:

solutions <- list("s1" = s1, "s2" = s2, "s3" = s3, "s4" = s4, "s5" = s5, "s10" = s10, "s11" = s11, "s12" = s12)

for (s in 1:length(solutions)) {
  ggplot() + 
    geom_sf(data = range, size = 1, color = "black", fill = "NA") +
    geom_sf(data = fire, size = 0.25, color = "darkred",
            fill = "darkred", show.legend = 'poly') +
    geom_sf(data = poly, size = 0.25, fill = "darkorange", color = "darkorange") +
    geom_sf(data = IFL, size = 0.25, fill = 'darkseagreen', alpha = 0.7, color = "darkseagreen") +
    geom_sf(data = protected_areas, size = 0.5, alpha = 0.4, fill = "forestgreen", color = "forestgreen") +
    geom_sf(data = s[, "solution_1"], aes(fill = solution_1)) + # this is the data from the solution in Prioritizr. It essentially adds a column called solution_1 and if a planning unit is selected for restoration it will be populated with a value of 1. 
    geom_sf(data = linear, size = 0.5, color = "darkgoldenrod1") +
    ggtitle('Test solution') + 
    scale_fill_gradient(low = NA, high = "blue") + 
    theme_void() + 
  sys.sleep(2)
} 

Option 2: Joining the columns with cbind() and transposing the result with t(), based on a solution I found here.

solutions <- t(cbind("s1" = s1, "s2" = s2, "s3" = s3, "s4" = s4, "s5" = s5, "s10" = s10, 
                     "s11" = s11, "s12" = s12))
for( s in 1:length(solutions)) {
    ggplot() + 
      geom_sf(data = range, size = 1, color = "black", fill = "NA") +
      geom_sf(data = fire, size = 0.25, color = "darkred",
              fill = "darkred", show.legend = 'poly') +
      geom_sf(data = poly, size = 0.25, fill = "darkorange", color = "darkorange") +
      geom_sf(data = IFL, size = 0.25, fill = 'darkseagreen', alpha = 0.7, color = "darkseagreen") +
      geom_sf(data = protected_areas, size = 0.5, alpha = 0.4, fill = "forestgreen", color = "forestgreen") +
      geom_sf(data = s[, "solution_1"], aes(fill = solution_1)) +
      geom_sf(data = linear, size = 0.5, color = "darkgoldenrod1") +
      ggtitle('Test solution') + 
      scale_fill_gradient(low = NA, high = "blue") + 
      theme_void() + 
      ggsave(s, path = path)
    
  Sys.sleep(2)
}

Option 3 : Using lapply() because it looks like it's more efficient?

make_map <- function(s){
  ggplot() + 
    geom_sf(data = range, size = 1, color = "black", fill = "NA") +
    geom_sf(data = fire, size = 0.25, color = "darkred",
            fill = "darkred", show.legend = 'poly') +
    geom_sf(data = poly, size = 0.25, fill = "darkorange", color = "darkorange") +
    geom_sf(data = IFL, size = 0.25, fill = 'darkseagreen', alpha = 0.7, color = "darkseagreen") +
    geom_sf(data = protected_areas, size = 0.5, alpha = 0.4, fill = "forestgreen", color = "forestgreen") +
    geom_sf(data = s[, "solution_1"], aes(fill = solution_1)) +
    geom_sf(data = linear, size = 0.5, color = "darkgoldenrod1") +
    ggtitle('Test solution') + 
    scale_fill_gradient(low = NA, high = "blue") + 
    theme_void() + 
    ggsave(s, path = path)
}

lapply(solutions, make_map)

In all the different options I've laid out above, I get the same error:

"Error in s[, "solution_1"] : incorrect number of dimensions" 

However, if I plot each solution individually, for example with

geom_sf(data = s1[, "solution_1"], aes(fill = solution_1)) + # this is the data from the solution in Prioritizr. It essentially adds a column called solution_1 and if a planning unit is selected for restoration it will be populated with a value of 1. 

I don't get the error. Essentially, "solution_1" is a column in each dataframe that contains whether the unit has been selected or not by the model.

Any suggestions as to how I can rectify this?

Adding what one of the dataframes that I'm trying to plot in my loop looks like.
[Image: S3 example df]

Thank you in advance for your help!

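The second one (s1[, "solution_1"]) works because you previously created a data.frame called s1 that you can use to plot. In the first one (s[, "solution_1"]), you are trying to take a column from something called s, so everything depends on what s actually contains at that point.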

Now, let's look at the first option:

solutions <- list("s1" = s1, "s2" = s2, "s3" = s3, "s4" = s4, "s5" = s5, "s10" = s10, "s11" = s11, "s12" = s12)

for (s in 1:length(solutions)) {
  print(s)
}
#> [1] 1
#> [1] 2
#> [1] 3
#> [1] 4
#> [1] 5
#> [1] 6
#> [1] 7
#> [1] 8

What's happening here? length(solutions) is a number; for the list above it's 8. So 1:length(solutions) is 1:8, that is, the vector 1, 2, 3, 4, 5, 6, 7, 8. In your for loop, the code runs 8 times: first with s == 1, then with s == 2, and so on. When you write s[, "solution_1"], you are asking R to extract a column from the number 1. A single number has no rows or columns, so R stops with "incorrect number of dimensions".
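To see this in isolation, here is a minimal sketch that reproduces the error outside of ggplot. The value of s is just an illustrative stand-in for the loop variable, and tryCatch() is only there so the script keeps running after the error:

```r
# s is a plain number, exactly what the loop variable holds in option 1
s <- 1

# Asking for a "column" of a number fails; capture the message instead of stopping
msg <- tryCatch(
  s[, "solution_1"],
  error = function(e) conditionMessage(e)
)
print(msg)
```

The message is the same one reported in the question, which suggests the problem lies in what s contains rather than in the ggplot code.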

Option 2 shares that problem: s is still a single number, from which you're trying to extract a column. Option 3 could actually work, if solutions is indeed a list. But if you use the solutions defined in option 2, you will run into problems: cbind() merges the separate data frames into one, which would only work if you made sure the columns were named appropriately, and that is more complicated than keeping a list.
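As a rough illustration of the difference, the sketch below compares list() with t(cbind()) on two small made-up data frames. These are hypothetical stand-ins; the real solutions are sf objects, where cbind() may behave somewhat differently:

```r
# Hypothetical toy data frames standing in for two solutions
s1 <- data.frame(id = 1:3, solution_1 = c(1, 0, 1))
s2 <- data.frame(id = 1:3, solution_1 = c(0, 0, 1))

# A list keeps each data frame intact as one element
as_list <- list(s1 = s1, s2 = s2)

# cbind() glues the columns into one wide data frame; t() then turns it into a matrix
as_matrix <- t(cbind(s1 = s1, s2 = s2))

print(length(as_list))       # number of data frames
print(is.matrix(as_matrix))  # no longer a list of data frames
print(dim(as_matrix))
print(length(as_matrix))     # number of cells, not number of solutions
```

With the matrix version, length() counts individual cells, and lapply() would hand the function one cell at a time instead of one data frame.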

So, what could be a good solution? Let's stay with option 1: you make a list of data.frames:

solutions <- list(s1, s2, s3, s4, s5, s10, s11, s12)

(Giving a name to each entry is fine, but not necessary here.)

Then, you make a for loop, but the variable in the loop has to be an entire dataframe, not just a number:

for(s in solutions){
  ...
}

Alternatively, you could have s be a number, and extract an entry from the list:

for(s in 1:length(solutions)){
   single_df <- solutions[[s]]
   ...
}

But that's more writing. (See below for why it's nevertheless the better choice here.)
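The two loop forms can be compared side by side in a small sketch. The data frames here are hypothetical toy data, and sum() simply stands in for whatever you would do with each solution:

```r
# Hypothetical toy data frames standing in for the real solutions
solutions <- list(
  s1 = data.frame(id = 1:3, solution_1 = c(1, 0, 1)),
  s2 = data.frame(id = 1:3, solution_1 = c(0, 0, 1))
)

# Form 1: s is each data frame in turn
selected_by_element <- c()
for (s in solutions) {
  selected_by_element <- c(selected_by_element, sum(s[, "solution_1"]))
}

# Form 2: i is a position, and [[ ]] pulls the data frame out of the list
selected_by_index <- c()
for (i in seq_along(solutions)) {
  single_df <- solutions[[i]]
  selected_by_index <- c(selected_by_index, sum(single_df[, "solution_1"]))
}

print(selected_by_element)
print(selected_by_index)
```

Both forms reach the same data frames. The index form additionally leaves you with a number (or, through names(solutions)[i], a name) that you can use to label each output.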

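Finally, an important note: calling ggsave() is perfectly correct, but you should NOT attach it to the plot definition with a +. The same applies in option 1, where theme_void() + is followed by sys.sleep(2): this is wrong, because it would mean that Sys.sleep() is defining a component of the plot. You don't want a + here! (Note also that R is case-sensitive: the function is Sys.sleep(), not sys.sleep().)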

Also, there isn't really any point in using Sys.sleep(), just let your computer go as fast as it can.

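One last thing: when you call ggsave() you do need to specify the filename, as a character string ending in an extension such as .png, so that ggsave() can work out which file type to write. Passing s on its own, as in your code above, is not enough, but a number is convenient for building such a name with paste0(). So we will keep s as a number, which gives you this code: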

solutions <- list("s1" = s1, "s2" = s2, "s3" = s3, "s4" = s4, "s5" = s5, "s10" = s10, "s11" = s11, "s12" = s12)

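for (s in seq_along(solutions)) {  # same as 1:length(solutions), but safe for an empty list
  my_df <- solutions[[s]]  # pull the s-th data frame out of the list

  # store the plot in a variable so it can be passed to ggsave() explicitly
  p <- ggplot() +
    ...
    geom_sf(data = my_df[, "solution_1"], aes(fill = solution_1)) +
    ...
    theme_void()

  # build a file name such as "plot_1.png" and save (no + before ggsave)
  plot_name <- paste0("plot_", s, ".png")
  ggsave(plot_name, plot = p, path = path)
}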
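If you prefer the function-based approach from option 3, the same idea might look like the sketch below. Everything in it is illustrative: the toy data frames, the geom_point() layer standing in for the geom_sf() layers, the make_map() signature, and the use of tempdir() as the output folder are all assumptions, not part of the original code.

```r
library(ggplot2)

# Hypothetical toy data; the real solutions are sf objects plotted with geom_sf()
solutions <- list(
  s1 = data.frame(x = 1:3, y = 1:3, solution_1 = c(1, 0, 1)),
  s2 = data.frame(x = 1:3, y = 3:1, solution_1 = c(0, 1, 1))
)

# Hypothetical output folder; replace with your own path
out_dir <- tempdir()

# Build one plot from one data frame, save it, and return the file path
make_map <- function(df, name) {
  p <- ggplot(df) +
    geom_point(aes(x = x, y = y, colour = solution_1)) +  # stand-in for the geom_sf() layers
    ggtitle(name) +
    theme_void()
  file_name <- paste0("plot_", name, ".png")
  ggsave(file_name, plot = p, path = out_dir, width = 4, height = 3)
  file.path(out_dir, file_name)
}

# Loop over the names so each file and title is labelled with its solution
saved_files <- character(0)
for (name in names(solutions)) {
  saved_files <- c(saved_files, make_map(solutions[[name]], name))
}
print(basename(saved_files))
```

Looping over names(solutions) rather than positions means the files are called plot_s1.png, plot_s2.png, and so on, which may be easier to match back to each solution than plain numbers.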

Oh my gosh! I can't thank you enough for laying this out so comprehensively!! I am coming from Python, so I intuitively wanted to write my for loops and functions the way I would in Python. I had seen length(solutions) online before, but didn't understand what it really meant.

I'm particularly grateful that you also laid out the ggsave bit for me. That would have set me back again. Really, thank you so much!
