Why is the list so large?

I am trying to run a loop over a data frame containing 27820 rows. My code looks like this:

total.male <-  list()
for(yr in 1:32) {
  num.year = 0
  for(i in 1:27820) {
    if (suiciderate[i,"sex"] == "male" && suiciderate[i, "Year"] == yr) {
      num.year <- num.year + 1 
    }
    total.male <- list(total.male, num.year)
  }
}

When I run this, RStudio pops up a fatal error and terminates. I guessed that there was too much data, so I reduced the loop to the first 1000 rows, but then I get this:
[Screenshot: Screen Shot 2020-05-06 at 7.44.09 PM]
Why is the list this large? It seems strange that it is 3.7 MB with only two elements.

Thank you

If you run this code, you will see that you end up with a deeply nested structure. I do not know why it is so big in your case but it would certainly be hard to work with.

total.male <- list()
total.male <- list(total.male, 1)
total.male <- list(total.male, 2)
total.male <- list(total.male, 3)
str(total.male)
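
One thing that may matter in your code: the `total.male <- list(total.male, num.year)` line sits inside the inner loop, so the list gains one nesting level per row per year, not one per year. The top level always shows only two elements (the previous list and the newest value), while everything else is buried inside the first element. Here is a minimal sketch of that effect next to a flat alternative. The names `nested`, `flat`, and `nesting_depth` are made up for illustration.

```r
# Pattern from the question: wrap the old list in a new one on each iteration.
nested <- list()
for (k in 1:5) {
  nested <- list(nested, k)
}
length(nested)  # 2: the top level only holds the old list and the newest value

# Hypothetical helper: count nesting levels by walking down the first element.
nesting_depth <- function(x) {
  depth <- 0
  while (is.list(x) && length(x) > 0) {
    x <- x[[1]]
    depth <- depth + 1
  }
  depth
}
nesting_depth(nested)  # 5: one level per iteration

# Flat alternative: preallocate a vector and fill one slot per iteration.
flat <- integer(5)
for (k in 1:5) {
  flat[k] <- k
}
length(flat)  # 5
```

With 32 years and 1000 rows that pattern would produce 32,000 nesting levels, which could plausibly account for a large object that still reports only two elements.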

I suggest you use functions from dplyr to do this calculation rather than a for loop. Here is an example with toy data.

set.seed(1)
suiciderate <- data.frame(sex = sample(c("F", "M"), 10, replace = TRUE),
                          Year = sample(1:2, 10, replace = TRUE))
suiciderate
#>    sex Year
#> 1    F    1
#> 2    M    1
#> 3    F    1
#> 4    F    1
#> 5    M    1
#> 6    F    2
#> 7    F    2
#> 8    F    2
#> 9    M    2
#> 10   M    1
library(dplyr)
Stats <- suiciderate %>% group_by(Year) %>% summarize(Males = sum(sex == "M"))
Stats
#> # A tibble: 2 x 2
#>    Year Males
#>   <int> <int>
#> 1     1     3
#> 2     2     1

Created on 2020-05-06 by the reprex package (v0.3.0)


Thanks for your answer! Are there other ways people usually replace a tedious pair of nested loops in R?

I have often read that the use of explicit loops is less common in R than in other languages. In base R there is the apply family of functions (apply(), lapply(), sapply(), tapply() ...) and in the tidyverse the packages tidyr, dplyr and purrr have many functions for shaping and analyzing data. I suggest you look at the free book R for Data Science as a starting point for learning about tidy functions.
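
As a rough illustration of the base R route, here is a sketch that counts males per year with `tapply()` and with `sapply()` instead of two nested loops. The data frame `toy` is invented for this example and only mimics the `sex` and `Year` columns discussed above.

```r
# Hand-written toy data (an assumption, not the real data set).
toy <- data.frame(
  sex  = c("F", "M", "F", "M", "M", "F"),
  Year = c(1, 1, 1, 2, 2, 2)
)

# tapply(): split the logical vector by Year and sum each group.
males_by_year <- tapply(toy$sex == "M", toy$Year, sum)
males_by_year

# sapply(): one count per distinct year, without growing a list by hand.
years <- sort(unique(toy$Year))
males_sapply <- sapply(years, function(yr) sum(toy$sex == "M" & toy$Year == yr))
males_sapply
```

Both versions return one count per year (here 1 male in year 1 and 2 in year 2), so there is no need to build up a result inside a loop.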
