Why does subsetting a large named vector in R result in a fatal error?

I posted the question below on Stack Overflow, but have since found that R 4.1.1 and 4.1.0 throw fatal errors whereas R 4.0.1 does not on the same system (64-bit Windows).

Is this something that R devs might know more about? Why is this error occurring on later versions?


I have two large named vectors, foo and bar, and want to subset bar so that it only includes the elements whose names also appear in foo.

The data look like this:

# foo
> length(foo)
[1] 51174

> foo[1:10]
  the    to     a   you   and    of     i    is  that    in 
25499 24985 23053 22064 20687 16042 15351 13776 13714  9995 

> names(foo[1:10])
[1] "the"  "to"   "a"    "you"  "and"  "of"   "i"    "is"   "that" "in"  

# bar
> length(bar)
[1] 3755242

> bar[1:10]
     the       of       to      and        a       in       is     that      for        i 
16330164  9263180  8295547  8159378  6391771  5410076  3585765  3531807  2881175  2465688 

> names(bar[1:10])
[1] "the"  "of"   "to"   "and"  "a"    "in"   "is"   "that" "for"  "i" 

To subset, I have previously used the following method without issue on the same data:

bar[names(foo)]

However, using this same method, I now unpredictably get an "R Session Aborted R encountered a fatal error" message: sometimes it works, sometimes it doesn't. How do I avoid causing a fatal error? Is this a memory issue?
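
For anyone trying to reproduce or narrow this down, here is a minimal sketch of the same subset written three ways on small stand-in vectors (the values below, including the made-up name "zebra", are hypothetical and only mimic the shape of the real data). It is not a confirmed fix for the crash; it only shows equivalent forms that avoid indexing by a character vector, which may help tell whether the name lookup itself is what triggers the abort.

```r
# Small stand-ins for the real vectors (hypothetical values, same structure).
foo <- c(the = 25499, to = 24985, a = 23053, zebra = 1)
bar <- c(the = 16330164, of = 9263180, to = 8295547, and = 8159378, a = 6391771)

# Character indexing, as in the question: one element per name in foo,
# with NA wherever bar has no element of that name.
by_name <- bar[names(foo)]

# Integer indexing via match(): the name lookup happens up front,
# and names missing from bar are dropped instead of returned as NA.
idx <- match(names(foo), names(bar))
by_match <- bar[idx[!is.na(idx)]]

# Logical indexing: keeps bar's own order and drops names absent from foo.
by_mask <- bar[names(bar) %in% names(foo)]
```

Note that the three forms are not identical in general: bar[names(foo)] follows the order of foo and keeps NA entries, the match() form follows the order of foo without NAs, and the %in% form follows the order of bar.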
