- Error in storage.mode(x) <- "double" : (list) object cannot be coerced to type 'double'

I'm now getting the following error when I run collect() while creating the object "match_cat":

Error in storage.mode(x) <- "double" : (list) object cannot be coerced to type 'double'

If I separate the code into two blocks:

match_cat <- df2_n2_s %>% 
             filter(SalesRankProductCategoryID %in% df2_s)

match <- collect(match_cat)

then I get this error

Error in UseMethod("escape") : no applicable method for 'escape' applied to an object of class "c('tbl_spark', 'tbl_sql', 'tbl_lazy', 'tbl')"

Any ideas?

Are you trying to do an inner join?

Actually, I am trying to do a semi join. I was thinking about it from a filter perspective, but I'll give semi_join() a try. Thanks so much for your help! Do you suggest I use collect() at the end of the code block with semi_join()?
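A minimal sketch of the semi_join() approach, assuming a local Spark installation and that both tables share the key column SalesRankProductCategoryID (the thread does not show what df2_s contains, so the data below is made up). The likely reason the filter() version fails is that the right-hand side of %in% is itself a Spark table, which dplyr cannot translate into an SQL value; a join keeps both sides in Spark.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark install; the thread's real connection is not shown.
sc <- spark_connect(master = "local")

# Hypothetical stand-ins for the thread's df2_n2_s and df2_s.
df2_n2_s <- copy_to(sc, data.frame(
  SalesRankProductCategoryID = c(1, 2, 3, 4),
  Sales = c(10, 20, 30, 40)
), "df2_n2_s", overwrite = TRUE)

df2_s <- copy_to(sc, data.frame(
  SalesRankProductCategoryID = c(2, 4)
), "df2_s", overwrite = TRUE)

# Keep only rows of df2_n2_s whose ID also appears in df2_s.
# The join is evaluated inside Spark; nothing is pulled into R yet.
match_cat <- semi_join(df2_n2_s, df2_s, by = "SalesRankProductCategoryID")

# Bring the (small) result into the R session as a regular tibble.
match <- collect(match_cat)

spark_disconnect(sc)
```

If the key column has a different name in df2_s, the by argument would need to map the two names, for example by = c("SalesRankProductCategoryID" = "other_name"), where other_name is a placeholder.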

If you want to get the data frame back into your R session you can use collect(), but the result has to be small enough to fit in memory in R.
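One way to check the size before collecting, sketched under the assumption of a local Spark connection and a made-up table called sales_demo: count the rows in Spark first, and collect only a preview if the table is large.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark install; table name and data are hypothetical.
sc <- spark_connect(master = "local")

sales <- copy_to(sc, data.frame(
  SalesRankProductCategoryID = c(1, 2, 3, 4),
  Sales = c(10, 20, 30, 40)
), "sales_demo", overwrite = TRUE)

# The row count is computed in Spark, so no data is moved into R.
n_rows <- sdf_nrow(sales)

# Collect only the first rows as a preview instead of the whole table.
preview <- sales %>%
  head(2) %>%
  collect()

spark_disconnect(sc)
```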


With Spark, as I understand it, every step in a pipe creates a lazy operation, and nothing is actually run until you call collect(). I wonder whether the issue comes from somewhere earlier, when you create df2_n2_s... :thinking:
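A small sketch of that lazy behaviour, assuming a local Spark connection and a made-up table: the pipeline only builds a query, show_query() prints the SQL that would be sent to Spark, and collect() is the step that executes it. Printing the query for each intermediate object can help locate which step is actually failing.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark install; table name and data are hypothetical.
sc <- spark_connect(master = "local")

sales <- copy_to(sc, data.frame(
  SalesRankProductCategoryID = c(1, 2, 3),
  Sales = c(10, 20, 30)
), "sales_demo", overwrite = TRUE)

# This only describes the query; no filtering has happened yet.
match_cat <- sales %>% filter(Sales > 15)

# The object is still a lazy Spark table, not an R data frame.
class(match_cat)

# Print the SQL that dplyr generated for this pipeline.
show_query(match_cat)

# collect() runs the query and returns a local tibble.
local_df <- collect(match_cat)

spark_disconnect(sc)
```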


The connection looks like it is working all right. I think it doesn't like filtering by a matching column that lives in a different data frame. When I used semi_join() it appeared to work, but I'm still figuring out how to look at the structure of the data frame in the connection, haha. I'm still getting used to the difference between working in a Spark context and working with R objects in the global environment.
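For looking at the structure of a table that lives in Spark, a few sparklyr helpers may be useful. This is a sketch assuming a local Spark connection and a made-up table; sdf_schema() and sdf_dim() are sparklyr functions that report column types and dimensions without collecting the data.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark install; table name and data are hypothetical.
sc <- spark_connect(master = "local")

sales <- copy_to(sc, data.frame(
  SalesRankProductCategoryID = c(1, 2, 3),
  Sales = c(10, 20, 30)
), "sales_demo", overwrite = TRUE)

# Column names and Spark types, returned as a named list.
schema <- sdf_schema(sales)
names(schema)

# Number of rows and columns, computed in Spark.
dims <- sdf_dim(sales)

# A few rows pulled into R for a quick look.
sales %>% head(3) %>% collect()

spark_disconnect(sc)
```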
