Sparklyr + dplyr error - object not found!

Hello, I am just getting started with sparklyr and I get an error when trying to use dplyr to wrangle some data. I'm sure it's something stupid I am overlooking, but I have searched in multiple places for an answer and I don't see anyone else getting this error.

library(sparklyr)

sc <- spark_connect(master = "local")

spark_read_csv(sc, "df2_tbl",
               "C:/Users/...csv")

spark_read_csv(sc, "df_n2_tbl",
               "C:/Users/...csv")

I see the objects "df2_tbl" and "df_n2_tbl" in the "Connections" tab (next to "Environment" and "History") as well as in the Spark UI, but when I run the following:

match_cat <- df_n2_tbl %>% 
         filter(var1 %in% df2_tbl) %>% 
         collect()

I get the error -

"Error in eval(lhs, parent, parent) : object 'df_n2_tbl' not found"

Thanks!
Dan

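You'll need to assign the results of the spark_read_csv() calls to R variables, e.g. df2_tbl <- spark_read_csv(sc, "df2_tbl", "C:/Users/...csv"). The name you pass to spark_read_csv() only registers the table inside Spark; it does not create an R object with that name, which is why the pipe reports that 'df_n2_tbl' is not found.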
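
For illustration, here is a minimal sketch of the assignment fix applied to the code from the question. The file paths are left elided exactly as in the original post, so they need to be replaced with real ones before this runs. The tbl() lines show an alternative for tables that are already registered in Spark. The final semi_join() step is only a suggestion and rests on an assumption that is not stated in the post: that df2_tbl also contains a column named var1.

```r
library(sparklyr)
library(dplyr)

sc <- spark_connect(master = "local")

# Paths are elided in the original post; substitute your own CSV locations.
# Assigning the return value creates the R object that dplyr verbs operate on.
df2_tbl   <- spark_read_csv(sc, "df2_tbl",   "C:/Users/...csv")
df_n2_tbl <- spark_read_csv(sc, "df_n2_tbl", "C:/Users/...csv")

# Alternative: if the tables are already registered in Spark,
# dplyr::tbl() returns a reference by name without re-reading the files.
df2_tbl   <- tbl(sc, "df2_tbl")
df_n2_tbl <- tbl(sc, "df_n2_tbl")

# Assumption (not from the post): df2_tbl also has a column named var1.
# semi_join() keeps the rows of df_n2_tbl whose var1 value appears in df2_tbl.
match_cat <- df_n2_tbl %>%
  semi_join(df2_tbl, by = "var1") %>%
  collect()

spark_disconnect(sc)
```

The semi_join() is used here because filter(var1 %in% df2_tbl) compares a column against a whole remote table rather than against a column of values, which is unlikely to translate to Spark SQL as intended.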

Oops, never mind, I was getting another error. Thanks!
