I would like to use my custom-made library, which is not published on CRAN.
As far as I know, I can evaluate a user-defined function via spark_apply(), but spark_apply() does not handle nested functions well.
My question is: I'm connecting locally via spark_connect() and trying to execute a function from my custom-made library, but it is not working. Is this caused by the nested functions in my package, or by something else?
Is there a workaround that would let me call my package's functions directly?
Have you already considered not using spark_apply()? What functionality does your library provide that is not available in Spark or one of the existing extensions?
Assuming you do need spark_apply(), before bringing in your library, make sure a subset of your data can round-trip through spark_apply() at all by running an identity transformation similar to:

    data %>% head(n = 1000) %>% spark_apply(~ .x)
Note that spark_apply() expects a data frame as input, and your R transformation must also return a data frame. If your library returns a matrix or another type of object, you will have to convert the output to a data frame manually.
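As a sketch of that conversion (using a hypothetical my_model_fn() as a stand-in for one of your package's functions), the wrapper you pass to spark_apply() would look roughly like:

```r
# Hypothetical stand-in for a package function that returns a matrix
# rather than a data frame; replace with your own function.
my_model_fn <- function(df) {
  as.matrix(df)   # toy computation returning a matrix
}

# Wrapper suitable for spark_apply(): always hand back a data frame.
wrapped <- function(df) {
  out <- my_model_fn(df)
  as.data.frame(out)
}

# Quick local check before involving Spark at all:
wrapped(head(mtcars, 3))
```

Testing the wrapper locally on a plain data frame first, as above, separates "my function's output shape is wrong" from genuine Spark serialization problems.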
I would like to use my library because it provides models fitted with specific optimization techniques that aren't available directly in any other package. Since I know my functions work and produce correct results, I'd like to keep using them.
That's why I'd like to use spark_apply() and adapt the inputs and outputs to the required data frame structure.
Of course, my library depends on other R libraries.
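Regarding the dependencies: in local mode the Spark workers share your machine's R library, so a locally installed (non-CRAN) package is visible to them; on a real cluster, spark_apply() can distribute the packages found in your .libPaths() to the workers via its packages argument. A minimal sketch, assuming a hypothetical mypkg installed locally:

```r
library(sparklyr)
library(dplyr)

sc  <- spark_connect(master = "local")
sdf <- copy_to(sc, mtcars, overwrite = TRUE)

result <- sdf %>%
  spark_apply(
    function(df) {
      # Hypothetical call into your package; coerce the result to a
      # data frame, as spark_apply() requires.
      as.data.frame(mypkg::fit_model(df))
    },
    packages = TRUE  # default; ships local packages to workers on a cluster
  )
```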