Unable to connect to sparklyr from R or RStudio

I am trying to connect to Spark from RStudio. We are currently using the Cloudera Hadoop distribution, which runs Spark 2.2. I tested everything from the edge node: I was able to create a Spark context and execute my queries. Everything worked fine from RStudio until yesterday, when we suddenly started having connection issues.

library(dplyr)
library(sparklyr)
config <- spark_config()
config$spark.driver.memory <- "8G"
config$spark.executor.memory <- "8G"
config$spark.executor.instances <- "2"   # number of executors
config$spark.executor.cores <- "4"
config$spark.kryoserializer.buffer.max <- "2000m"
config$spark.driver.maxResultSize <- "4G"
config$spark.akka.frameSize <- "768"     # legacy Akka setting; Spark 2.x uses spark.rpc.message.maxSize
sc <- spark_connect(master = "yarn-client",
                    version = "2.2.0",
                    config = config,
                    spark_home = '/opt/cloudera/parcels/SPARK2-2.2.0.cloudera1-1.cdh5.12.0.p0.142354/lib/spark2')

Error in force(code) :
  Failed while connecting to sparklyr to port (8880) for sessionid (14727): Sparklyr gateway did not respond while retrieving ports information after 60 seconds
    Path: /opt/cloudera/parcels/SPARK2-2.2.0.cloudera1-1.cdh5.12.0.p0.142354/lib/spark2/bin/spark-submit
    Parameters: --class, sparklyr.Shell, '/usr/lib64/R/library/sparklyr/java/sparklyr-2.2-2.11.jar', 8880, 14727
    Log: /tmp/RtmpoNJQEH/file151b437c0313b_spark.log

---- Output Log ----
18/11/12 13:54:50 INFO sparklyr: Session (14727) is starting under 127.0.0.1 port 8880
18/11/12 13:54:50 INFO sparklyr: Session (14727) found port 8880 is not available
18/11/12 13:54:50 INFO sparklyr: Backend (14727) found port 8884 is available
18/11/12 13:54:50 INFO sparklyr: Backend (14727) is registering session in gateway
18/11/12 13:54:50 INFO sparklyr: Backend (14727) is waiting for registration in gateway

---- Error Log ----

I also verified the sparklyr version: it is 0.9.2.

Can someone please let me know what could be wrong?

Does this work for you in R, but not in RStudio? (see FAQ below for disambiguation)

I tried from R as well and got the same error.

This means that port 8880 is already in use. Can you try a different port, say 8890?

config$sparklyr.gateway.port <- 8890

To troubleshoot, first find out which application is using this port. It is possible that a zombie instance of sparklyr is still running on this port, in which case killing that application in YARN could also fix the problem without changing your configuration settings.
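
For anyone who wants to do that check from R itself, here is a rough sketch. It assumes the `lsof` and `yarn` command-line tools are on the PATH of the machine where R runs (typically the edge node). The helper names `who_uses_port`, `filter_apps` and `list_yarn_apps` are made up for this example and are not part of sparklyr. The filter on "sparklyr" relies on that being the default `app_name` of `spark_connect()`.

```r
# Hypothetical helper: show processes listening on a TCP port using `lsof`.
# Returns character(0) if lsof is missing or nothing is listening.
who_uses_port <- function(port) {
  if (Sys.which("lsof") == "") return(character(0))
  out <- suppressWarnings(
    system2("lsof", c("-nP", paste0("-iTCP:", port), "-sTCP:LISTEN"),
            stdout = TRUE, stderr = FALSE)
  )
  as.character(out)
}

# Hypothetical helper: keep only the lines that mention an application name.
filter_apps <- function(lines, app_name = "sparklyr") {
  lines[grepl(app_name, lines, fixed = TRUE)]
}

# Hypothetical helper: list running YARN applications whose listing line
# mentions `app_name`. Returns character(0) if the yarn CLI is missing.
list_yarn_apps <- function(app_name = "sparklyr") {
  if (Sys.which("yarn") == "") return(character(0))
  out <- suppressWarnings(
    system2("yarn", c("application", "-list", "-appStates", "RUNNING"),
            stdout = TRUE, stderr = FALSE)
  )
  filter_apps(as.character(out), app_name)
}

print(who_uses_port(8880))   # local process holding the gateway port, if any
print(list_yarn_apps())      # running YARN applications named like sparklyr
```

If a stale application shows up in the YARN listing, it can be stopped with `yarn application -kill` followed by its application ID. In yarn-client mode the gateway process runs on the local machine, so the `lsof` output is usually the more direct clue.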

Is there a way to automatically roll over to another port like 8890 if the default port (8880) is in use? Can we detect this issue before the connection actually fails?
Thanks!
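
One possible approach, as a sketch rather than a built-in sparklyr feature: probe the candidate ports from R before connecting and put the first free one into `sparklyr.gateway.port`. The functions `port_in_use` and `first_free_port` are hypothetical helpers written for this example. The probe simply tries to open a TCP connection to the port, so there is still a small window in which another process could grab the port between the check and `spark_connect()`.

```r
# Hypothetical helper: TRUE if something accepts TCP connections on `port`.
port_in_use <- function(port, host = "127.0.0.1", timeout = 2) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    ),
    error = function(e) NULL   # connection refused: nothing is listening
  )
  if (is.null(con)) return(FALSE)
  close(con)
  TRUE
}

# Hypothetical helper: first port in `candidates` that nothing listens on.
first_free_port <- function(candidates = 8880:8890) {
  for (p in candidates) {
    if (!port_in_use(p)) return(p)
  }
  stop("No free port found among the candidates")
}

# Usage: choose the gateway port before calling spark_connect().
if (requireNamespace("sparklyr", quietly = TRUE)) {
  config <- sparklyr::spark_config()
  config$sparklyr.gateway.port <- first_free_port(8880:8890)
}
```

Calling `port_in_use(8880)` on its own also answers the second question: it reports whether the default port is taken before any connection attempt is made.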