How to load .rda files from an HDFS path into the current R working environment

I am trying to load an .rda file from an HDFS path into my current R working environment.

hdfspath <- "hdfs:///user/sankar/RDA File/JAN2017.rda"

load(hdfspath)

I got the error below.

Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
In addition: Warning message:
In readChar(con, 5L, useBytes = TRUE) :
  cannot open compressed file 'hdfs:///user/sankar/RDA File/JAN2017.rda', probable reason 'No such file or directory'

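Maybe your file does not exist. Try

hadoop fs -ls "hdfs:///user/sankar/RDA File/JAN2017.rda"

in your RStudio terminal. Note the quotes: the path contains a space, so without them the shell splits it into two arguments.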

I am running from CDSW - R workbench environment.
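
If only an R console is available, the same existence check can be run from R itself. This is a minimal sketch, assuming the `hadoop` command-line client is on the PATH of the CDSW session; `hdfs_exists` and its `hadoop_cmd` argument are hypothetical names introduced here for illustration.

```r
# Returns TRUE if the path exists in HDFS, judged by the exit status of
# `hadoop fs -test -e` (0 means the path exists).
# Assumption: the `hadoop` client is installed and on the PATH.
hdfs_exists <- function(hdfs_path, hadoop_cmd = "hadoop") {
  # shQuote() keeps the space in "RDA File" from splitting the argument.
  status <- system2(hadoop_cmd, c("fs", "-test", "-e", shQuote(hdfs_path)))
  identical(status, 0L)
}
```

Calling `hdfs_exists("hdfs:///user/sankar/RDA File/JAN2017.rda")` should then tell you whether the file is visible to the Hadoop client.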

I think you need to use a specific tool (an R package or the command line) to read from the HDFS file system. You can't read from it like a classic file path. HDFS is a distributed file system, and R's load() function doesn't know how to deal with it: it treats the "hdfs://" string as a local file name, which is why you get 'No such file or directory'.

Otherwise, you need to download the file to your local filesystem first and load it from there.

sparklyr can help, I guess, but not for .rda files.

Could you please let me know what those tools or command lines are? I have data in .rda files and I want to load them into my current CDSW R session.
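
One way to do the "download first, then load" approach from inside R is to shell out to the HDFS client. The sketch below is not a definitive recipe: it assumes the `hdfs` command-line client is on the PATH of the session, and `load_rda_from_hdfs` and its `hdfs_cmd` argument are hypothetical names used only for illustration.

```r
# Copy an .rda file out of HDFS into a local temporary file, then load() it.
# Assumption: the `hdfs` client is installed and on the PATH.
load_rda_from_hdfs <- function(hdfs_path, envir = globalenv(), hdfs_cmd = "hdfs") {
  local_path <- tempfile(fileext = ".rda")
  on.exit(unlink(local_path), add = TRUE)  # remove the local copy afterwards

  # shQuote() keeps the space in "RDA File" from splitting the argument.
  status <- system2(hdfs_cmd,
                    c("dfs", "-get", shQuote(hdfs_path), shQuote(local_path)))
  if (!identical(status, 0L) || !file.exists(local_path)) {
    stop("Could not copy '", hdfs_path, "' from HDFS (exit status ", status, ")")
  }

  # load() restores the saved objects into `envir` and returns their names.
  load(local_path, envir = envir)
}
```

With that defined, `load_rda_from_hdfs("hdfs:///user/sankar/RDA File/JAN2017.rda")` should restore the saved objects into the global environment. If the copy step fails on the space in the path, it may be worth checking how your Hadoop version expects such paths to be written.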
