Is there anything faster than readLines in R?

websocket
#1

Hi,
I have posted the same thing on stackoverflow, but no luck with an answer so far, and perhaps this is a more appropriate forum for this question.
I have a socket connection between C (the client) and R (R acts as the server, using RStudio). C sends a string of 5 numbers to R. For example:

1 16.29 3.8 0 0

In R I am receiving the string using:

    con <- socketConnection(host="localhost", port = 8080, blocking=TRUE,
                        server=TRUE, open="r+")

and the next line is:

    helloTall <- readLines(con, 1)

The lines are read fine but the problem is that I have to do this very frequently (it is a time step coupling exercise, and so I have to do this >1000 times, and in some cases more than 500k times).
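
For a loop like the one described here, one possible shape is to open the connection once and read one line per time step. This is only a sketch: `read_steps` is a hypothetical helper, and it assumes the client keeps a single connection open and sends one newline-terminated line of five space-separated numbers per step. The demo uses a `textConnection` in place of the socket so it can run on its own.

```r
# Hypothetical helper: read up to n_steps lines from an already-open connection
# and parse each line into n_fields numbers (one matrix row per time step).
read_steps <- function(con, n_steps, n_fields = 5L) {
  out <- matrix(NA_real_, nrow = n_steps, ncol = n_fields)
  for (i in seq_len(n_steps)) {
    line <- readLines(con, n = 1L)
    if (length(line) == 0L) {
      # Nothing more to read: return only the rows filled so far.
      return(out[seq_len(i - 1L), , drop = FALSE])
    }
    fields <- strsplit(trimws(line), "[[:space:]]+")[[1]]
    stopifnot(length(fields) == n_fields)
    out[i, ] <- as.numeric(fields)
  }
  out
}

# Stand-in for the socket: two made-up time steps in the format shown above.
demo_con <- textConnection(c("1 16.29 3.8 0 0", "2 16.31 3.9 0 0"))
steps <- read_steps(demo_con, n_steps = 2L)
close(demo_con)
print(steps)
```

With a real socket, `con` would be the object returned by `socketConnection()`, created once before the loop rather than once per step.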

The problem: readLines is extremely slow for this type of work, and readChar is not any better. Is there any other (much) faster way to read the above short string from a connection?

I am using R 3.5.2. readLines takes 1 minute to read these 5 numbers, so with >5000 time steps in my case I would need >5000 minutes!!!

Thank you in advance.

0 Likes

#2

There is no way readLines is literally taking that long. On my machine readLines took all of 0.03 s to read in a 27,000-line, 1 MB file.
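
A quick way to check this kind of claim on your own machine is to time `readLines` on a temporary file. The sketch below uses made-up contents (27,000 copies of the example line, so the file is smaller than 1 MB), and the elapsed time will vary by machine.

```r
# Build a temporary file of 27,000 short lines (contents are made up).
tmp <- tempfile(fileext = ".txt")
writeLines(rep("1 16.29 3.8 0 0", 27000), tmp)

# Time a full read of the file; elapsed is wall-clock seconds.
timing <- system.time(lines <- readLines(tmp))
print(timing[["elapsed"]])
print(length(lines))

unlink(tmp)  # clean up the temporary file
```

If reading thousands of lines from a file is fast but a single line from the socket is slow, that points at the connection rather than at `readLines` itself.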

I suspect that the connection is not being opened or closed correctly. When you say 'c' do you mean the language C or is that just a shorthand for 'client'?

1 Like

#3

If you cross-post could you please link to the cross-posted question? That way, if you do get an answer over there, someone else can tell, and avoid duplicating effort.

For more information, please see our FAQ re cross-posting:

2 Likes

#4

Thank you, I was not aware of this forum before I posted that question. Here is the link for now and if there is ever an answer I will make sure both forums see it:
https://stackoverflow.com/questions/54523171/is-there-anything-faster-than-readlines-in-r

1 Like

#5

This is interesting and I hope you are right. Please help me understand this better. By C I meant the C language. Apologies for the confusion. I am putting print statements in my code and I see that the slowdown happens at readLines and not at the point where the data "arrive" in R. Could it be that the specific string that arrives has many blank spaces or something that "delays" readLines?
Here are my print statements to make this more clear:

    print("in r1")
    con <- socketConnection(host="localhost", port = 8080, blocking=TRUE,
                            server=TRUE, open="r+")
    # on.exit(close(con))
    print("in r2")
    helloTall <- readLines(con,1)
    print("in r3")

The transition between r2 and r3 takes ages.. 1 minute in this case which is a lot for 5 numbers.

0 Likes

#6

UPDATE:
I have tried scan for the content of the connection and it is equally slow:

    hello2<-scan(con, sep=" ")

It also takes a minute to read these 5 numbers. Does it mean that con has a lot of nulls or spaces or something that slows down reading its contents? I know almost nothing about connections.

0 Likes

#7

I'm also not an expert on sockets or connections, but I think I agree with the other posters: it seems like your socket connection is sitting around and eventually timing out. Clearly it's getting the data, so maybe the socket is having problems with the part where it recognises that it's done? Are the numbers sent from C properly terminated with a newline?

EDIT: from the socketConnection() documentation:

For many connections there is little or no difference between text and binary modes. For file-like connections on Windows, translation of line endings (between LF and CRLF) is done in text mode only (but text read operations on connections such as readLines, scan and source work for any form of line ending).

Maybe open = "rt" would be more appropriate for your case?
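
One way to test the timing-out theory: `socketConnection()` takes a `timeout` argument that defaults to `getOption("timeout")`, which is 60 seconds unless it has been changed. The sketch below wraps the read in a hypothetical helper, `read_one_line`, with a shorter timeout; the port and the 5-second value are assumptions for illustration. If the wait shrinks to match the new timeout, the read is probably waiting for something (such as a line terminator) that never arrives.

```r
# Hypothetical helper: accept one client, read one line, and report how long
# the read took. timeout_s replaces the default getOption("timeout").
read_one_line <- function(port = 8080, timeout_s = 5) {
  con <- socketConnection(host = "localhost", port = port, blocking = TRUE,
                          server = TRUE, open = "r+", timeout = timeout_s)
  on.exit(close(con))
  elapsed <- system.time(line <- readLines(con, n = 1L))[["elapsed"]]
  list(line = line, elapsed = elapsed)
}

# The default wait, in seconds, used when no timeout argument is given.
print(getOption("timeout"))
```

Calling `read_one_line()` blocks until the C client connects, so it is left uncalled here.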

0 Likes

#8

Thank you so much. The issue was that no newline character ( \n ) was being sent over the socket, so readLines kept waiting until it hit the 1-minute timeout. The full answer is given here: https://stackoverflow.com/questions/54523171/is-there-anything-faster-than-readlines-in-r-or-how-do-i-find-out-why-reading/54623278#54623278
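
To make the difference concrete, here is a small sketch of the two byte sequences involved, using the example line from the first post. `ends_with_newline` is a hypothetical helper written for this illustration, not part of base R.

```r
# Hypothetical helper: TRUE if a raw byte vector ends in LF (0x0a).
ends_with_newline <- function(bytes) {
  length(bytes) > 0L && bytes[length(bytes)] == as.raw(0x0a)
}

# The same payload without and with the trailing "\n".
without_nl <- charToRaw("1 16.29 3.8 0 0")
with_nl    <- charToRaw("1 16.29 3.8 0 0\n")

print(ends_with_newline(without_nl))
print(ends_with_newline(with_nl))
```

Only the second form gives readLines a complete line to return as soon as the data arrives.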

2 Likes

#9

Great to hear, @Gko! If you could mark the topic as 'Solved', that'll help people with the same problem who find this topic in the future :slightly_smiling_face:

0 Likes
