Deal with RCurl Connection Timeout

I need help from connectivity experts. I have been stuck for weeks and can't figure out why this RCurl snippet keeps timing out in my case.

links <- c("https://www.investing.com/currencies/us-dollar-index-historical-data",
           "https://www.investing.com/rates-bonds/u.s.-10-year-bond-yield-historical-data")
get_string <- c()
for (x in 1:length(links)) {
  con <- curl(links[x], "rb")
  open(con)
  html_string <- readLines(con, n = 3000)
  idx_string <- which(html_string %like% "<span class=\"arial_26 inlineblock pid-")
  price_head <- unlist(strsplit(unlist(strsplit(html_string[idx_string], ">"))[2], "<"))[1]
  #get_string <- c(get_string, html_string[idx_string])
  get_string <- c(get_string, price_head)
  close(con)
}
return(get_string)

I had to wrap this snippet in while() and tryCatch() before I saw it finally connect, after 5-10 timeout errors:

Error in open.connection(con, open = mode) : 
  Failed to connect to www.investing.com port 443: Timed out
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Error in open.connection(con, open = mode) : 
  Failed to connect to www.investing.com port 443: Timed out
In addition: There were 11 warnings (use warnings() to see them)
Error in open.connection(con, open = mode) : 
  Failed to connect to www.investing.com port 443: Timed out
In addition: Warning messages:
1: In open.connection(con) : connection is already open
2: In open.connection(con) : connection is already open
Error in open.connection(con, open = mode) : 
  Failed to connect to www.investing.com port 443: Timed out
In addition: Warning messages:
1: In open.connection(con) : connection is already open
2: In open.connection(con) : connection is already open
3: In open.connection(con) : connection is already open
Error in open.connection(con, open = mode) : 
  Failed to connect to www.investing.com port 443: Timed out
In addition: There were 11 warnings (use warnings() to see them)
Error in open.connection(con, open = mode) : 
  Failed to connect to www.investing.com port 443: Timed out
In addition: Warning messages:
1: In open.connection(con) : connection is already open
2: In open.connection(con) : connection is already open
3: In open.connection(con) : connection is already open
4: In open.connection(con) : connection is already open
5: In open.connection(con) : connection is already open
6: In open.connection(con) : connection is already open
7: In open.connection(con) : connection is already open
Done Streaming Data for US CPI Model #--> This is Success Message

This is what I have tried on my own so far:

  1. Pinging the target host works fine.

  2. Firewall inbound and outbound rules: I have allowed port 443.

  3. When I switch my connection to a mobile hotspot, the issue doesn't occur; it connects on the first attempt.

  4. I have flushed the DNS from the Windows CMD:

    Type "ipconfig /flushdns" and press Enter.
    Type "ipconfig /registerdns" and press Enter.
    Type "ipconfig /release" and press Enter.
    Type "ipconfig /renew" and press Enter.
    Type "netsh winsock reset" and press Enter.

  5. Two weeks ago the connection was fine (it succeeded on the first attempt); the issue started the following week.

  6. I can browse the website directly in the browser without issues.

  7. I wanted to set a longer timeout, but I didn't understand how.
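On the longer timeout: with the curl package this can probably be done through a handle. The following is only a sketch, assuming the libcurl options connecttimeout and timeout (both in seconds) passed to new_handle(); the helper name read_with_timeout and its default values are made up for illustration and are not from the original post.

```r
library(curl)

# Hypothetical helper: read the first n lines of a page, allowing more time
# than the defaults for both the connection phase and the whole transfer.
read_with_timeout <- function(url, connect_secs = 30, total_secs = 120, n = 3000) {
  # connecttimeout: seconds allowed to establish the connection
  # timeout: seconds allowed for the entire request
  h <- new_handle(connecttimeout = connect_secs, timeout = total_secs)
  con <- curl(url, open = "rb", handle = h)  # already open, no open(con) needed
  on.exit(close(con))
  readLines(con, n = n, warn = FALSE)
}
```

Calling read_with_timeout(links[1]) would then stand in for the curl()/open()/readLines()/close() sequence in the loop. A longer timeout only helps if the server answers slowly; it will not help if the connection is being refused or dropped.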

There may be other things going on, but it looks like you need to fix the pattern-matching step (the which() / %like% line) first, because it comes up empty.

library(curl)
#> Using libcurl 7.81.0 with OpenSSL/3.0.2
library(DescTools)

con <- curl("https://www.investing.com/currencies/us-dollar-index-historical-data", "rb")
open(con)
#> Warning in open.connection(con): connection is already open
html_string <- readLines(con, n = 3000)
grep("ariel_26",html_string,value = TRUE)
#> character(0)
close(con)

Created on 2022-11-17 by the reprex package (v2.0.1)


Thanks for pointing that out, @technocrat. Yes, I have replaced which() with grep() as you suggested; with which() it sometimes didn't find the line. Unfortunately, the connection timeout still persists, so I still need to find the root cause.
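For reference, the grep()-based extraction discussed above might look something like this. It is a sketch run on a made-up line of HTML: the pid number, the id attribute, and the price are invented, and extract_price is a hypothetical helper name. Whether the live page still contains this marker is a separate question.

```r
# Hypothetical helper: find the first line containing the marker and return
# the text between the end of that tag and the next "<".
extract_price <- function(html_lines, marker = "arial_26 inlineblock pid-") {
  hits <- grep(marker, html_lines, fixed = TRUE, value = TRUE)
  if (length(hits) == 0) return(NA_character_)  # marker not found on the page
  # Cut the line at the marker, then capture what follows the closing ">".
  start <- regexpr(marker, hits[1], fixed = TRUE)
  rest <- substring(hits[1], start)
  sub("^[^>]*>([^<]*).*$", "\\1", rest)
}

# Invented sample input, shaped like the span the original code searches for.
sample_lines <- c(
  "<div>",
  "<span class=\"arial_26 inlineblock pid-0000-last\" id=\"last_last\">106.28</span>",
  "</div>"
)
extract_price(sample_lines)
```

Returning NA when nothing matches makes an empty result easy to tell apart from a connection failure.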

Another thing to check is whether the site throttles requests that are too frequent (which possibly can be fixed with a short delay loop) or exceed some hourly or daily limit. See if they have an API, which might document this.

ISP throttling is a new thing for me to consider! Thank you, I will check this out ASAP.


Probably not your ISP, but the target site. Here's an example for how it works with the GitHub API, but it looks like investing.com does not have an API. I'd try building in a short delay between queries.


The connection is fine now. I didn't see any ISP throttling when I ran a connection health and port check, but I did notice that my ISP settings defaulted to a mixed wireless mode (IEEE 802.11 a + b + n) with automatic bandwidth. I changed this to a static bandwidth of 20 Mbps and IEEE 802.11n only, and I also set up port forwarding for ports 80 and 443. I am not sure what is going on, but since then the connection timeout has not appeared again. @technocrat, thanks for your assistance and feedback!

Unless you were using a wireless connection with very poor quality and a lot of external interference, this is very unlikely to be related to your issue. The first change only affects the wireless standard your access point is using, and port forwarding only affects inbound traffic on those ports, so it is not applicable either.

Yes, actually I would like to know more about what to do when poor connection quality is not the cause, in case I or someone else runs into the same timeout error in the future. As @technocrat mentioned, it may have been the target site, but since this site (investing.com) doesn't have an API, I am not sure how to work around it either.

If you send requests too fast, the site might block you to prevent a DoS attack. Try adding some sleep time between requests with Sys.sleep(time).
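One way to combine the pause with the retry loop from the original post is sketched below. The function name fetch_politely, its arguments, and the default pause lengths are illustrative assumptions; fetch stands for any function that takes one URL and returns the page content.

```r
# Hypothetical helper: fetch each URL in turn, pausing between requests and
# retrying a failed request a few times with a growing delay.
fetch_politely <- function(urls, fetch, pause = 2, max_tries = 3) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    for (attempt in seq_len(max_tries)) {
      res <- tryCatch(fetch(urls[i]), error = function(e) e)
      if (!inherits(res, "error")) break
      # Wait longer after each failure, but not after the last attempt.
      if (attempt < max_tries) Sys.sleep(pause * attempt)
    }
    if (inherits(res, "error")) {
      warning("giving up on ", urls[i], ": ", conditionMessage(res))
      res <- NA
    }
    results[[i]] <- res
    # Pause between consecutive requests to avoid hammering the site.
    if (i < length(urls)) Sys.sleep(pause)
  }
  results
}
```

A suitable pause depends on the site; a couple of seconds is only a starting guess, and no delay will help if the site blocks scripted clients outright.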


Sure, I will keep that in mind. It's good to go for now.
