Hello, I'm planning a short workshop on web scraping and want the students to be able to use RStudio Cloud. When trying to connect to the site of interest to check the robots.txt file, I repeatedly get this error:
Error in curl::curl_fetch_memory(url, handle = handle) : Failed to connect to www.fanfiction.net port 443: Connection timed out
The second line is the most important as it is common to a few errors I've received when trying different ways to connect.
This doesn't happen when using RStudio Desktop. Is it a proxy thing? If so, what does that mean, and what would a solution look like?
I was able to reproduce your issue. After investigating further from a few different AWS regions and availability zones, it appears that www.fanfiction.net (or their parent host, www.fictionpress.com) is blocking all or nearly all AWS-based IP addresses. I'm not sure why they would block these addresses so broadly, but unfortunately it isn't something we can have much of an effect on, since the block appears to extend far beyond the few addresses we manage.
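If it helps for the workshop, here is a rough sketch of how you could check from inside an R session whether a site is reachable at all, rather than waiting for the default timeout. It uses the same curl::curl_fetch_memory() call from your error message. The helper name check_reachable() and the control URL are my own assumptions for illustration, not anything built into RStudio Cloud or the curl package.

```r
library(curl)

# Hypothetical helper (name is an assumption): try to fetch a URL with a short
# timeout and return the outcome as a list instead of raising an error.
check_reachable <- function(url, timeout_secs = 10) {
  # connecttimeout and timeout are standard libcurl options, in seconds
  handle <- new_handle(connecttimeout = timeout_secs, timeout = timeout_secs)
  tryCatch({
    res <- curl_fetch_memory(url, handle = handle)
    list(url = url, reachable = TRUE,
         status = res$status_code, error = NA_character_)
  }, error = function(e) {
    # Timeouts and refused connections land here
    list(url = url, reachable = FALSE,
         status = NA_integer_, error = conditionMessage(e))
  })
}

# Compare the target site against a control site (the control URL is an
# assumption; any site you know to be reachable will do).
if (interactive()) {
  sites <- c("https://www.fanfiction.net/robots.txt",
             "https://www.r-project.org/robots.txt")
  for (site in sites) {
    r <- check_reachable(site)
    cat(r$url, "reachable:", r$reachable, "status:", r$status, "\n")
  }
}
```

If the control site comes back reachable and the target does not, the problem is more likely a block on the remote side than a proxy or network setting in your session. Running the same check on RStudio Desktop gives you a second point of comparison.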