Error - Could not resolve host: SOME_URL

Hi R Studio Community,

I've got a problem I'm stumped on. I'm trying to determine if a URL exists first, and then if it does exist, test if it redirects to another URL. I've got it working for the most part, but a certain type of URL is giving me an issue. When I try to use either the GET() or HEAD() functions on this type of URL I get the error...

Error in curl::curl_fetch_memory(url, handle = handle) :
Could not resolve host:

I'll call the example URL "SOME_URL" in the code below because as a new user, apparently I can only put two URLs in my question.

SOME_URL in the code below stands for "aporhopi.org", which is an example of the kind of URL that causes the issue.

I tried to use the http_error() function as described by DataCamp.

But it doesn't seem to work for the pesky SOME_URL.

Any guidance you could offer would be most appreciated.

# This works............
library(httr)
fake_url <- "google.com/fakepagethatdoesnotexist"

# Make the GET request
request_result <- httr::GET(fake_url)

# Check request_result
if(http_error(request_result)){
  warning("The request failed")
} else {
  content(request_result)
}

# This doesn't work.............
library(httr)
fake_url <- "SOME_URL"

# Make the GET request
request_result <- httr::GET(fake_url)

# Check request_result
if(http_error(request_result)){
  warning("The request failed")
} else {
  content(request_result)
}

Try with this:

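library(httr)
url <- "google.com/fakepagethatdoesnotexist"

# Send the request with an identifying user-agent header;
# keep the response in its own variable instead of overwriting url
response <- GET(url, add_headers('user-agent' = 'Gov employment data scraper ([[your email]])'))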

Here is a real example:

library(httr)   # GET() and add_headers() come from httr
library(rvest)
url <- "https://www.tripadvisor.es/Restaurant_Review-g294308-d12354937-Reviews-or00-De_La_Llama-Quito_Pichincha_Province.html"

response <- GET(url, add_headers('user-agent' = 'Gov employment data scraper ([[your email]])'))

# Parse the HTML body of the response
text_html <- response %>% read_html()
text_html
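
One more note on the original error: when the host name cannot be resolved, curl raises an R error before any HTTP response exists, so GET() and HEAD() never return an object for http_error() to inspect. Below is a minimal sketch of one way to handle that, assuming it is acceptable to wrap the request in tryCatch(). The helper name check_url(), its fetch argument, and the 10-second timeout are my own assumptions for illustration and are not part of httr. The redirect check relies on the all_headers element of an httr response, which holds one entry per response in the redirect chain.

```r
library(httr)

# check_url() is a hypothetical helper, not part of httr.
# `fetch` defaults to httr::HEAD; it is a parameter only so that another
# request function (for example httr::GET) can be swapped in.
check_url <- function(url, fetch = httr::HEAD) {
  # Connection-level failures such as "Could not resolve host" are R errors,
  # so capture them instead of letting them stop the script.
  response <- tryCatch(
    fetch(url, httr::timeout(10)),   # 10 seconds is an arbitrary choice
    error = function(e) e
  )

  if (inherits(response, "error")) {
    return(list(
      exists     = FALSE,
      redirected = NA,
      final_url  = NA_character_,
      message    = conditionMessage(response)
    ))
  }

  list(
    # http_error() is TRUE for status codes of 400 and above
    exists     = !httr::http_error(response),
    # more than one set of headers means at least one redirect was followed
    redirected = length(response$all_headers) > 1,
    # URL of the last response in the chain
    final_url  = response$url,
    message    = httr::http_status(response)$message
  )
}
```

With this, check_url("SOME_URL") should hand back a list with exists set to FALSE and the curl message in message, rather than stopping the script. Some servers do not answer HEAD requests properly, so check_url(url, fetch = httr::GET) may be the safer call if the results look odd.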
