Scraping a website returns 403

EDIT:
This can be solved by passing a user agent (a User-Agent header) to the check.
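For anyone landing here later, a minimal sketch of that fix with curl. The User-Agent string and the `head_status` helper name are assumptions made for illustration. Whether a given string is accepted depends on the server, so treat this as a starting point rather than a guaranteed fix.

```shell
#!/bin/sh
# Assumed User-Agent value: a browser-like string. Nothing in the thread
# says which value the server accepts, so adjust as needed.
UA="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"

# head_status URL (hypothetical helper): print the status code of a HEAD request.
#   -s  silences the progress meter
#   -I  sends a HEAD request and prints the response headers
#   -A  sets the User-Agent header
head_status() {
  curl -sI -A "$UA" "$1" \
    | tr -d '\r' \
    | awk 'toupper($1) ~ /^HTTP\// { code = $2 } END { print code }'
}
```

Calling `head_status https://support.rstudio.com/hc/en-us` prints just the status code, which can be compared with the result of the same request made without `-A`.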

When I run the following, I get a 403:

$ curl -I https://support.rstudio.com/hc/en-us | grep http.2 -i
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
HTTP/2 403

Is that supposed to happen? If not, can it be fixed?

(This happens on other sites as well.)

Can you give a short reprex for what you're running here?

"The HTTP 403 Forbidden response status code indicates that the server understands the request but refuses to authorize it." (Source: 403 Forbidden - HTTP | MDN)

I think I've received this error in a number of ways: querying a domain too many times while scraping, or pinging an API without the right credentials.

I tried to scrape the status of a support response.
It came back with a 403, but it also returned a huge payload.

The URL I was trying to check for validity is https://support.rstudio.com/hc/en-us/articles/115011846747-Using-Themes-in-the-RStudio-IDE#dark-theme.
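To see the status code and the size of the payload side by side, something like the sketch below may help. The `status_and_size` helper name is hypothetical. It relies on curl's `-w` write-out variables `http_code` and `size_download`.

```shell
#!/bin/sh
# status_and_size URL (hypothetical helper): print "<status code> <body bytes>"
# for a GET request.
#   -s            silences the progress meter
#   -o /dev/null  discards the response body
#   -w            prints the chosen variables after the transfer finishes
status_and_size() {
  curl -s -o /dev/null -w '%{http_code} %{size_download}\n' "$1"
}
```

Running `status_and_size` on the URL above shows whether the 403 really arrives with a large body. If it does, the body is possibly an error or challenge page rather than the article itself, though that is a guess and not something confirmed in this thread.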

This is not an rmarkdown issue but an issue with support.rstudio.com as a whole.

PS: I was able to receive the data from the site without logging in, so it's not a credentials issue.
