Hey all.
I want to do a small scraping project.
The core idea is to build a big dataset with 3 columns:
a from currency ($$), a to currency (€€) and a ratio.
Each row will be a different combination of currencies, so that at the end I'll have all the available rates.
My problem:
- When running a loop (both a for loop and map), I get this error:
"Error in open.connection(x, "rb") : HTTP error 404."
- Sometimes, for no apparent reason, map is able to scrape data, but the result is wrong (the same ratio in every row), even though I literally copied my own code that worked fine on a different problem:
  amount from  to    ratio
   <dbl> <chr> <chr> <dbl>
1     10 uyu   uyu   0.984
2     10 iqd   uyu   0.984
3     10 btn   uyu   0.984
4     10 tzs   uyu   0.984
5     10 awg   uyu   0.984
6     10 uyu   iqd   0.984
7     10 iqd   iqd   0.984
100% reproducible code:
library(tidyverse)
library(rvest)
# this is the function to convert currencies.
currency_converter <- function(amount = 1,
                               from = "eur",
                               to = "usd") {
  # Start by assigning the special HTML node:
  node <- ".text-success"
  # Define web link with shortened variable names:
  f <- from
  t <- to
  web_link <- paste0("https://wise.com/gb/currency-converter/", f, "-to-", t, "-rate")
  # Extract table:
  current_value <- rvest::read_html(web_link) %>%
    rvest::html_nodes(node) %>%
    rvest::html_text()
  current_value <- str_split(current_value, pattern = " ")[[1]] %>%
    as.numeric()
  # Return the correct amount:
  # cat("Source: http://www.wise.com\n")
  return(current_value * amount)
}
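Since pmap_dbl() needs exactly one number back per row, the parsing step is a likely weak spot: `str_split(...)[[1]] %>% as.numeric()` returns one element per space-separated token, so it only yields a single value when the node text is a single number. Below is a minimal sketch of a stricter parser. `parse_rate()` is a hypothetical helper (not part of the function above), and it assumes commas are thousands separators, which may not hold for every page.

```r
library(stringr)

# Hypothetical helper: return exactly one numeric value from scraped text.
parse_rate <- function(text) {
  # No matching node on the page: return a missing value instead of failing
  if (length(text) == 0) return(NA_real_)
  # Use only the first node; assume commas are thousands separators
  cleaned <- str_remove_all(text[[1]], ",")
  # Take the first number-like token; NA if there is none
  as.numeric(str_extract(cleaned, "[0-9]+(\\.[0-9]+)?"))
}

parse_rate("0.984 USD")
parse_rate(character(0))
```

Inside currency_converter() this could replace the str_split() step, so that a page without the expected node gives NA rather than an error or a vector of the wrong length.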
# this is how I get all the abbreviations for currencies
website <- "https://www.easymarkets.com/eu/learn-centre/discover-trading/currency-acronyms-and-abbreviations/"
country.node <- "td:nth-child(1)"
currency.node <- "td+ td"
# read the page once and reuse it for both columns
page <- rvest::read_html(website)
country <- page %>%
  rvest::html_nodes(country.node) %>%
  rvest::html_text()
currency <- page %>%
  rvest::html_nodes(currency.node) %>%
  rvest::html_text()
countries_currency <- tibble(country, currency)
# just minor cleaning
# (logical indexing still works if nothing matches; -which() would then drop every row)
countries_currency <- countries_currency[!str_detect(countries_currency$currency, "EURO"), ]
head(countries_currency)
# train data before I go big time
train <- sample_n(countries_currency, 5)
# the idea is to pair each currency with all other currencies,
# and then apply my function and get the ratio
# filtering ratio != 0 (since it is the same currency)
# and then get a big table with all currencies for all other currencies
# this should be a huge table (nrow() ^2)
trainloop <- tibble(amount = 10,
                    from = str_to_lower(rep(train$currency, 5)),
                    to = str_to_lower(rep(train$currency, each = 5)))
head(trainloop)
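The two rep() calls hard-code the sample size of 5. A minimal sketch of the same pairing with tidyr::expand_grid(), which also drops same-currency pairs before any request is made. The three codes below are stand-ins for illustration; in the real script they would come from train$currency.

```r
library(tidyr)
library(dplyr)
library(stringr)

# Stand-in codes for illustration only
codes <- str_to_lower(c("EUR", "USD", "JPY"))

pairs <- expand_grid(from = codes, to = codes) %>%  # every ordered combination
  filter(from != to) %>%                            # drop same-currency pairs up front
  mutate(amount = 10, .before = from)               # same column order as trainloop

pairs
```

With n codes this gives n * (n - 1) rows, so nothing has to be filtered out after scraping.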
trainloop %>%
  mutate(ratio = pmap_dbl(cur_data(), currency_converter))
# does not seem to work
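One way to stop a single 404 from aborting the whole run might be to wrap the converter in purrr::possibly(). This is a sketch under assumptions: `fake_converter()` is a made-up stub that stands in for currency_converter() so the example runs offline, and "xxx" is an invented code that fails the way an unsupported pair might. Whether the real 404s come from unsupported pairs is not verified here.

```r
library(purrr)
library(dplyr)
library(tibble)

# Stub standing in for currency_converter(); errors for the made-up code "xxx"
fake_converter <- function(amount = 1, from = "eur", to = "usd") {
  if (from == "xxx" || to == "xxx") stop("HTTP error 404.")
  amount * 2
}

# possibly() returns `otherwise` instead of raising when the wrapped call errors
safe_converter <- possibly(fake_converter, otherwise = NA_real_)

demo <- tibble(amount = 10,
               from = c("eur", "xxx", "usd"),
               to = c("usd", "eur", "eur"))

# Arguments are passed by position: amount, from, to
result <- demo %>%
  mutate(ratio = pmap_dbl(list(amount, from, to), safe_converter))

result
```

Rows that fail end up with NA in ratio and can be inspected or filtered afterwards, instead of losing the rows that did work.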
keep <- c()
for (i in 1:nrow(trainloop)) {
  for (to_ in trainloop$to) {
    for (from_ in trainloop$from) {
      keep[i] <- currency_converter(1, from = from_, to = to_)
    }
  }
}
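In the nested loop above, the two inner loops run through every to and from value for each i and overwrite keep[i] each time, so every slot ends up holding the result for the last from/to pair, and from_ and to_ are left at those last values afterwards. A minimal sketch of a single loop that uses row i's own values instead. `fake_converter()` is a made-up stub so the example runs without network access.

```r
# Stub standing in for currency_converter(); just labels the pair it was given
fake_converter <- function(amount = 1, from = "eur", to = "usd") {
  paste(from, to, sep = "->")
}

pairs <- data.frame(from = c("eur", "usd"), to = c("usd", "jpy"))

# One result slot per row, filled from that row's own from/to values
keep <- character(nrow(pairs))
for (i in seq_len(nrow(pairs))) {
  keep[i] <- fake_converter(1, from = pairs$from[i], to = pairs$to[i])
}

keep
```

This makes one call per row rather than one call per (row, to, from) combination.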
currency_converter(from = from_, to = to_) # error (from_ and to_ hold the last values from the loop)
currency_converter() # no error
currency_converter(amount = 100, from = "eur", to = "jpy") # no error