rvest, function, map_dfr, url does not exist in current working directory

The reprex below is an attempt to scrape some college football data from 10 web pages and clean them up with a clean_table() function.
There appears to be an environment problem: I create the URLs, which become values in my Global environment, and the function, which also appears there, but the two do not seem to connect with each other in map_dfr.
This is probably something about environments that I don't understand, or it may be a silly typo somewhere.
Suggestions?
thanks!

library(rvest)
#> Loading required package: xml2
library(tidyverse)
library(RVerbalExpressions)
library(janitor)
#> 
#> Attaching package: 'janitor'
#> The following objects are masked from 'package:stats':
#> 
#>     chisq.test, fisher.test
library(purrr)

url_wk9 <- "https://www.espn.com/college-football/story/_/id/27940007/sp+-rankings-week-9-lsu-catch-ohio-state-alabama"

url_wk8 <- 'https://www.espn.com/college-football/story/_/id/27569383/sp+-rankings-week-2-oregon-jumps-10-spots'

url_wk7 <- "https://www.espn.com/college-football/story/_/id/27836054/sp+-rankings-week-7-ohio-state-takes-top-spot"

url_wk6 <- "https://www.espn.com/college-football/story/_/id/27677435/sp+-rankings-week-4-ohio-state-move"

url_wk5 <- 'https://www.espn.com/college-football/story/_/id/27727611/sp+-rankings-week-5-ou-ohio-state-keep-climbing'

url_wk4 <- "https://www.espn.com/college-football/story/_/id/27677435/sp+-rankings-week-4-ohio-state-move"

url_wk3 <- 'https://www.espn.com/college-football/story/_/id/27624402/sp+-rankings-week-3-georgia-moves-ahead-clemson'

url_wk2 <- 'https://www.espn.com/college-football/story/_/id/27569383/sp+-rankings-week-2-oregon-jumps-10-spots'

url_wk1 <- 'https://www.espn.com/college-football/story/_/id/27529950/sp+-rankings-week-1-wisconsin-makes-leap'

url_pre <- 'https://www.espn.com/college-football/story/_/id/27466530/preseason-sp+-rankings-alabama-tops-clemson'

vec <- paste0("url_wk", 1:9)

clean_table <- function(url) {
  url %>% 
    xml2::read_html() %>% 
    html_node("table") %>% 
    html_table() %>% 
    separate(`Team (Record)`, 
             into = c('team', 'record'), 
             sep = " \\(") %>% 
    separate(team, into = c('rank', 'team'), 
             sep =  "(?<=\\d)\\. " ) %>% 
    mutate(record = str_replace(record, 
                                pattern = "\\)", 
                                replacement = "")) %>% 
    separate(Offense, 
             into = c('offense', 'o_rank'), 
             sep = " ") %>% 
    separate(Defense, 
             into = c('defense', 'd_rank'), 
             sep = " ")  %>% 
    clean_names() %>% 
    mutate(o_rank = parse_number(o_rank),
           d_rank = parse_number(d_rank))
}

sp_table <- map_dfr(vec, clean_table)
#> Error: 'url_wk1' does not exist in current working directory ('C:/Users/phiggins/AppData/Local/Temp/RtmpcxGw7R/reprex1ab44a1437d6').

Created on 2019-10-28 by the reprex package (v0.3.0)

The problem is that you are passing the string "url_wk1" to clean_table(), not the variable url_wk1. read_html() therefore treats "url_wk1" as a file path and looks for a file of that name in the working directory, when you want it to fetch 'https://www.espn.com/college-football/story/_/id/27529950/sp+-rankings-week-1-wisconsin-makes-leap'. I would define vec so that it holds the URLs themselves, like this:

vec <- c("https://www.espn.com/college-football/story/_/id/27940007/sp+-rankings-week-9-lsu-catch-ohio-state-alabama",
         'https://www.espn.com/college-football/story/_/id/27569383/sp+-rankings-week-2-oregon-jumps-10-spots',
         "https://www.espn.com/college-football/story/_/id/27836054/sp+-rankings-week-7-ohio-state-takes-top-spot",
         "https://www.espn.com/college-football/story/_/id/27677435/sp+-rankings-week-4-ohio-state-move",
         'https://www.espn.com/college-football/story/_/id/27727611/sp+-rankings-week-5-ou-ohio-state-keep-climbing',
         "https://www.espn.com/college-football/story/_/id/27677435/sp+-rankings-week-4-ohio-state-move",
         'https://www.espn.com/college-football/story/_/id/27624402/sp+-rankings-week-3-georgia-moves-ahead-clemson',
         'https://www.espn.com/college-football/story/_/id/27569383/sp+-rankings-week-2-oregon-jumps-10-spots',
         'https://www.espn.com/college-football/story/_/id/27529950/sp+-rankings-week-1-wisconsin-makes-leap',
         'https://www.espn.com/college-football/story/_/id/27466530/preseason-sp+-rankings-alabama-tops-clemson')
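
If you would rather keep the separate url_wk* variables, another option is to look their values up by name before mapping. This is only a minimal sketch: the two example.com URLs are hypothetical stand-ins for your real ones, and it assumes the variables live in the environment where the code runs (the Global environment in your case).

```r
# Hypothetical stand-ins for the url_wk1 ... url_wk9 variables in the question
url_wk1 <- "https://example.com/week-1"
url_wk2 <- "https://example.com/week-2"

# A character vector of variable *names*, as in the original vec
var_names <- paste0("url_wk", 1:2)

# mget() looks up each name and returns a named list of the values;
# unlist() flattens that list into a named character vector of URLs
urls <- unlist(mget(var_names))

urls
```

Passing urls instead of vec to map_dfr() should then hand clean_table() real URLs. Because urls is named, something like map_dfr(urls, clean_table, .id = "week") would presumably also record which variable each row came from.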