I am trying to scrape a table from the web, and ultimately convert a column from a character string to numeric.
My complete setup:
library(readxl)
library(janitor)
library(tidyverse)
library(gt)
library(rvest)
library(reprex)
Scraping the table didn't produce any errors, but here's the code I used:
table_costs_messy <- list()
table_costs_messy <- read_html("https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2426671/") %>%
  html_node("table") %>%
  html_table(header = TRUE) %>%
  clean_names() %>%
  slice(-1) %>%
  slice(1:17)
Where the problem starts:
I've scraped a table and cleaned it, and then tried to remove special characters from two columns so that I can convert those values to integers later. This part of the code does not throw any errors, but I suspect it is where my problem starts.
table_costs_messy[table_costs_messy == "not applicable"] <- NA
colnames(table_costs_messy)[2:3] <- c("xF", "xM")
gsub(x = table_costs_messy, pattern = "\\$|\\*", "")
Now that I've (supposedly) removed the unwanted characters, I'd like to reshape the data into a new table. This is where I get an error suggesting that the special characters were not actually removed.
table_costs_clean <- table_costs_messy %>%
  pivot_longer(
    cols = starts_with("x"),
    names_to = "Sex",
    names_prefix = "x",
    values_to = "Cost",
    values_ptypes = list(Cost = integer()),
    values_drop_na = FALSE
  )
table_costs_clean
#> Error: Lossy cast from <character> to <integer>.
#> * Locations: 1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
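One thing I suspect, sketched here on a small made-up data frame rather than the scraped table: `gsub()` returns a new value instead of modifying its input, so calling it without assigning the result would leave `xF` and `xM` unchanged, and they would still be character columns when they reach `pivot_longer()`. The data frame `costs` and its values below are hypothetical stand-ins, and I'm assuming the real columns contain only digits plus characters like `$`, `*`, and possibly commas.

```r
# Hypothetical stand-in for the scraped table (values invented for illustration)
costs <- data.frame(
  item = c("a", "b", "c"),
  xF   = c("$1,200*", "not applicable", "$35"),
  xM   = c("$900", "$1,050*", "not applicable"),
  stringsAsFactors = FALSE
)

# Same NA replacement as in the question
costs[costs == "not applicable"] <- NA

# Apply gsub() column by column and assign the result back.
# "[^0-9]" drops everything that is not a digit (assumes whole-number costs).
cost_cols <- c("xF", "xM")
costs[cost_cols] <- lapply(costs[cost_cols], function(col) {
  as.integer(gsub("[^0-9]", "", col))
})

str(costs)
```

If this is the cause, the columns would already be integers before reshaping, so `pivot_longer()` should no longer need to cast from character.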
Thanks very much for any help or advice you might have!
Created on 2020-03-06 by the reprex package (v0.3.0)