Issue with converting character to numeric

Hello,

I have some columns that get converted to NA when I try to convert them to numeric. Columns C and D hold numeric values, but they are replaced with NA when I use as.numeric. The original data is huge, and I have several columns for different classes (Class 1 - 5) from the column "class". I would like to convert all the data to numeric, leaving the date and class columns as they are. Is there any way to convert everything to numeric correctly, without the values being replaced with NA?

Thanks for your help!

df <- data.frame(
    stringsAsFactors = FALSE,
    date = c("2000-01-01",
             "2000-04-01","2000-07-01","2000-10-01"),
    class = c("Class 5", "Class 5", "Class 5", "Class 5"),
    A = c(11882, 14281, 8846, 8432),
    B = c("89.60", "92.00", "92.30", "92.60"),
    C = c("166,834,223",
          "168,695,652","168,387,103","169,647,762"),
    D = c("21,344,544",
          "21,298,930","21,236,265","21,471,803")
)

How about this?

library(readr)
library(dplyr)

df <- data.frame(
    stringsAsFactors = FALSE,
    date = c("2000-01-01",
             "2000-04-01","2000-07-01","2000-10-01"),
    class = c("Class 5", "Class 5", "Class 5", "Class 5"),
    A = c(11882, 14281, 8846, 8432),
    B = c("89.60", "92.00", "92.30", "92.60"),
    C = c("166,834,223",
          "168,695,652","168,387,103","169,647,762"),
    D = c("21,344,544",
          "21,298,930","21,236,265","21,471,803")
)


df %>% 
    mutate(across(B:D, parse_number))
#>         date   class     A    B         C        D
#> 1 2000-01-01 Class 5 11882 89.6 166834223 21344544
#> 2 2000-04-01 Class 5 14281 92.0 168695652 21298930
#> 3 2000-07-01 Class 5  8846 92.3 168387103 21236265
#> 4 2000-10-01 Class 5  8432 92.6 169647762 21471803

Created on 2021-03-19 by the reprex package (v1.0.0.9002)
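
For reference, a minimal base R sketch of why as.numeric() produced NA on columns like C and D, and how stripping the commas first avoids it. The vector x is an illustrative sample taken from column C, and gsub() is shown here only as an alternative to parse_number, not as the method used above.

```r
# Strings with thousands separators are not valid numeric literals in R,
# so as.numeric() returns NA (normally with a coercion warning).
x <- c("166,834,223", "168,695,652")
raw <- suppressWarnings(as.numeric(x))
raw
#> [1] NA NA

# Removing the commas first lets the conversion succeed.
clean <- as.numeric(gsub(",", "", x, fixed = TRUE))
clean
#> [1] 166834223 168695652
```

parse_number handles the grouping mark for you, which is why it is the shorter option when readr is already loaded.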

Thank you @andresrcs!

It works for B:D, but it doesn't work for A:D. Because I have many variables, some character and some numeric, I was thinking of using all the columns except class and date. I tried your method with where(is.character), shown below, and it strips the word "Class" from the class column (for example, it keeps 5 for "Class 5"), but the rest of the columns seem to work fine. Maybe I can just add the word "Class" back in afterwards, unless there is another way.

df <- df %>%
    mutate(across(where(is.character), parse_number))
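
A small sketch of why the word "Class" disappears here: where(is.character) selects every character column, and that includes date and class as well as the columns holding numbers. The one-row df below is a cut-down, illustrative copy of the data in this thread, and chr_cols is a name made up for the example; it assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Illustrative one-row version of the data from this thread
df <- data.frame(
    stringsAsFactors = FALSE,
    date = "2000-01-01",
    class = "Class 5",
    A = 11882,
    B = "89.60",
    C = "166,834,223"
)

# Names of the columns that where(is.character) would pick up;
# this is expected to include date and class along with B and C
chr_cols <- df %>%
    select(where(is.character)) %>%
    names()
chr_cols
```

Since date and class are in that set, they are passed to parse_number along with everything else, so they need to be excluded explicitly.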

You can do something like this

library(readr)
library(dplyr)

df <- data.frame(
    stringsAsFactors = FALSE,
    date = c("2000-01-01",
             "2000-04-01","2000-07-01","2000-10-01"),
    class = c("Class 5", "Class 5", "Class 5", "Class 5"),
    A = c(11882, 14281, 8846, 8432),
    B = c("89.60", "92.00", "92.30", "92.60"),
    C = c("166,834,223",
          "168,695,652","168,387,103","169,647,762"),
    D = c("21,344,544",
          "21,298,930","21,236,265","21,471,803")
)


df %>% 
    mutate(across(c(where(is.character), -date, -class), parse_number))
#>         date   class     A    B         C        D
#> 1 2000-01-01 Class 5 11882 89.6 166834223 21344544
#> 2 2000-04-01 Class 5 14281 92.0 168695652 21298930
#> 3 2000-07-01 Class 5  8846 92.3 168387103 21236265
#> 4 2000-10-01 Class 5  8432 92.6 169647762 21471803

Created on 2021-03-19 by the reprex package (v1.0.0.9002)
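
The same selection can also be written with tidyselect's boolean operators, which some may find easier to read. This is only a sketch, assuming dplyr 1.0.0 or later and readr are installed; the two-row df is a cut-down, illustrative copy of the data above and out is a name made up for the example.

```r
library(readr)
library(dplyr)

# Illustrative two-row version of the data from this thread
df <- data.frame(
    stringsAsFactors = FALSE,
    date = c("2000-01-01", "2000-04-01"),
    class = c("Class 5", "Class 5"),
    A = c(11882, 14281),
    B = c("89.60", "92.00"),
    C = c("166,834,223", "168,695,652")
)

# Character columns, but not date or class
out <- df %>%
    mutate(across(where(is.character) & !c(date, class), parse_number))

# Check the resulting column types
sapply(out, class)
```

Column A is already numeric, so it is not selected and is left untouched.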

Perfect! Thank you @andresrcs !
