Having problem converting to numeric from character

Hello people,
I have a problem here again. Look at the following:
Variables: 7
$ V1  <dttm> 2016-04-29 00:00:00, 2016-04-29 01:00:00, 2016-04-29 02:00:00, 201...
$ V2  <chr> "44448570", "44448570", "44448570", "44448570", "44448570", "444485...
$ V3  <dttm> 2016-04-28 22:48:00, 2016-04-28 23:48:00, 2016-04-29 00:48:00, 201...
$ V4  <chr> "57'186", "57'186", "57'186", "57'186", "57'186", "57'188", "57'192...
$ V37 <chr> "Cold Water", "Cold Water", "Cold Water", "Cold Water", "Cold Water...
$ V38 <chr> "47'00", "47'00", "47'00", "47'00", "47'00", "47'00", "47'00", "47'...
$ V39 <chr> "SV", "SV", "SV", "SV", "SV", "SV", "SV", "SV", "SV", "SV", "SV", "...

DHW2 <- data.frame(apply(mydata,2,as.numeric))
Error in apply(mydata, 2, as.numeric) : object 'mydata' not found
DHW2 <- data.frame(apply(DHW,2,as.numeric))
Warning messages:
1: In apply(DHW, 2, as.numeric) : NAs introduced by coercion
2: In apply(DHW, 2, as.numeric) : NAs introduced by coercion
3: In apply(DHW, 2, as.numeric) : NAs introduced by coercion
4: In apply(DHW, 2, as.numeric) : NAs introduced by coercion
5: In apply(DHW, 2, as.numeric) : NAs introduced by coercion
6: In apply(DHW, 2, as.numeric) : NAs introduced by coercion

I am trying to convert column V4, which is currently stored as character, to numeric. The call above gives me these warnings. I searched online forums and read in one place that the conversion fails if the data uses "," instead of "." as the decimal separator. I have also tried changing the separator to a dot, but nothing seems to work. I just want column V4 converted to numeric values.
Any suggestions would be appreciated.
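
A note for readers reproducing this: the warnings are probably not specific to V4. apply() first turns the whole data frame into a character matrix and then runs as.numeric() on every column, so any column holding text that is not a plain number comes back as NA. The sketch below uses a small made-up data frame (`toy` and its values are assumptions modelled on the output above, not the real DHW) to show which columns survive.

```r
# Toy stand-in for DHW; the values are illustrative, not the poster's data.
toy <- data.frame(
  V2  = c("44448570", "44448570"),      # plain digits: converts cleanly
  V4  = c("57'186", "57'188"),          # apostrophe inside the number
  V37 = c("Cold Water", "Cold Water"),  # ordinary text
  stringsAsFactors = FALSE
)

# Same call as in the question, with the warnings silenced so we can
# inspect the result instead.
res <- suppressWarnings(apply(toy, 2, as.numeric))

# Number of NA values produced in each column.
na_per_column <- colSums(is.na(res))
na_per_column
```

Only V2 converts here; V4 and V37 become NA in every row, which is why converting a single column is safer than applying as.numeric() to the whole data frame.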

You can use the sub() function to replace the ' with a decimal point (.) before calling as.numeric().
I used the str() function to show that the variable type of the column has changed.
For my own education, where is ' used as the decimal separator?

DF <- data.frame( V4 =c( "57'186", "57'186", "57'186", "57'186", 
                         "57'186", "57'188", "57'192"), stringsAsFactors = FALSE)
DF
#>       V4
#> 1 57'186
#> 2 57'186
#> 3 57'186
#> 4 57'186
#> 5 57'186
#> 6 57'188
#> 7 57'192
str(DF)
#> 'data.frame':    7 obs. of  1 variable:
#>  $ V4: chr  "57'186" "57'186" "57'186" "57'186" ...
DF$V4 <- as.numeric(sub("'", ".", DF$V4))
str(DF)
#> 'data.frame':    7 obs. of  1 variable:
#>  $ V4: num  57.2 57.2 57.2 57.2 57.2 ...

Created on 2019-10-18 by the reprex package (v0.3.0.9000)
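
One caveat on the question about where ' is used: in Swiss number formatting the apostrophe is normally a thousands separator, not a decimal mark, so "57'186" may mean 57186 rather than 57.186. Which reading is right depends on where the data came from, so the sketch below only shows both options side by side. The variable names are illustrative. gsub() is used for the grouping case because sub() replaces only the first match.

```r
x <- c("57'186", "57'188", "57'192")

# Reading 1: apostrophe is a decimal mark (as in the answer above).
decimal_reading <- as.numeric(sub("'", ".", x, fixed = TRUE))

# Reading 2: apostrophe is a thousands separator, so drop every occurrence.
grouping_reading <- as.numeric(gsub("'", "", x, fixed = TRUE))

decimal_reading   # values near 57.19
grouping_reading  # values near 57190
```

fixed = TRUE treats the pattern as a literal character rather than a regular expression; it is optional here but avoids surprises if the separator is later changed to something like ".".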

But I have around 2 million rows in that variable, so I can't list all of them. Is there a different way of writing it?

You don't need to list them. FJCC is only illustrating the solution with sample data; sub() and as.numeric() work on the whole column at once, so you just have to replace DF$V4 with your actual column (DHW$V4 in your case).
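
As a minimal sketch of that, assuming the data frame is called DHW as in the question (the two rows below are made-up stand-ins for the real 2 million), the conversion is a single assignment to the one column and leaves the other columns untouched:

```r
# Stand-in for the real DHW; only the column names come from the question.
DHW <- data.frame(
  V4  = c("57'186", "57'188"),
  V37 = c("Cold Water", "Cold Water"),
  stringsAsFactors = FALSE
)

# Convert V4 in place; sub() and as.numeric() are vectorised over all rows.
DHW$V4 <- as.numeric(sub("'", ".", DHW$V4, fixed = TRUE))

# Sanity check: count values that still failed to convert.
n_failed <- sum(is.na(DHW$V4))
str(DHW)
```

If n_failed is not zero on the real data, some values contain something other than a single apostrophe and are worth inspecting before going further.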

Thanks a lot, it worked. 🙂
