# How to convert multiple factor variables to numeric in R?

I have a data frame and most of the variables are factors. I want to convert them to numeric.
I have applied this code:

``````
df[] <- lapply(df, function(x) as.numeric(as.character(x)))
``````

But it changed some of the values.

Here is the data:

``````
'data.frame':	447195 obs. of  22 variables:
 $ WEIGHT    : num  10 10 8.75 8.75 8.75 8.75 8.75 8.75 8.75 10 ...
 $ URBAN_RURA: int  2 2 2 2 2 2 2 2 2 2 ...
 $ SEX       : Factor w/ 2 levels "1","2": 1 2 1 2 2 1 1 1 2 1 ...
 $ AGE       : Factor w/ 99 levels "0","1","10","11",..: 75 67 41 25 4 68 24 2 47 49 ...
 $ BIRTHPROVI: Factor w/ 34 levels "11","12","13",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ BIRTHDISTR: Factor w/ 38 levels "1","10","11",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ PROV5     : Factor w/ 33 levels "11","12","13",..: 1 1 1 1 1 1 33 33 1 1 ...
 $ DISTRICT5 : Factor w/ 37 levels "1","10","11",..: 1 1 1 1 1 1 NA NA 1 1 ...
 $ SPEAK_INDO: Factor w/ 2 levels "1","2": 1 1 1 1 1 1 NA NA 1 1 ...
 $ EDUCATION : Factor w/ 10 levels "0","1","2","3",..: 2 2 5 2 2 2 NA NA 2 3 ...
 $ LATIN_LITE: Factor w/ 2 levels "1","2": 1 2 1 1 1 1 NA NA 2 1 ...
 $ NUMBER    : int  1 1 30 30 30 30 30 30 30 31 ...
 $ DATEBORN  : Factor w/ 31 levels "1","10","11",..: 1 1 2 20 12 17 14 4 4 18 ...
 $ MONTHBORN : Factor w/ 12 levels "1","10","11",..: 10 10 2 10 9 3 5 5 5 7 ...
 $ YEARBORN  : Factor w/ 99 levels "1912","1913",..: 22 29 54 68 87 91 96 98 49 47 ...
 $ PROVINCE  : Factor w/ 1 level "11": 1 1 1 1 1 1 1 1 1 1 ...
 $ DISTRICT  : Factor w/ 23 levels "01","02","03",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ SUB_DISTRI: Factor w/ 60 levels "010","011","012",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ VILLAGE   : Factor w/ 122 levels "001","002","003",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ INDUSTRY  : Factor w/ 19 levels "1","10","11",..: NA NA 1 NA NA NA NA NA NA 14 ...
 $ PERSNUM   : Factor w/ 320 levels "1","10","100",..: 1 112 1 112 223 255 266 277 288 1 ...
 $ RELAT     : Factor w/ 10 levels "0","1","2","3",..: 2 3 2 3 4 4 4 4 8 2 ...
``````

Thanks for helping me.

Which columns are not working as expected? If I make a small data set, all of the changes using your code are just what I expect.

``````
> DF <- data.frame(Weight = c(8.75,10,7.65,11.23),
+                  Age = c("1","10","2","11"),
+                  SUB = c("010", "011", "013", "014"),stringsAsFactors = TRUE)
>
> DF[] <- lapply(DF, function(x) as.numeric(as.character(x)))
> DF
  Weight Age SUB
1   8.75   1  10
2  10.00  10  11
3   7.65   2  13
4  11.23  11  14
``````
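For reference, here is a minimal sketch of why the `as.character()` step matters and how to restrict the conversion to factor columns only. The toy data frame and its column names are made up for illustration and are not the original poster's data.

```r
# Hypothetical toy data; column names are made up for illustration.
DF <- data.frame(Weight = c(8.75, 10, 7.65, 11.23),
                 Age    = c("1", "10", "2", "11"),
                 SUB    = c("010", "011", "013", "014"),
                 stringsAsFactors = TRUE)

# as.numeric() on a factor returns the internal level codes, not the labels.
codes  <- as.numeric(DF$Age)
# Going through as.character() recovers the numbers the labels spell out.
values <- as.numeric(as.character(DF$Age))

# Convert only the factor columns; numeric and integer columns are left alone.
is_fac <- vapply(DF, is.factor, logical(1))
DF[is_fac] <- lapply(DF[is_fac], function(x) as.numeric(as.character(x)))

str(DF)
```

Note that a label such as "010" becomes the number 10, so leading zeros are lost. If columns like DISTRICT or VILLAGE are identifier codes rather than quantities, it may be safer to keep them as character or factor.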

There are three data frames, each with more than 23.6 million rows and around 40 columns. When I converted the factors to numeric, the missing values (NA) caused problems: I got a warning.

If you know that there are NAs in the data, the warning is not a problem. Warnings are different from errors. A warning alerts you to something that might need your attention; once you understand where it comes from, you can decide whether it matters. An error is always a problem. In the example below, I put NAs in the data to produce the warning. You can see that the non-NA values are all correct.

``````
DF <- data.frame(Weight = c(8.75,10,NA,11.23),
                 Age = c("1","10","NA","11"),
                 SUB = c("010", NA, "013", "014"),stringsAsFactors = TRUE)
DF[] <- lapply(DF, function(x) as.numeric(as.character(x)))
Warning message:
In FUN(X[[i]], ...) : NAs introduced by coercion
DF
  Weight Age SUB
1   8.75   1  10
2  10.00  10  NA
3     NA  NA  13
4  11.23  11  14
``````
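If you want to confirm where the coercion NAs come from, one option is to look at the factor levels rather than the rows, since a factor stores each distinct label only once. The sketch below assumes a made-up column with one non-numeric label ("unknown") and one true NA. It also converts through the levels, which should be cheaper than `as.character()` on millions of rows because only the distinct labels are parsed.

```r
# Hypothetical column with one non-numeric label ("unknown") and one true NA.
x <- factor(c("1", "10", "unknown", NA, "11"))

# Convert each distinct level once. suppressWarnings() is used here only
# because the levels that fail to convert are inspected on the next line.
lv_num <- suppressWarnings(as.numeric(levels(x)))
bad_levels <- levels(x)[is.na(lv_num)]
bad_levels

# as.integer(x) gives the level codes, so indexing lv_num by them maps every
# row to its numeric value; a true NA in x stays NA.
x_num <- lv_num[as.integer(x)]
x_num
```

If `bad_levels` is empty, every label is a valid number and any NA in the result was already missing in the input.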

@FJCC Thank you for your explanation.
