# Categorical to numeric

How do I convert the categorical values in columns 2, 3, and 4 to numeric? I tried `transform()` but didn't get what I wanted.

Let's say your data frame is called `DF`. You can see whether the columns are characters or factors with the command

```r
str(DF)
```

To convert characters into integers, ordering them alphabetically, you can use

```r
DF$proto <- as.numeric(as.factor(DF$proto))
```

If they are already factors, you can skip using the as.factor function.
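As a small illustration of how the codes are assigned, here is a sketch using made-up values for `proto` (the values `"tcp"`, `"udp"`, and `"icmp"` are assumptions, not taken from the original data). The codes follow the sorted order of the factor levels, and `levels()` shows which number stands for which category.

```r
# Hypothetical values; the real contents of DF$proto are not shown in the thread
proto <- c("tcp", "udp", "icmp", "tcp")

f <- as.factor(proto)    # levels are sorted alphabetically
codes <- as.numeric(f)   # position of each value among the levels

levels(f)
#> [1] "icmp" "tcp"  "udp"
codes
#> [1] 2 3 1 2

# If the column is already a factor, as.numeric() alone gives the same codes
identical(as.numeric(f), as.numeric(as.factor(f)))
#> [1] TRUE
```

Keeping the output of `levels()` around is handy if you later need to map the numbers back to the original categories.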

Does that get you what you want?


Perfect, but is there a way to do the conversion for all three columns at once rather than separately for each column?

There are a few ways to do this. If you want to change particular columns, you can use the mutate_at function from dplyr and indicate the columns by their numeric position or by name.

```r
DF <- data.frame(A = 1:3, B = c("D", "F", "W"), C = c("U", "I", "U"),
                 D = c("Q", "W", "E"), E = 3:5, stringsAsFactors = FALSE)
str(DF)
#> 'data.frame':    3 obs. of  5 variables:
#>  $ A: int  1 2 3
#>  $ B: chr  "D" "F" "W"
#>  $ C: chr  "U" "I" "U"
#>  $ D: chr  "Q" "W" "E"
#>  $ E: int  3 4 5
library(dplyr)

# Convert a character vector to numeric codes via its factor levels
MakeNum <- function(x) as.numeric(as.factor(x))

# Apply MakeNum to columns 2 through 4 only
DF <- mutate_at(DF, 2:4, MakeNum)
str(DF)
#> 'data.frame':    3 obs. of  5 variables:
#>  $ A: int  1 2 3
#>  $ B: num  1 2 3
#>  $ C: num  2 1 2
#>  $ D: num  2 3 1
#>  $ E: int  3 4 5
```

Created on 2020-02-27 by the reprex package (v0.3.0)

You can also use `mutate_if()` to affect all columns that meet a certain condition, such as all character columns.
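A minimal sketch of the `mutate_if()` approach, reusing the same example data frame and the `MakeNum` helper from the `mutate_at()` example. Here `is.character` is the condition, so only the character columns are converted and the integer columns are left alone.

```r
library(dplyr)

DF <- data.frame(A = 1:3, B = c("D", "F", "W"), C = c("U", "I", "U"),
                 D = c("Q", "W", "E"), E = 3:5, stringsAsFactors = FALSE)

# Convert a character vector to numeric codes via its factor levels
MakeNum <- function(x) as.numeric(as.factor(x))

# Apply MakeNum to every column for which is.character() returns TRUE
DF <- mutate_if(DF, is.character, MakeNum)
str(DF)
#> 'data.frame':    3 obs. of  5 variables:
#>  $ A: int  1 2 3
#>  $ B: num  1 2 3
#>  $ C: num  2 1 2
#>  $ D: num  2 3 1
#>  $ E: int  3 4 5
```

If the columns were factors rather than characters, the condition would be `is.factor` instead.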


Just to add the new kid on the block: you will also be able to do this with `across()` in the forthcoming release of dplyr.

If you have the development version installed, you can run:

```r
library(dplyr)

DF <- data.frame(A = 1:3, B = c("D", "F", "W"), C = c("U", "I", "U"),
                 D = c("Q", "W", "E"), E = 3:5, stringsAsFactors = FALSE)

str(DF)
#> 'data.frame':    3 obs. of  5 variables:
#>  $ A: int  1 2 3
#>  $ B: chr  "D" "F" "W"
#>  $ C: chr  "U" "I" "U"
#>  $ D: chr  "Q" "W" "E"
#>  $ E: int  3 4 5

# Convert a character vector to numeric codes via its factor levels
MakeNum <- function(x) as.numeric(as.factor(x))

# across() selects columns 2 through 4 inside mutate()
DF <- mutate(DF, across(2:4, MakeNum))

str(DF)
#> 'data.frame':    3 obs. of  5 variables:
#>  $ A: int  1 2 3
#>  $ B: num  1 2 3
#>  $ C: num  2 1 2
#>  $ D: num  2 3 1
#>  $ E: int  3 4 5
```

Created on 2020-02-27 by the reprex package (v0.3.0.9001)
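For selecting by condition rather than by position, `across()` can be combined with the `where()` selection helper. This is only a sketch and assumes a dplyr version that provides both `across()` and `where()`; it reuses the same example data and the `MakeNum` helper.

```r
library(dplyr)

DF <- data.frame(A = 1:3, B = c("D", "F", "W"), C = c("U", "I", "U"),
                 D = c("Q", "W", "E"), E = 3:5, stringsAsFactors = FALSE)

# Convert a character vector to numeric codes via its factor levels
MakeNum <- function(x) as.numeric(as.factor(x))

# where(is.character) picks every character column, whatever its position
DF <- mutate(DF, across(where(is.character), MakeNum))

str(DF)
```

With this data the result should match the positional `across(2:4, MakeNum)` call, since columns 2 through 4 are exactly the character columns.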

Edit: Now reflects that `across()` is only in the dev version of dplyr.


I'm curious about `across()` -- what package does it come from?

dplyr, see the link at the bottom of my response, above.

Sorry, I didn't register the link somehow! The reason I asked is that I had assumed from the code that it was either base R or `dplyr`, but I got no documentation from `?across`, so I must already be out of date -- things change so quickly!

It hasn't been released yet; I probably should have mentioned that.

No worries -- I'll have to remember to look to you for news about the latest