SOLVED: Recode variable from text to numbers

recode

#1

Hi guys, I need your help. :sweat:

I have a variable with text answers, and I'd like to recode it into a new variable with a number for each answer.

The text answers are like "Strongly disagree", "Disagree" and so on.
I tried this command:

data$variable.r<-recode(data$variable, "'strongly disagree=0; 'disagree'=1; 'agree'=2; 'strongly agree'=3")

But it doesn't work. :thinking:

Is there a command in R to recode text to numbers?

Thanks very much!


#2

This sounds like a great spot to use case_when from dplyr.

https://dplyr.tidyverse.org/reference/case_when.html


#3

Thanks for your quick reply, @jonspring!

The examples for case_when go the other way around (from number to text), and they only work on tables rather than creating a new variable. I'm stumped.

So the command would be: data$variable.r <- case_when(x %% "strongly disagree" == 0 ~ 0, x %% "disagree" == 0 ~ 1, x %% "agree" == 0 ~ 2, x %% "strongly agree" == 0 ~ 3, TRUE ~ as.character(x))

Right? Which package is case_when in? I can't load it.


#4

(Trying to write this on phone so please ignore typos!)

library(dplyr)
data <-
  data %>%
  mutate(new_x = case_when(
    x == "strongly disagree" ~ 0,
    x == "disagree"          ~ 1,
    x == "agree"             ~ 2,
    x == "strongly agree"    ~ 3,
    TRUE                     ~ NA_real_  # anything else becomes NA
  ))

#5

Sorry, it's still not working, @jonspring :sob:

This is the error I got:

Error: unexpected input in "data$variable.r <- data$variable %>% mutate(new_x = case_when(x == “"


#6

dplyr's mutate function takes the data frame as its first argument, followed by the variable you want to create or change and its formula. The "unexpected input" error itself points at the curly quote (“) in the command you ran; R only accepts straight quotes (" or ').

So to add the variable variable.r to data, you'd write mutate(data, variable.r = your_formula_here) or, alternatively, data %>% mutate(variable.r = your_formula_here).

https://dplyr.tidyverse.org/reference/mutate.html
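
To show that the two forms above do the same thing, here is a minimal sketch. The example data and the helper name to_code are made up for illustration; only mutate, case_when and %>% come from dplyr.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical example data; the column name `variable` follows the thread.
data <- data.frame(
  variable = c("strongly disagree", "disagree", "agree", "strongly agree"),
  stringsAsFactors = FALSE
)

# Hypothetical helper so the mapping is written only once.
to_code <- function(x) {
  case_when(
    x == "strongly disagree" ~ 0,
    x == "disagree"          ~ 1,
    x == "agree"             ~ 2,
    x == "strongly agree"    ~ 3,
    TRUE                     ~ NA_real_  # anything else becomes NA
  )
}

# Form 1: the data frame is passed as the first argument.
res1 <- mutate(data, variable.r = to_code(variable))

# Form 2: the pipe passes the data frame as the first argument for you.
res2 <- data %>% mutate(variable.r = to_code(variable))

identical(res1, res2)
```

Note that in both forms the whole data frame goes into mutate, not a single column such as data$variable.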


#7

Still not working, @jonspring

Same error, and your first command is different to your second. I don't get it. :confused:


#8

The recode function from dplyr takes separate named arguments to describe the replacements (see the description of ... in its help page). In your case, it is enough to change the code to:

data$variable.r<-recode(data$variable, 'strongly disagree'=0, 'disagree'=1, 'agree'=2, 'strongly agree'=3)

Here 'strongly disagree' and the other strings are argument names, and 0 and the other numbers are the argument values.
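
One thing that may still trip this up: the question mentions answers like "Strongly disagree" with a capital letter, while the argument names are lower case, and recode matches strings exactly. A minimal sketch, assuming the answers really are capitalised (the answers vector is made up for illustration):

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical answers, capitalised as described in the question.
answers <- c("Strongly disagree", "Disagree", "Agree", "Strongly agree")

# Lower-case the text first so it matches the argument names exactly.
codes <- recode(
  tolower(answers),
  "strongly disagree" = 0,
  "disagree"          = 1,
  "agree"             = 2,
  "strongly agree"    = 3,
  .default = NA_real_  # anything that matches none of the names becomes NA
)

codes
```

Setting .default explicitly makes it obvious which answers, if any, were not matched.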


#9

Here are some base R solutions. There is no need to bring in packages for such a straightforward task.

Suppose your data is the following:

d <- data.frame(variable = c("strongly disagree", "disagree", "agree", "strongly agree"))

If you don't care which code each string gets, as long as it is consistent, then factors can do the work for you (the codes follow the alphabetical order of the strings):

d$variable.r <- as.integer(as.factor(d$variable))

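If you want to control the mapping, create a named vector and index it by the strings. Converting with as.character makes sure the lookup is by name even if the column is a factor.

mapping <- c("strongly disagree" = 0, "disagree" = 1, "agree" = 2, "strongly agree" = 3)
d$variable.r <- unname(mapping[as.character(d$variable)])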

#10

Thank you very much for your replies, guys! :clap:

I got it with

data$variable.n <- as.numeric(data$variable) :star_struck:

So I got the numbers 1-4, and then I simply recoded them to 0-3. :hugs:
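
A word of caution on this approach: as.numeric only returns codes like these when the column is a factor, and by default the factor levels are sorted alphabetically, so the numbers need not follow the order of the scale. A minimal sketch, assuming the column is a factor (the answers vector is made up for illustration):

```r
# Hypothetical answers stored as a factor with default (alphabetical) levels.
answers <- factor(c("strongly disagree", "agree", "disagree", "strongly agree"))

# Default codes: agree = 1, disagree = 2, strongly agree = 3, strongly disagree = 4.
as.numeric(answers)
#> [1] 4 1 2 3

# Stating the levels in scale order makes the codes follow the scale.
scale_order <- c("strongly disagree", "disagree", "agree", "strongly agree")
codes <- as.integer(factor(as.character(answers), levels = scale_order)) - 1L
codes
#> [1] 0 2 1 3
```

Subtracting 1 shifts the 1-4 codes to the 0-3 range asked for in the first post.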


#11

As your question has been answered, would you mind marking a solution properly? I saw you put it in the title, but you can also mark it using the Discourse solution feature.

Thanks!


#12

SOLUTION

Before:

data$variable.r<-recode(data$variable, "'strongly disagree=0; 'disagree'=1; 'agree'=2; 'strongly agree'=3")

After:

data$variable.n <- as.numeric(data$variable)


#13

recode should be good for your situation! But usually it's used inside mutate (I've never tried using it with a regular assignment). Also, the arguments in recode (as in any function) should be separated by commas, not semi-colons. What kind of error do you get if you try:

data <- data %>%
  mutate(variable.r = recode(variable,
    "strongly disagree" = 0, "disagree" = 1, "agree" = 2, "strongly agree" = 3))

#14

recode takes a vector and gives back a vector, so you can use it without mutate. It works well inside mutate for the same reason. Both approaches work, as the example below shows:

library(dplyr, warn.conflicts = FALSE)
data <- tibble(variable = c("strongly disagree","agree", "disagree", "strongly agree")) 
data <- data %>%
  mutate(variable.r = recode(variable,
                             "strongly disagree" = 0, "disagree" = 1, "agree" = 2, "strongly agree" = 3))
#> Warning: package 'bindrcpp' was built under R version 3.4.4
data$variable.r2 <- recode(data$variable,
                           "strongly disagree" = 0, "disagree" = 1, "agree" = 2, "strongly agree" = 3) 
data
#> # A tibble: 4 x 3
#>   variable          variable.r variable.r2
#>   <chr>                  <dbl>       <dbl>
#> 1 strongly disagree          0           0
#> 2 agree                      2           2
#> 3 disagree                   1           1
#> 4 strongly agree             3           3

Created on 2018-07-07 by the reprex package (v0.2.0).