Hi all, Here is something I sometimes encounter in my workflow that I haven't found a comfortable solution to. Let's say I am working with Likert-based data like the example below:
# Create Likert dataset
df <- data.frame(q1 = c("Strongly Disagree", "Agree", "Disagree", "Strongly Agree"),
                 q2 = c("Strongly Agree", "Agree", "Strongly Disagree", "Neither Agree nor Disagree"),
                 q3 = c("Neither Agree nor Disagree", "Strongly Agree", "Agree", "Disagree"),
                 q4 = c("M", "F", "M", "F"))
I frequently find myself wanting to go between the character-based Likert data and the numeric-based Likert data depending on the need. Here is what I mean:
If I want to calculate any statistics on these Likert data, such as correlations, I need to convert them to numeric values.
> library(dplyr)
>
> # h/t https://stackoverflow.com/questions/38724850/converting-likert-data-to-numeric-across-a-data-frame
> factorise <- function(x) {
+ case_when(x %in% c("Strongly Disagree") ~ 1,
+ x %in% c("Disagree") ~ 2,
+ x %in% c("Neither Agree nor Disagree") ~ 3,
+ x %in% c("Agree") ~ 4,
+ x %in% c("Strongly Agree") ~ 5)
+ }
>
>
> df2 <- mutate_at(df, c("q1", "q2", "q3"), factorise)
>
> # To calculate statistics on the 1-5 data
> # I need the numeric-based Likert data
> cor(select(df2, -q4))
           q1          q2          q3
q1  1.0000000 -0.10690450 -0.14142136
q2 -0.1069045  1.00000000 -0.07559289
q3 -0.1414214 -0.07559289  1.00000000
However, I often want to return to the character-based data for tables or graphs so that they are clearly labeled (i.e. "Strongly Agree" instead of "5"):
> # But for clearly labeled tables and graphs,
> # I need the character-based Likert data
>
> table(df$q1, df$q4)
                    F M
  Agree             1 0
  Disagree          0 1
  Strongly Agree    1 0
  Strongly Disagree 0 1
It's like I need some way to name or label those numeric values for the Likert variables. Any ideas on this? Thanks very much all.
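To make the "label the numeric values" idea concrete, here is a minimal sketch of one direction that might fit: storing each item as an ordered factor in base R, so the text labels and the 1-5 codes live in the same column. The names `likert_levels`, `items`, `df3`, and `likert_num` are made up for illustration, and this is only one possible approach, not necessarily the best one.

```r
# Hypothetical sketch: keep labels and 1-5 codes together via ordered factors.
# `likert_levels`, `items`, `df3`, and `likert_num` are illustrative names.
likert_levels <- c("Strongly Disagree", "Disagree",
                   "Neither Agree nor Disagree", "Agree", "Strongly Agree")

# Same example data as above, rebuilt so the sketch is self-contained
df <- data.frame(
  q1 = c("Strongly Disagree", "Agree", "Disagree", "Strongly Agree"),
  q2 = c("Strongly Agree", "Agree", "Strongly Disagree", "Neither Agree nor Disagree"),
  q3 = c("Neither Agree nor Disagree", "Strongly Agree", "Agree", "Disagree"),
  q4 = c("M", "F", "M", "F"),
  stringsAsFactors = FALSE
)

# Convert the Likert items to ordered factors with a fixed level order
items <- c("q1", "q2", "q3")
df3 <- df
df3[items] <- lapply(df3[items], factor, levels = likert_levels, ordered = TRUE)

# Numeric view for statistics: as.integer() returns the level codes (1-5)
likert_num <- sapply(df3[items], as.integer)
cor(likert_num)

# Labelled view for tables: rows follow the scale order and
# unused levels still appear (with zero counts)
table(df3$q1, df3$q4)
```

With this layout the character labels are never lost, and `as.integer()` recovers the numeric codes whenever a statistic needs them. One caveat: the codes depend entirely on the order given in `likert_levels`, so that vector has to match the scale.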