How to transpose dataframe and calculate percent of total for Likert data

Andrzej · January 19, 2022, 8:31pm

Hi ALL,
I have got a following dataframe:

my_temp <- structure(list(DES1 = structure(c(
  0, 2, 2, 0, 1, 3, 1, 1, 4,
  3
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES2 = structure(c(
  2,
  1, 3, 2, 1, 2, 1, 4, 1, 3
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES3 = structure(c(
  0,
  1, 0, 1, 1, 2, 4, 0, 4, 1
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES4 = structure(c(
  4,
  3, 0, 1, 2, 2, 2, 0, 3, 3
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES5 = structure(c(
  3,
  1, 2, 4, 2, 2, 4, 4, 4, 4
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES6 = structure(c(
  3,
  3, 0, 2, 1, 1, 3, 1, 2, 3
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES7 = structure(c(
  2,
  2, 2, 1, 1, 2, 2, 3, 3, 1
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES8 = structure(c(
  0,
  1, 0, 0, 0, 0, 1, 0, 1, 1
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES9 = structure(c(
  0,
  1, 0, 0, 0, 1, 1, 0, 4, 2
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES10 = structure(c(
  4,
  1, 2, 2, 1, 2, 1, 1, 2, 3
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double"))), row.names = c(
  NA,
  -10L
), class = c("tbl_df", "tbl", "data.frame"))

My desired example of result would be that:

obraz

Here in this table above everything (columns) sum up to 100% in rowwise way, as arrow shows.
I think I should somehow use pivot_longer maybe ?

I want to do it, in order to prepare a graph like this:

I can do it manually and slowly in SPSS and Excel. Here, my example data are imported form SPSS with labels coded as follows:

obraz

Any help would be greatly appreciated as I want fully transit from SPSS and Excel one day, and this is how I try to learn R.

arthur.t · January 19, 2022, 9:12pm

library(tidyverse)

my_temp <- structure(list(DES1 = structure(c(
  0, 2, 2, 0, 1, 3, 1, 1, 4,
  3
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES2 = structure(c(
  2,
  1, 3, 2, 1, 2, 1, 4, 1, 3
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES3 = structure(c(
  0,
  1, 0, 1, 1, 2, 4, 0, 4, 1
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES4 = structure(c(
  4,
  3, 0, 1, 2, 2, 2, 0, 3, 3
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES5 = structure(c(
  3,
  1, 2, 4, 2, 2, 4, 4, 4, 4
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES6 = structure(c(
  3,
  3, 0, 2, 1, 1, 3, 1, 2, 3
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES7 = structure(c(
  2,
  2, 2, 1, 1, 2, 2, 3, 3, 1
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES8 = structure(c(
  0,
  1, 0, 0, 0, 0, 1, 0, 1, 1
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES9 = structure(c(
  0,
  1, 0, 0, 0, 1, 1, 0, 4, 2
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double")), DES10 = structure(c(
  4,
  1, 2, 2, 1, 2, 1, 1, 2, 3
), format.spss = "F1.0", display_width = 12L, labels = c(
  `not applicable` = 0,
  `no stress` = 1, `low stress` = 2, `moderate stress` = 3, `very high stress` = 4
), class = c("haven_labelled", "vctrs_vctr", "double"))), row.names = c(
  NA,
  -10L
), class = c("tbl_df", "tbl", "data.frame"))


counts <- my_temp %>% 
  pivot_longer(everything()) %>% 
  count(name, value) %>% 
  # calculate fractions
  group_by(name) %>% 
  mutate(fraction = n/sum(n)) %>%
  ungroup() %>%
  # manage categories and order of them in the plot
  mutate(
    category = haven::as_factor(value),
    DE = as.numeric(str_remove(name, "DES")),
    name = reorder(name, -DE)
  ) %>%
  ungroup()

print(counts)
#> # A tibble: 39 x 6
#>    name                 value     n fraction category            DE
#>    <fct>            <dbl+lbl> <int>    <dbl> <fct>            <dbl>
#>  1 DES1  0 [not applicable]       2      0.2 not applicable       1
#>  2 DES1  1 [no stress]            3      0.3 no stress            1
#>  3 DES1  2 [low stress]           2      0.2 low stress           1
#>  4 DES1  3 [moderate stress]      2      0.2 moderate stress      1
#>  5 DES1  4 [very high stress]     1      0.1 very high stress     1
#>  6 DES10 1 [no stress]            4      0.4 no stress           10
#>  7 DES10 2 [low stress]           4      0.4 low stress          10
#>  8 DES10 3 [moderate stress]      1      0.1 moderate stress     10
#>  9 DES10 4 [very high stress]     1      0.1 very high stress    10
#> 10 DES2  1 [no stress]            4      0.4 no stress            2
#> # ... with 29 more rows

counts %>%
  ggplot() + 
  aes(name, fraction, fill = category) + 
  geom_bar(stat = "identity") +
  coord_flip()

^{Created on 2022-01-19 by the reprex package (v2.0.1)}

Andrzej · January 19, 2022, 9:20pm

This is absolutely brilliant, thank you.
Would it be possible to add counts and percentages on these bars ?

williaml · January 19, 2022, 9:37pm

Borrowing from the code from @arthur.t:

counts %>%
  ggplot() + 
  aes(name, fraction, fill = category) + 
  geom_bar(stat = "identity") +
  coord_flip() +
  geom_text(aes(label = fraction), size = 3, hjust = 1.5, vjust = 1, position = "stack")

Andrzej · January 19, 2022, 9:44pm

Thank you williaml very much indeed.

One more question if I may, how to achieve this shape of data below with percentages:

obraz

williaml · January 19, 2022, 9:56pm

I just had a quick look, but is this it?

counts %>% 
  mutate(category = factor(category, levels = c("no stress", "low stress", "moderate stress", "very high stress", "not applicable"))) %>% 
  arrange(desc(name), category) %>% 
  pivot_wider(name, names_from = category, values_from = fraction)

# A tibble: 10 × 6
   name  `no stress` `low stress` `moderate stress` `very high stress` `not applicable`
   <fct>       <dbl>        <dbl>             <dbl>              <dbl>            <dbl>
 1 DES1          0.3          0.2               0.2                0.1              0.2
 2 DES2          0.4          0.3               0.2                0.1             NA  
 3 DES3          0.4          0.1              NA                  0.2              0.3
 4 DES4          0.1          0.3               0.3                0.1              0.2
 5 DES5          0.1          0.3               0.1                0.5             NA  
 6 DES6          0.3          0.2               0.4               NA                0.1
 7 DES7          0.3          0.5               0.2               NA               NA  
 8 DES8          0.4         NA                NA                 NA                0.6
 9 DES9          0.3          0.1              NA                  0.1              0.5
10 DES10         0.4          0.4               0.1                0.1             NA

Andrzej · January 19, 2022, 10:19pm

Yes, it is. I am very grateful, thank you.
best

system · January 26, 2022, 10:19pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.