Hi,
I stumbled upon this data:
https://gist.githubusercontent.com/reidhin/353aabcb20eaa1b5fd64b1439ac08364/raw/2c16679ac8b24e345f8ec266ba74e4cf2d91f8e3/data.csv
and it was transformed into this shape:
https://gist.githubusercontent.com/reidhin/80f1f7e04eff5d7b9e192d49084ffe9e/raw/cfff68faff96afaf75c0605ab981bfc255cc3230/quest.csv
This is described here:
https://medium.com/orikami-blog/behind-the-screens-likert-scale-visualization-368557ad72d1
I am especially interested in this part:
"The downloaded raw data is stored in the so-called ‘long’ data format: the response of each respondent is listed on a separate row. The questions are labelled Q1 to Q20 and the table also stores a total score, gender, age, and the time needed to complete the questionnaire. The first step in working with this data is to convert it into a format that is more convenient to use.
For example, the raw data lists gender as ‘1’ or ‘2’, which can be confusing to work with. We use the codebook published with the dataset to translate integers into the strings ‘male’ and ‘female’, which are easier for humans to understand.
After refactoring some more numeric variables, I have massaged the data table into a more suitable form by removing odd values, aggregating over respondents, and pivoting part of the table. I have also added an age category to the data by splitting the respondents in four groups of approximately equal size. This process results in the table below."
Unfortunately, the author didn't show how he carried out those steps to arrive at the second table linked above.
So far I have tried pivot_longer(), but I have realised that it is more complicated than it first looked.
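To make the question concrete, here is a minimal sketch of how I read the steps in the quoted passage. It runs on made-up toy data, not the real file. The column names (Q1, Q2, gender, age), the 1 = male / 2 = female mapping, the meaning of "odd values", and the layout of the final table are all my assumptions, so this may well differ from what the author actually did.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the raw data: one row per respondent.
# Column names and value ranges are assumptions, not the real dataset's.
set.seed(1)
raw <- tibble(
  Q1     = sample(1:5, 40, replace = TRUE),
  Q2     = sample(1:5, 40, replace = TRUE),
  gender = sample(1:2, 40, replace = TRUE),
  age    = sample(18:70, 40, replace = TRUE)
)

quest <- raw %>%
  # "Removing odd values": here assumed to mean gender codes other than 1 or 2
  filter(gender %in% c(1, 2)) %>%
  mutate(
    # Translate the integer code into a label (mapping assumed from the codebook)
    gender  = if_else(gender == 1, "male", "female"),
    # Split respondents into four age groups of roughly equal size
    age_cat = ntile(age, 4)
  ) %>%
  # One row per respondent and question
  pivot_longer(cols = starts_with("Q"),
               names_to = "question", values_to = "answer") %>%
  # Aggregate over respondents: count each answer per question and group
  count(question, gender, age_cat, answer) %>%
  # Pivot part of the table: one column per answer level
  pivot_wider(names_from = answer, values_from = n,
              values_fill = 0L, names_sort = TRUE)

print(quest)
```

This gives one row per question, gender and age group, with a count column for each answer level. I am not sure that this is the shape or the grouping used in the article.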
Any help with that would be greatly appreciated, thank you.