Converting table to case form

lucasj_lm · July 17, 2020, 11:53pm

Hi all! I'm trying to convert from table form with frequencies to case form with all individual data points. I was using expand.dft but it seems to just get rid of the frequency column altogether, rather than expanding it. Any ideas?

phil_hummel · July 18, 2020, 4:40am

Can you do a summary() and head() on Frequency just after you load it from the Excel sheet?

lucasj_lm · July 18, 2020, 6:59am

nirgrahamuk · July 18, 2020, 8:18am

What would a frequency of 13.2 mean in this context?
What would it benefit you in your workflow to uncount rather than to work with weighted data?

lucasj_lm · July 18, 2020, 9:33am

Frequency is the number of people (in thousands). So I was hoping to create a dataset that had every person listed individually. I tried making them whole numbers, but unfortunately that didn't work either.

nirgrahamuk · July 18, 2020, 9:57am

I think you need to realise that there's no benefit to doing that.
Suppose I tell you I have a dataset with an entry that has a frequency of two and an average height of 6ft.
Would you unpack that into a two-row table with anything other than 6 in the height fields?
There's no way for you to know or restore that one person's height was 6ft 1in and the other's was 5ft 11in. That information was lost when the data was summarised.
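
To make the point concrete, here is a minimal base R sketch. The `heights` table and its column names are made up for illustration and are not the original poster's data. Expanding by frequency only repeats each summary row, so the expanded rows carry no extra information:

```r
# Hypothetical summarised table: one row per group, not real data.
heights <- data.frame(
  group          = c("g1", "g2"),
  frequency      = c(2L, 3L),
  mean_height_ft = c(6.0, 5.5)
)

# Repeat each row index `frequency` times to get one row per person.
expanded <- heights[rep(seq_len(nrow(heights)), times = heights$frequency), ]
rownames(expanded) <- NULL

# Every person in a group gets the same summarised height back.
expanded
```

The two `g1` rows both show 6.0, because the individual heights were never stored in the table.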

lucasj_lm · July 18, 2020, 1:02pm

These are not averages though - my aim is to plot the % chance someone has lost their job against the % chance they have tertiary education. So if I could expand out each individual, I could try to see whether there is a correlation between the two variables. But at the moment, each industry is weighted equally despite the number of people in that industry, which means the correlation would be distorted.

nirgrahamuk · July 18, 2020, 1:27pm

I believe there are several R packages that offer weighted correlation.
Otherwise, multiply the frequency so that the result is non-decimal, i.e. some number of full rows. Then tidyr's uncount() can be used.
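
A rough sketch of both routes, assuming a table with one row per industry. The `jobs` data frame, its column names, and all values are invented for illustration. Base R's `cov.wt()` is used here as one way to get a weighted correlation; the packages mentioned above are not shown. The factor of 10 assumes frequencies are given to one decimal place, as with 13.2.

```r
library(tidyr)  # provides uncount()

# Hypothetical data: frequency is the number of people in thousands.
jobs <- data.frame(
  industry     = c("A", "B", "C", "D"),
  pct_job_loss = c(12.5, 4.0, 8.3, 20.1),
  pct_tertiary = c(35.0, 62.0, 48.0, 15.0),
  frequency    = c(13.2, 250.7, 96.0, 41.5)
)

# Route 1: weighted correlation on the table as it is.
weighted <- cov.wt(
  jobs[, c("pct_job_loss", "pct_tertiary")],
  wt  = jobs$frequency / sum(jobs$frequency),  # weights normalised to sum to 1
  cor = TRUE
)
r_weighted <- weighted$cor["pct_job_loss", "pct_tertiary"]

# Route 2: scale frequency to whole numbers, then expand to case form.
# 13.2 becomes 132 rows, so each row stands for 100 people.
jobs$n <- as.integer(round(jobs$frequency * 10))
cases  <- uncount(jobs, weights = n)
r_expanded <- cor(cases$pct_job_loss, cases$pct_tertiary)

# The two routes should agree up to floating-point error.
all.equal(r_weighted, r_expanded)
```

Since the expanded rows are exact copies, the weighted route gives the same correlation without building the larger table.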

lucasj_lm · July 19, 2020, 9:28am

Thank you, I'll try that!
