Transforming daily data in wide format to long

dplyr
tidyverse

#1

Hi all,

I have a meteorological data for 50 years and the format of one-year data is as below:

And I want to align all the three variables in long format as shown below as an example:

Any kind of suggestion will be helpful and highly appreciated.

Regards
John


#2

Hi,

Can you please take couple of rows in your dataset and convert them into reprex?

Second, take a look at readxl package. Since you have composite headers (each month is one header and then each variable is another), you first need to read it in properly so that it will make it easier to work with.

Once this is done, converting your initial table to the format you want is straightforward. It'll be a combination of gather, separate and spread from tidyr package.

Very similar example from that package with solution is available as a demo here - https://github.com/tidyverse/tidyr/blob/master/demo/so-9684671.R.


#3

Hi,
Thanks for your suggestions. Here is my sample dataset as generated using reprex.

set.seed(10)
dat <- data.frame(
       Max.1 = runif(10),
       Min.1 = runif(10),
       RF.1 = runif(10),
       Max.2 = runif(10),
       Min.2 = runif(10),
       RF.2 = runif(10),
       Max.3 = runif(10),
       Min.3 = runif(10),
       RF.3 = runif(10),
       Max.4 = runif(10),
       Min.4 = runif(10),
       RF.4= runif(10),
       Max.5 = runif(10),
       Min.5 = runif(10),
       RF.5 = runif(10),
       Max.6 = runif(10),
       Min.6 = runif(10),
       RF.6 = runif(10),
       Max.7 = runif(10),
       Min.7 = runif(10),
       RF.7 = runif(10),
       Max.8 = runif(10),
       Min.8 = runif(10),
       RF.8 = runif(10),
       Max.9 = runif(10),
       Min.9 = runif(10),
       RF.9 = runif(10),
       Max.10 = runif(10),
       Min.10 = runif(10),
       RF.10 = runif(10),
       Max.11 = runif(10),
       Min.11 = runif(10),
       RF.11 = runif(10),
       Max.12 = runif(10),
       Min.12 = runif(10),
       RF.12 = runif(10))

Regards
john


#4

You can use the gather/spread functions.

See:

http://garrettgman.github.io/tidying/


#5

Hi,
Thanks for your suggestion! I will try it and post my response accordingly.
I tried to transform my data from wide format to long using reshape package using the following code.

 reshape(dat, 
             varying=c(Max= c(seq(1,34,3)), 
                       Min= c(seq(2,35,3)), 
                       RF= c(seq(3,36,3))), 
             direction="long")

But the problem for me now is how I will incorporate the month aspect in my code as the number of days for each month changes??

Thanks again!

Regards
John