However, currently I not only have consumption as a variable but two other variables as well (Elabel and Hsize). These two variables are static for each house (every house has a certain Elabel and Hsize). My dataset now looks like this:
Date Temp X1 Elabel1 Hsize1 X2 Elabel2 Hsize2 X3 Elabel3 Hsize3 X4 Elabel4 Hsize4 X5 Elabel5 Hsize5 X6
1 2018-01-01 00:00:00 8.0 0.8093377 A 100 0.9413433 C 90 0.8801952 E 75 0.8984059 A 60 0.7658825 B 120 0.7069353
2 2018-01-01 01:00:00 7.9 0.9300447 A 100 1.1047173 C 90 0.8668304 E 75 0.7173619 A 60 0.8714931 B 120 0.7565921
3 2018-01-01 02:00:00 7.8 0.8468523 A 100 0.9813693 C 90 0.9104481 E 75 0.9178342 A 60 0.8925958 B 120 0.6474878
4 2018-01-01 03:00:00 7.7 1.0558602 A 100 1.3127769 C 90 1.1186276 E 75 1.0786968 A 60 0.9416125 B 120 0.8699044
5 2018-01-01 04:00:00 7.3 1.2765728 A 100 1.6924849 C 90 1.4120881 E 75 1.2462317 A 60 1.1781826 B 120 1.1912390
I'm trying to convert this to long format. Below is an example of how I would like it to look, but I can't seem to achieve this. Would any of you be able to help me?
Date Temp house Elabel Hsize consumption
1 2018-01-01 00:00:00 8.0 X1 A 100 0.6472976
2 2018-01-01 01:00:00 7.9 X1 A 100 0.6806634
3 2018-01-01 02:00:00 7.8 X1 A 100 0.7140293
4 2018-01-01 03:00:00 7.7 X1 A 100 0.8208000
5 2018-01-01 04:00:00 7.3 X1 A 100 1.1110829
Could you also post the code you used to produce your table? That way folks can follow along with what you're trying to do. And are you familiar with the reprex() function?
First you'll need to install the reprex package, then, to see it in action, copy (but don't paste) the following code:
# include any packages you need
library(tidyverse)
my_tibble <-
tibble(a = 1:3, b = letters[1:3])
my_tibble
dput(my_tibble)
Immediately afterwards, run the command reprex(), and note the output rendered in the RStudio 'Viewer' tab: all the commands are shown along with their output.
Now, copy and also paste the rendered output to an R file in your 'Source' pane, and add the following command (which you can obtain by modifying the commented output from the dput() command):
my_dup <- structure(list(a = 1:3, b = c("a", "b", "c")), row.names = c(NA,
-3L), class = c("tbl_df", "tbl", "data.frame"))
my_dup
Finally, copy -- but don't paste -- all of the code you placed in the R file, and run reprex() again, then come back here, reply, and paste to see what happens.
Oh, and P.S., @Leon_bakker: If you follow reprex() example, you can use dput() with your dataset, which is a preferable (and secure) way to share your data with others. Then we can help with your original question more easily.
When I use reprex like you suggested it just provides me with the whole table as output, which comes down to 80000 lines :D. However, using pivot_longer, as @kuriwaki suggested, I have been able to get the desired result.
I've added a separator to all value columns and used the code below.
Here, I gather all value columns --> separate them into seperate measures and a houseid and then use pivot_wider to get the desired columns. I then used arrange() to sort on houseid and Date.
data2 <- data %>%
pivot_longer(
Consumption_1:Hsize_10,
names_to = c("measure", "houseid"),
names_sep = "_"
)%>%
pivot_wider(
names_from = measure,
values_from = value
)%>%
arrange(
houseid, Date
)
That can happen! That's why it's a good idea to pare the table down before using dput(), so that data looks a lot like what you showed in your original post, which you could do with:
data_small <- data %>% slice(1:5)
Then you can run dput(data_small) and post the result so people can reproduce your sample data and work with it themselves.