Calculate the average daily and monthly

Hello, I have this set of data:
date time sediments (g/mL)
01/01/2002 0:00 38.89
01/01/2002 1:00 39.54
01/01/2002 2:00 41.63
...
31/12/2018 23:00 101.22

I would like to calculate the average of the sediments daily and monthly for each year. Can you please gide me through it?

Say, your data would be in a tibble named data and your times are in a column called time:

data %>%
  mutate(time = floor_date(time, "day") %>%
  count(time)

data %>%
  mutate(time = floor_date(time, "month") %>%
  count(time)

floor_date is from the lubridate package.

Thank you for your answer. From this how do I get the daily and monthy average for sediments?

Can you provide a reproducible example?

I crated a reprex with the data you provided:

library(dplyr)
library(lubridate)

data <- tibble(time = c(dmy_hm("01/01/2002 0:00"), dmy_hm("01/01/2002 1:00"),
                      dmy_hm("01/01/2002 2:00"),dmy_hm("31/12/2018 23:00")),
             sediments =  c(38.89, 39.54,  41.63, 101.22))


data %>%
  mutate(day = floor_date(time, "day")) %>%
           group_by(day) %>%
           summarize(avg = mean(sediments))
#> # A tibble: 2 x 2
#>   day                   avg
#>   <dttm>              <dbl>
#> 1 2002-01-01 00:00:00  40.0
#> 2 2018-12-31 00:00:00 101.

Is this what you wanted to achieve? I missed the second column in the data your provided in my first reply.

1 Like

Hi, this is a more accurate example of my data:


...
it goes until 2018 (sediment data collected hourly in a watershed) and the format of my date and time is: day/month/year and hour:minute

When I import this data from excel the date and time are wrong, so I needed to correct that so:

library("lubridate")
library("dplyr")
library("readxl")

fR$date <- as.Date(fR$date) # fR is the name of my data file
head(fR)

fR$sed <- as.numeric(fR$sed)
head(fR)

fR$time <- gsub(x=fR$time, pattern = "1899-12-31", replacement = "", fixed = T)
head(fR)

Then, in order to calculate the daily average for sediments:

fR %>%
mutate(day = floor_date(time, "day")) %>%
group_by(day) %>%
summarize(avg = mean(sed))

But I recieved an error message:

Error in as.POSIXlt.character(x) :
character string is not in a standard unambiguous format
Many thanks

That is not data, that is a screenshot and is not very useful.

Can you please share a small part of the data set in a copy-paste friendly format?

In case you don't know how to do it, there are many options, which include:

  1. If you have stored the data set in some R object, dput function is very handy.

  2. In case the data set is in a spreadsheet, check out the datapasta package. Take a look at this link.

Sorry, I am very new with R.

fR <- data.frame(
stringsAsFactors = FALSE,
date = c("01/01/02",
"01/01/02","01/01/02","01/01/02","01/01/02","01/01/02",
"01/01/02","01/01/02","01/01/02","01/01/02",
"01/01/02","01/01/02","01/01/02","01/01/02",
"01/01/02","01/01/02","01/01/02","01/01/02",
"01/01/02","01/01/02","01/01/02","01/01/02","01/01/02",
"01/01/02","02/01/02","02/01/02","02/01/02",
"02/01/02","02/01/02","02/01/02","02/01/02",
"02/01/02","02/01/02","02/01/02","02/01/02",
"02/01/02","02/01/02","02/01/02","02/01/02","02/01/02",
"02/01/02","02/01/02","02/01/02","02/01/02",
"02/01/02","02/01/02","02/01/02","02/01/02"),
time = c("0:00:00",
"1:00:00","2:00:00","3:00:00","4:00:00","5:00:00",
"6:00:00","7:00:00","8:00:00","9:00:00","10:00:00",
"11:00:00","12:00:00","13:00:00","14:00:00",
"15:00:00","16:00:00","17:00:00","18:00:00",
"19:00:00","20:00:00","21:00:00","22:00:00",
"23:00:00","0:00:00","1:00:00","2:00:00","3:00:00",
"4:00:00","5:00:00","6:00:00","7:00:00","8:00:00",
"9:00:00","10:00:00","11:00:00","12:00:00",
"13:00:00","14:00:00","15:00:00","16:00:00",
"17:00:00","18:00:00","19:00:00","20:00:00",
"21:00:00","22:00:00","23:00:00"),
sed = c(38.89,39.54,41.63,
49.54,56.29,61.49,65.33,68.91,69.25,67.5,
63.89,61.48,60.65,60.77,61.07,61.42,61.89,
62.54,63.28,64.36,65.59,62.21,53.01,50.74,50.83,
50.91,50.38,50.47,52.45,56.76,62.6,68.14,
70.98,74.22,74.52,72.92,67.9,65.07,82.79,
80.05,66.59,57.28,51.49,47.64,44.9,42.72,42.35,
42.45)
)

your date and time are characters, but you need proper date time values. With your provided data, this command will correct the values

data <- fR %>% 
  unite("datetime", date, time, sep = " ") %>% 
  mutate(datetime = dmy_hms(datetime))

data %>% 
  mutate(day = floor_date(datetime, "day")) %>%
  group_by(day) %>%
  summarize(avg = mean(sed))

You should read some pages in the r4ds book written by Hadley Wickham. (https://r4ds.had.co.nz/)

1 Like

unite comes from which package?

tidyr which is part of the tidyverse

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.