First, let's generate some example data:
library(dplyr)
set.seed(1)
my_obs <- expand.grid(time = 1:24,
                      day = 1:365,
                      airport = LETTERS[1:4]) %>%
  bind_cols(rain = round(rnorm(24*365*4, 70, 20), 2),
            temp = round(runif(24*365*4, -5, 35), 2))
head(my_obs)
#   time day airport   rain  temp
# 1    1   1       A  57.47 20.45
# 2    2   1       A  73.67 32.96
# 3    3   1       A  53.29 -3.73
# 4    4   1       A 101.91 28.28
# 5    5   1       A  76.59 -1.39
# 6    6   1       A  53.59 21.84
Now, if you're using dplyr, it is quite easy to obtain the kind of summary that you want using filter(), group_by(), and summarize(). For example:
my_obs %>%
  filter(day == 5) %>%
  group_by(airport) %>%
  summarize(mean_rain = mean(rain),
            min_temp = min(temp))
# A tibble: 4 x 3
#   airport mean_rain min_temp
#   <fct>       <dbl>    <dbl>
# 1 A            68.9    -4.71
# 2 B            65.7    -4.65
# 3 C            81.7    -4.6
# 4 D            73.3    -3.8
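If you want the same summary for every day rather than one fixed day, one option is to group by both day and airport instead of filtering. The following is only a minimal sketch: it assumes dplyr 1.0.0 or later (for the .groups argument), daily_summary is just an illustrative name, and the example data is rebuilt so the block runs on its own.

```r
library(dplyr)

# Same example data as above, rebuilt so this block is self-contained.
set.seed(1)
my_obs <- expand.grid(time = 1:24, day = 1:365, airport = LETTERS[1:4]) %>%
  bind_cols(rain = round(rnorm(24 * 365 * 4, 70, 20), 2),
            temp = round(runif(24 * 365 * 4, -5, 35), 2))

# One row per (day, airport) pair instead of filtering a single day.
# .groups = "drop" returns an ungrouped tibble (assumes dplyr >= 1.0.0).
daily_summary <- my_obs %>%
  group_by(day, airport) %>%
  summarize(mean_rain = mean(rain),
            min_temp = min(temp),
            .groups = "drop")

dim(daily_summary)
# [1] 1460    4
```

The result has 365 * 4 = 1460 rows, so a single day can still be pulled out afterwards with filter(day == 5).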
You can easily wrap that in a function that takes the day and the station (airport) and computes a predefined statistic:
my_func <- function(d, s){
  my_obs %>%
    filter(day == d, airport == s) %>%
    summarize(summ = mean(rain))
}
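As a quick usage sketch, the function can be called once per airport and the results collected in a named vector. The data and the function are repeated here only so the block runs on its own; rain_by_airport is an illustrative name, not part of the original answer.

```r
library(dplyr)

# Same example data as above, rebuilt so this block is self-contained.
set.seed(1)
my_obs <- expand.grid(time = 1:24, day = 1:365, airport = LETTERS[1:4]) %>%
  bind_cols(rain = round(rnorm(24 * 365 * 4, 70, 20), 2),
            temp = round(runif(24 * 365 * 4, -5, 35), 2))

# Mean rain for a given day and airport (reads my_obs from the global env).
my_func <- function(d, s) {
  my_obs %>%
    filter(day == d, airport == s) %>%
    summarize(summ = mean(rain))
}

# Call the function once per airport for day 5; sapply() names the
# result after the airport codes.
rain_by_airport <- sapply(LETTERS[1:4], function(s) my_func(5, s)$summ)
rain_by_airport
```

For many groups, the group_by() approach shown earlier is likely the more direct route; the loop is mainly useful when the function has to be called with arbitrary day and airport combinations.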
The tricky part is using dplyr functions with parameters; for that, you should read this guide. In your case it's quite simple: you just need to "embrace" the attribute variable with {{ }}, whereas the function can be passed as is:
my_func <- function(d, s, a, f){
  my_obs %>%
    filter(day == d, airport == s) %>%
    summarize(summ = f({{a}}))
}
my_func(5, "A", rain, mean)
#    summ
# 1 68.94
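If the attribute arrives as a character string instead of a bare column name (for example from a config file or a Shiny input), embracing does not apply; one alternative is the .data pronoun. This is a hedged sketch rather than part of the original answer: my_func_chr and res_chr are hypothetical names, and the data is rebuilt so the block runs on its own.

```r
library(dplyr)

# Same example data as above, rebuilt so this block is self-contained.
set.seed(1)
my_obs <- expand.grid(time = 1:24, day = 1:365, airport = LETTERS[1:4]) %>%
  bind_cols(rain = round(rnorm(24 * 365 * 4, 70, 20), 2),
            temp = round(runif(24 * 365 * 4, -5, 35), 2))

# Hypothetical variant of my_func: `a` is a string naming the column,
# looked up with the .data pronoun instead of being embraced.
my_func_chr <- function(d, s, a, f) {
  my_obs %>%
    filter(day == d, airport == s) %>%
    summarize(summ = f(.data[[a]]))
}

res_chr <- my_func_chr(5, "A", "rain", mean)
res_chr
```

This should give the same value as my_func(5, "A", rain, mean); the only difference is that the column is named with a quoted string.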