Count value when

I have a table of temperatures with multiple values every month for 21 years.

df <- data.frame(month=rep(c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"), times = 20, each = 3), 
                 year=rep(c("2010", "2011", "2012", "2013", "2014", "2015", "2016", "2017", "2018", "2019", "2020", "2020"), times = 1, each = 3),
                 Temperature=rep(c("4", "5", "6", "9", "10", "12","15", "16", "15", "14", "11", "8"), times = 20, each = 3))

The data frame looks like this:

df

I have done some analyses. I want to count how many times the mean temperature of each month has been over a threshold (> 8) in this timespan.
This is what I am trying, but I have hit a wall.

countdf <- df |>
  group_by(month) |>
  summarise(Temperature = mean(Temperature))

You are reading in your Temperature variable as a character variable, and you need a numeric one.

This should give you a start

dat1 <- data.frame(month=rep(c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"), times = 20, each = 3), 
                 year=rep(c("2010", "2011", "2012", "2013", "2014", "2015", "2016", "2017", "2018", "2019", "2020", "2020"), times = 1, each = 3),
                 Temperature=rep(c("4", "5", "6", "9", "10", "12","15", "16", "15", "14", "11", "8"), times = 20, each = 3))

dat1$Temperature <- as.numeric(dat1$Temperature)

countdf <- dat1 %>% 
  group_by(month) %>% 
  summarise(Temp = mean(Temperature) >8)

countdf

Note that I have changed the name of the data.frame, because df is also the name of an R function (the density of the F distribution).
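
A quick way to see the clash, as a rough sketch: in a session where no object called df has been created, the name resolves to a function in the stats package. The small data frame below is made up purely for illustration.

```r
# df lives in the stats package, which is attached by default
is.function(stats::df)

# It is the density of the F distribution: df(x, df1, df2)
stats::df(1, df1 = 2, df2 = 3)

# Assigning a data frame to df masks the function but does not delete it
df <- data.frame(x = 1:3)  # hypothetical toy data
is.data.frame(df)
is.function(stats::df)
```

Code that reuses the name still runs, but error messages can be confusing if the data frame was never created, because R then finds the function instead.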

Thank you for pointing out the temperature class error.

This calculates the mean temperature over all the years by month and then gives a logical value for whether it is > 8, right? What I am trying to do is calculate the mean for every month of each year and then count how many times each month is above a certain threshold. Note that my dataset is different from this reproducible example, since the months have different means each year.

I think I have a breakthrough

count <- dat1 |>
  group_by(month, year) |>
  summarise(mean_temp = (mean(Temperature)),
            n = sum(mean_temp > 8))

But from each year I get a month. What I need is every month from every year.

Oh, sorry I think I misread

meantemps <- dat1 %>% 
  group_by(month) %>% 
  summarise(Temp = mean(Temperature))

will give you the means.

mm <- meantemps %>%
  count(Temp > 8)

mycount <- mm[2, 2]

Thank you for your input. I think I was not very clear. In the first step I want to calculate the mean for every month of every individual year, not over all years as a whole.
Example:
2010 - Jan - mean value
2010 - Feb - mean value
2010 - Mar - mean value

and so on until Dec, then continuing with the next year:

2011 - Jan - mean value
2011 - Feb - mean value
2011 - Mar - mean value

After that, count how many times the mean value of Jan is > 8, and likewise for each other month.

I hope that makes sense. Thank you

Sorry to take so long to get back to you.

I see what you mean but it cannot be done using your sample data set as each year only has data for one month.

Try

table(dat1$month, dat1$year)

and you will see what I mean.

Yes, it was actually my mistake! It was so simple! I made a faulty reproducible example and first tried my code on it. It took me some time to go back to my original dataset, where the table looks normal. However, I can't figure out how this can be made into a reproducible example. I would appreciate some feedback on that, since it will make future questions better for everyone.

It depends on what you are doing, but I'd think that you could just generate a data set and provide it in dput() format, or perhaps as a .csv file.
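
As a rough illustration of the dput() route, here is a minimal sketch. The data frame small and its values are invented for the example and are not taken from the original data.

```r
# Hypothetical toy data; the values are invented for illustration only
small <- data.frame(year  = c(2010, 2010, 2011, 2011),
                    month = c("Jan", "Feb", "Jan", "Feb"),
                    temp  = c(4, 9, 6, 10))

# dput() prints an R expression that rebuilds the object
dput(small)

# For a large data set, sharing the first few rows is often enough
dput(head(small, 2))

# Round trip: evaluating the printed text recreates the data frame
txt <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
all.equal(small, rebuilt)
```

In a post, the output of dput() is pasted after an assignment such as dat1 <- so that readers can recreate the data with one copy and paste.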

Here is a quick and dirty sample data set that seems to work for me. Of course, since it is created using a pseudo-random process, it's only okay as a test case.

library(tidyverse)

dat1 <- data.frame(year  = sample(2010:2021, 500, replace = TRUE),
                   month = sample(month.abb[1:12], 500, replace = TRUE),
                   temp  = sample(11:18, 500, replace = TRUE))

with(dat1, table(year, month)) # check to see if we have data in most cells

meantemps <- dat1 %>%
  group_by(year, month) %>%
  summarise(mean = mean(temp), n = n())

meantemps
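
To finish the original question, the per-year monthly means can then be counted against a threshold. This is only a sketch on the same kind of simulated data. The fixed seed, the cutoff of 14 and the names threshold, counts, years_above and years_total are my own assumptions for illustration. The simulated temperatures run from 11 to 18, so the original cutoff of 8 would be exceeded in every cell.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the simulated data are repeatable
dat1 <- data.frame(year  = sample(2010:2021, 500, replace = TRUE),
                   month = sample(month.abb, 500, replace = TRUE),
                   temp  = sample(11:18, 500, replace = TRUE))

threshold <- 14  # hypothetical cutoff chosen to suit the simulated range

# Step 1: one mean per year-month combination
meantemps <- dat1 %>%
  group_by(year, month) %>%
  summarise(mean = mean(temp), .groups = "drop")

# Step 2: for each month, count the years whose mean exceeds the threshold
counts <- meantemps %>%
  group_by(month) %>%
  summarise(years_above = sum(mean > threshold),
            years_total = n())

counts
```

Because month is a character column, the result is sorted alphabetically. Converting it with factor(month, levels = month.abb) before grouping should give calendar order instead.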
