Represent a monthly time series and calculate coefficients

Hello,
I have one year of data (one observation per day) and I want to plot this time series by day of the month.
I created a time series with a frequency of 30.417 using this code:
anneetsmois <- ts(JOURS2022TS$Note.Jour, start=c(2022,01,01), freq=30.417)

I make a season plot by day of the month, but the months have different numbers of days (31 in January, 28 in February...), so there is a lag in my season plot.

ggseasonplot(anneetsmois,season.labels = c(1:30),
continuous = FALSE, polar=TRUE) +
xlab("Jour du mois") +
ylab("Notes")+
ggtitle("Notes par mois")+
scale_color_discrete(name= "Mois", labels=c(1:13))+
theme_clean()


**And I have a lag because the data is shifted. For example, the purple line (month 12) should reach a value of 10 on day 18, but that is not the case here.**

I calculated the coefficients with:
decompose(anneetsmois, type = c("additive"))
and I get these coefficients:
$figure
[1] 0.49194689 0.43902045 1.85471203 0.39057714 -0.71527575 -1.12075474 -0.50680680 -0.45624596
[9] -1.23582197 -0.88142536 -0.61320663 -0.88593390 -1.15866117 -0.88889725 -1.19945873 0.24928566
[17] -0.79335858 -0.82835246 -1.38875081 0.02245764 0.20278144 -0.16085492 0.15732690 0.93005417
[25] 0.96205930 0.53142392 1.41623086 1.40427582 3.86355368 -0.08190087

But I think they are not right because of the difference, the gap between days in the time series.

Can you help me? How can I do this?
The main problem is that the months of the year do not all have the same length.

It sounds like you are trying to create a season plot that shows the daily data for each month, but the different number of days in each month is causing a lag in the plot. One way to address this issue is to aggregate the daily data into monthly data before creating the season plot. You can do this by using the aggregate() function to group the data by month and take the mean or median of the daily data for each month. This will give you a consistent number of data points for each month, which will eliminate the lag in the season plot.
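The aggregation described above could be sketched as follows. This is only a minimal illustration: the data frame `jours` is simulated as a hypothetical stand-in for JOURS2022TS, and the `Date` column is an assumption, since the original data may not have one. Note that this gives one value per month, so the day-of-month detail is lost.

```r
# Hypothetical stand-in for JOURS2022TS: one simulated score per day of 2022.
set.seed(1)
dates <- seq(as.Date("2022-01-01"), as.Date("2022-12-31"), by = "day")
jours <- data.frame(Date = dates,
                    Note.Jour = round(runif(length(dates), 0, 10)))

# Group the daily values by calendar month and take the mean of each group.
jours$Mois <- as.integer(format(jours$Date, "%m"))
mensuel <- aggregate(Note.Jour ~ Mois, data = jours, FUN = mean)

# One value per month, so a frequency of 12 is appropriate here.
notes_mensuelles <- ts(mensuel$Note.Jour, start = c(2022, 1), frequency = 12)
```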

Another way to address this issue is to create a new time series with the same number of observations for each month, but with missing values for the days that don't exist in certain months. You can use the ts() function to create the new time series, and use the na.locf() function to forward fill the missing values. Once you have the new time series, you can use the ggseasonplot() function to create the season plot as you were before.
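One possible way to build such a padded series, as a sketch only: place each observation in a 31 x 12 grid (day of month by month), leave NA where a day does not exist, and read the grid back as a series with frequency 31. The data here is simulated and the names `grille` and `notes_31` are made up for the illustration. The series starts at c(1, 1) so that the cycle number is the month and the position within the cycle is the day of the month.

```r
# Simulated daily scores for 2022 (hypothetical stand-in for the real data).
set.seed(1)
dates <- seq(as.Date("2022-01-01"), as.Date("2022-12-31"), by = "day")
notes <- round(runif(length(dates), 0, 10))

mois <- as.integer(format(dates, "%m"))  # month, 1..12
jour <- as.integer(format(dates, "%d"))  # day of month, 1..31

# 31 x 12 grid: rows are days of the month, columns are months.
# Cells for days that do not exist (e.g. 30 February) stay NA.
grille <- matrix(NA_real_, nrow = 31, ncol = 12)
grille[cbind(jour, mois)] <- notes

# Reading the grid column by column gives month 1 (days 1..31),
# then month 2 (days 1..31), and so on: every "season" has 31 slots.
notes_31 <- ts(as.vector(grille), start = c(1, 1), frequency = 31)
```

With this layout, day 18 of month 12 sits at position 18 of the twelfth cycle, so the days line up across months. The NA slots can be left as gaps or forward-filled as described above; passing `notes_31` to ggseasonplot() is left out of the sketch.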

You can also try to decompose the time series into trend, seasonal and random component, model the trend and seasonality, and remove them from the original time series. Then you can use the residuals as the daily data for each month, and use the ggseasonplot() function.
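If only a day-of-month effect is needed, a simpler technique than decompose() is to average the values by calendar day and subtract the overall mean. This is a different method (no trend is removed), shown here as a rough sketch on simulated data; `coef_jour` and `n_mois` are names invented for the example.

```r
# Simulated daily scores for 2022 (hypothetical stand-in for the real data).
set.seed(1)
dates <- seq(as.Date("2022-01-01"), as.Date("2022-12-31"), by = "day")
notes <- round(runif(length(dates), 0, 10))

jour <- as.integer(format(dates, "%d"))  # day of month, 1..31

# Mean score for each calendar day, expressed as a deviation from the
# overall mean. This does not depend on a fixed series frequency.
coef_jour <- tapply(notes, jour, mean) - mean(notes)

# Number of months contributing to each day: days 29 to 31 are averaged
# over fewer months, so their coefficients are noisier.
n_mois <- table(jour)
```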

"Another way to address this issue is to create a new time series with the same number of observations for each month, but with missing values for the days that don't exist in certain months. You can use the ts() function to create the new time series, and use the na.locf() function to forward fill the missing values. Once you have the new time series, you can use the ggseasonplot() function to create the season plot as you were before."

Thank you, can you explain this more precisely?

Sure, let me break it down for you.

  1. You first need to create a new time series that has the same number of observations for each month. You can do this by using the ts() function. For example, you can create a time series with a frequency of 12 (for 12 months) and a start date of January 2022, like this:

anneetsmois_fixed <- ts(JOURS2022TS$Note.Jour, start=c(2022,01,01), freq=12)
  2. The next step is to fill in the missing values for the days that don't exist in certain months. You can use the na.locf() function from the zoo package to forward fill these missing values: it replaces each missing value with the last non-missing value. For example, you can fill in the missing values of your time series like this:

anneetsmois_fixed <- na.locf(anneetsmois_fixed)
  3. Once you have the new time series with a fixed number of observations for each month, you can use the ggseasonplot() function to create the season plot as before. You can use the same code as before, but replace the original time series with the fixed one:

ggseasonplot(anneetsmois_fixed,season.labels = c(1:30),
continuous = FALSE, polar=TRUE) +
xlab("Jour du mois") +
ylab("Notes")+
ggtitle("Notes par mois")+
scale_color_discrete(name= "Mois", labels=c(1:13))+
theme_clean()

Keep in mind that if your data has missing values for certain months, it is best practice to interpolate them before doing the above steps.

I hope this helps! Let me know if you have any further questions.

This does not work; I get this warning:
In ggseasonplot(anneetsmois_fixed, season.labels = c(1:30), continuous = FALSE, :
Provided season.labels have length 30, but 12 are required. Ignoring season.labels

And my plot looks like this:


I get one value for each month, but I want one for each day of the month (1 to 30 or 31).

My coefficients are wrong because of the lag caused by the differing lengths of the months.