Calculate daily and yearly value.

Hi everybody,

I am studying about SPI (stock price informativeness). One of the variable effects on SPI is Volatility.
I am studying how to calculate variable "Volatility" by R


I calculated mean(i,t), however, I still very confused with n and volatility.
I would appreciate any help. Thank you in advance.
My data is like that.

daily<-data.frame(
 X = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
       15L),
 stockcode = as.factor(c(NA, "AAA", "AAA", "AAA", "AAA", "AAA", "AAM", "AAM",
                         "AAM", "AAM", "AAM", "ABC", "ABC", "ABC", "ABC")),
 date = c(NA, 20161128L, 20161129L, 20161130L, 20171201L, 20171202L,
          20151205L, 20151206L, 20151207L, 20161208L, 20161209L,
          20161212L, 20171213L, 20171214L, 20171215L),
 year = c(NA, 2016L, 2016L, 2016L, 2017L, 2017L, 2015L, 2015L, 2015L,
          2016L, 2016L, 2016L, 2017L, 2017L, 2017L),
 return = c(NA, -0.016861186, 0, 0, -0.041676567, 0, 0.003712511,
            -0.003670366, -0.003683887, 0.003683887, -0.041283496,
            -0.043055756, 0, 0, 0),
 mean = c(NA, -0.00562039533333333, -0.00562039533333333,
          -0.00562039533333333, -0.0208382835, -0.0208382835,
          -0.001213914, -0.001213914, -0.001213914, -0.0187998045, -0.0187998045,
          -0.043055756, 0, 0, 0)

)


yearly<-data.frame(
 stockcode = as.factor(c("AAA", "AAA", "AAM", "AAM", "ABC", "ABC", NA)),
 year = c(2016L, 2017L, 2015L, 2016L, 2016L, 2017L, NA),
 mean_return = c(-0.00562039533333333, -0.0208382835, -0.001213914,
                 -0.0187998045, -0.043055756, 0, NA),
 n = c(NA, NA, NA, NA, NA, NA, NA),
 volatility = c(NA, NA, NA, NA, NA, NA, NA)
)

First off, see the lubridate package to convert your date integers into date objects.

Second, let's parse the volatility equation.

Volatility_{i,t} is the variability of a given stock, i in a given year t. Each year, t has some number of days, n that the markets are open for trading. So let's parse the equation from the inside out:

return_{i,k} means the price change of the stock, i, on a given day, k. From that day you subtract the mean average price change of the stock over all days in the year -- +0.50, -.0.25 ... .

You then square the result of all those subtractions and you have (return_{i,k} - mean_{i,t})^2.

You add those all up \sum\limits_{1}^n to get the differences between all the days and the average for the year.

Next, multiply by \frac{1}{n-1}, which will get you the annual daily average, and finally take the square root and you have the equation you posted.

You will need to do this for each stock in your data frame, so once you have expressed the equation as a function, take a look at the purrr package to apply it to each stock.

1 Like

Hi there,

Using a tidyverse implementation (and lubridate as suggested by @technocrat ), this function can be implemented quite elegantly. This is my result:

library("dplyr")
library("lubridate")

daily<-data.frame(
  X = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L,
        15L),
  stockcode = as.factor(c(NA, "AAA", "AAA", "AAA", "AAA", "AAA", "AAM", "AAM",
                          "AAM", "AAM", "AAM", "ABC", "ABC", "ABC", "ABC")),
  date = c(NA, 20161128L, 20161129L, 20161130L, 20171201L, 20171202L,
           20151205L, 20151206L, 20151207L, 20161208L, 20161209L,
           20161212L, 20171213L, 20171214L, 20171215L),
  return = c(NA, -0.016861186, 0, 0, -0.041676567, 0, 0.003712511,
             -0.003670366, -0.003683887, 0.003683887, -0.041283496,
             -0.043055756, 0, 0, 0)
  
)

#Remove cases with missing data, convert the date to interpretable format and extract year
daily = daily %>% filter(complete.cases(.)) %>% mutate(date = ymd(date), year = year(ymd(date))) %>% 
  select(-return, return)

#Summarize the yearly statistics
yearly = daily %>% group_by(year, stockcode) %>% 
  summarise(nDays = n(), 
            meanReturn = mean(return), 
            volatility = sqrt(sum((return - mean(return))^2) / (n() - 1)))
# A tibble: 6 x 5
# Groups:   year [3]
   year stockcode nDays meanReturn volatility
  <dbl> <fct>     <int>      <dbl>      <dbl>
1  2015 AAM           3   -0.00121    0.00427
2  2016 AAA           3   -0.00562    0.00973
3  2016 AAM           2   -0.0188     0.0318 
4  2016 ABC           1   -0.0431   NaN      
5  2017 AAA           2   -0.0208     0.0295 
6  2017 ABC           3    0          0 
  • Notice that I got rid of all variables in your dummy set that can be extracted from others as to minimize the risk of errors.
  • The select(-return, return) might look weird but is just a trick to move that column to the last position
  • There is one NaN in the result because if there is only one day in a year with data, you divide by 0 according to the formula
  • The trick in implementing the formula so tidy is the combination of the group_by and the summarize.

Hope this helps!
PJ

3 Likes

Hi,
Thank you for your feedback.
Always very clear and details.
Best regards,

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.