Problem with converting daily stock series into monthly ones


#1

Hello,

I am a newbie in R and have difficulties converting daily stock series into monthly ones in xts class.

I have a stock data about the ticker IOO downloaded into a .csv (Comma delimited) file from Yahoo!Finance looking like this:

Date Open High Low Close Adjusted Close Volume
12/31/2012 63 63.959999 63 63.959999 56.11607 87900
1/2/2013 64.940002 65.199997 64.620003 65.080002 57.098713 77900
1/3/2013 64.949997 64.980003 64.639999 64.68 56.747776 36000
1/4/2013 64.709999 65.139999 64.639999 65.120003 57.133812 49400
1/7/2013 64.830002 64.919998 64.610001 64.889999 56.932018 102000
1/8/2013 64.650002 64.769997 64.379997 64.580002 56.660034 31600

I have written the following in R in order to read it, convert it to xts and convert daily to monthly:

library(tseries)
library(xts)
library(PerformanceAnalytics)
IOO <- read.csv(file = "IOO.csv", header = TRUE, sep = ",")
IOO <- subset(IOO[, c(1,2,3,4,6)])
colnames(IOO) <- c("Date", "Open", "High", "Low", "Close")
IOO[,"Date"] <- as.Date(IOO[,"Date"], format = "%m / %d / %Y")
IOO <- as.xts(IOO, order.by = as.Date(rownames(IOO), "%Y-%m-%d"), dateFormat = "POSIXct", frequency = NULL, .RECLASS = FALSE)
IOO_monthly <- to.monthly(IOO, indexAt='yearmon', drop.time = TRUE, name = NULL)
Error in to.period(x, "months", indexAt = indexAt, name = name, ...) : 
  unsupported type
IOO_monthly <- to.period(IOO, period = "months", indexAt='yearmon', name = NULL)
Error in to.period(IOO, period = "months", indexAt = "yearmon", name = NULL) : 
  unsupported type
IOO_monthly <- to.period(IOO, period = "months", indexAt= NULL, name = NULL)
Error in to.period(IOO, period = "months", indexAt = NULL, name = NULL) : 
  unsupported type

I have tried many other combinations of arguments in to.period and to.monthly, but it did not work out.

Thank you in advance for your help.


#2

It’s difficult to reproduce your problem without access to your csv, but I can give it a shot by just pulling data using quantmod::getSymbols(). I’ll also shamelessly show off a way to do this with a package I’ve been working on for Business Science, tibbletime (version 0.1.0).

The xts code below works immediately, so I would guess that the error you are getting may have to do with whatever ends up inside your IOO object. Again, it would help to have at least a subset of your csv file.

library(xts)
library(quantmod)

IOO <- getSymbols("IOO", auto.assign = FALSE, from = "2012-12-31")
to.monthly(IOO)
#>          IOO.Open IOO.High IOO.Low IOO.Close IOO.Volume IOO.Adjusted
#> Dec 2012    63.00    63.96   63.00     63.96      87900     56.11607
#> Jan 2013    64.94    67.81   64.38     67.23    4730600     58.98505
#> Feb 2013    67.58    67.76   65.33     66.54    3736700     58.37967
#> Mar 2013    66.01    68.60   65.88     67.79     998300     59.47637
#> Apr 2013    67.72    70.39   66.89     70.23    1273200     61.61713
#> May 2013    70.09    73.32   69.64     70.67    1374200     62.00316
#> Jun 2013    70.82    71.67   66.43     67.56    1862400     60.24291
#> Jul 2013    68.05    71.72   67.14     70.87    1452300     63.19442
#> Aug 2013    71.38    71.75   68.74     68.81     796800     61.35752
#> ...

Because xts is finance focused, to.monthly() automatically recalculates the Open/High/Low/Close prices for each month for you. This is different than just taking the first or last row of each month.

Here is another implementation using only tibbles.

First, use tidyquant to get some data.

library(tidyquant)

IOO_tidy <- tq_get("IOO", from = "2012-12-31")

IOO_tidy
#> # A tibble: 1,261 x 7
#>    date        open  high   low close volume adjusted
#>    <date>     <dbl> <dbl> <dbl> <dbl>  <dbl>    <dbl>
#>  1 2012-12-31  63.0  64.0  63.0  64.0  87900     56.1
#>  2 2013-01-02  64.9  65.2  64.6  65.1  77900     57.1
#>  3 2013-01-03  64.9  65.0  64.6  64.7  36000     56.7
#>  4 2013-01-04  64.7  65.1  64.6  65.1  49400     57.1
#>  5 2013-01-07  64.8  64.9  64.6  64.9 102000     56.9
#>  6 2013-01-08  64.7  64.8  64.4  64.6  31600     56.7
#>  7 2013-01-09  64.9  65.1  64.8  65.0  56100     57.0
#>  8 2013-01-10  65.4  65.9  65.3  65.8  81500     57.7
#>  9 2013-01-11  65.9  66.1  65.7  66.1  50200     58.0
#> 10 2013-01-14  65.9  66.1  65.8  66.1  58000     58.0
#> # ... with 1,251 more rows

Next, to do exactly what xts does, convert to a tbl_time object, collapse to monthly and summarise each month:

library(tibbletime)

IOO_tidy %>%
  as_tbl_time(date) %>%
  mutate(date = collapse_index(date, "monthly")) %>%
  group_by(date) %>%
  summarise(
    open   = first(open),
    high   = max(high),
    low    = min(low),
    close  = last(close),
    volume = sum(volume)
  )
#> # A time tibble: 62 x 6
#> # Index: date
#>    date        open  high   low close  volume
#>    <date>     <dbl> <dbl> <dbl> <dbl>   <dbl>
#>  1 2012-12-31  63.0  64.0  63.0  64.0   87900
#>  2 2013-01-31  64.9  67.8  64.4  67.2 4730600
#>  3 2013-02-28  67.6  67.8  65.3  66.5 3736700
#>  4 2013-03-28  66.0  68.6  65.9  67.8  998300
#>  5 2013-04-30  67.7  70.4  66.9  70.2 1273200
#>  6 2013-05-31  70.1  73.3  69.6  70.7 1374200
#>  7 2013-06-28  70.8  71.7  66.4  67.6 1862400
#>  8 2013-07-31  68.1  71.7  67.1  70.9 1452300
#>  9 2013-08-30  71.4  71.8  68.7  68.8  796800
#> 10 2013-09-30  69.6  74.2  69.2  71.9  956800
#> # ... with 52 more rows

Alternatively, if you just wanted to turn it into a monthly series by taking the last row in each month you could do (not a true OHLC calculation):

IOO_tidy %>%
  as_tbl_time(date) %>%
  as_period("monthly", side = "end")
#> # A time tibble: 62 x 7
#> # Index: date
#>    date        open  high   low close volume adjusted
#>    <date>     <dbl> <dbl> <dbl> <dbl>  <dbl>    <dbl>
#>  1 2012-12-31  63.0  64.0  63.0  64.0  87900     56.1
#>  2 2013-01-31  67.3  67.5  67.2  67.2  74000     59.0
#>  3 2013-02-28  66.7  67.0  66.5  66.5  99300     58.4
#>  4 2013-03-28  67.7  68.0  67.6  67.8  40700     59.5
#>  5 2013-04-30  70.2  70.3  69.9  70.2  40900     61.6
#>  6 2013-05-31  71.5  71.6  70.7  70.7  67700     62.0
#>  7 2013-06-28  67.9  68.0  67.5  67.6  97200     60.2
#>  8 2013-07-31  70.8  71.3  70.8  70.9  54000     63.2
#>  9 2013-08-30  69.3  69.3  68.7  68.8  19900     61.4
#> 10 2013-09-30  71.8  72.2  71.8  71.9  43700     64.1
#> # ... with 52 more rows

#3

Thank you very much, Davis. It works perfectly.
The errors in my codes were most probably due to an old version of R that I used.

Just to ask you which is the data source of the price quotes loaded by ‘quantmod’ and is that source reliable for most of the international stocks?

Also, may I ask for one more general advise? I want to load in R an investment universe of around 150 international stocks. From them, to find the five best-performing portfolios (everyone containing 20 stocks) in terms of return for the period 2013-2017 under the constraints: a concrete range for the individual weights, a concrete standard deviation of the portfolio, long only, full investment. Is it possible to do this reliably and what approach would you recommend? For example, using for loop or some apply function? I am asking just for a short advice in which direction to head. Thank you.


#4

The source for quantmod in the way I used it is Yahoo Finance. This could render your need to import a Yahoo Finance CSV unnecessary! The data is free, so it isn’t perfect, and the API changes a lot, but overall it isn’t a bad free source and the quantmod devs work hard to keep things running smoothly.

For your portfolio optimization question, I suggest using PortfolioAnalytics. The package is very smartly designed in that you build up a “portfolio specification” object piece by piece, adding constraints (exactly like the ones you desire) and risk objectives as you go. When you have a full “spec” object, you call optimize.portfolio() to get the optimal portfolio based on your objectives/constraints and data. I’ve used it before to create an Efficient Frontier, and was able to combine it with the future package + purrr to optimize portfolios in parallel. It can even work with data.frames (although it doesn’t play nicely with tibbles directly so there is a bit of overhead if you work in the tidyverse).

Here is a link to the PortfolioAnalytics vignette that should contain everything you need to know to get started. I would read it all the way through to really understand how flexible it is!

Perhaps I’ll dig up that Rmd for the Efficient Frontier and turn it into a blog post. Might be useful.


#5

Thank you again Davis.

I see a possible working plan as follows:

  • Universe of 150 stocks;
  • All possible combinations of portfolios consisiting of 20 stocks under the constraints:
  • Sd of each portfolio = 15 % for example
  • Individual weight of each stock in each portfolio >= 1 % & <= 10 %
  • The five portfolios from the above with the highest return

I will start to work on the task soon using PortfolioAnalytics and will ask for help again if I am stuck.