Problem with converting daily stock series into monthly ones

Kuker4e · January 3, 2018, 7:18pm

Hello,

I am a newbie in R and have difficulties converting daily stock series into monthly ones in xts class.

I have a stock data about the ticker IOO downloaded into a .csv (Comma delimited) file from Yahoo!Finance looking like this:

Date	Open	High	Low	Close	Adjusted Close	Volume
12/31/2012	63	63.959999	63	63.959999	56.11607	87900
1/2/2013	64.940002	65.199997	64.620003	65.080002	57.098713	77900
1/3/2013	64.949997	64.980003	64.639999	64.68	56.747776	36000
1/4/2013	64.709999	65.139999	64.639999	65.120003	57.133812	49400
1/7/2013	64.830002	64.919998	64.610001	64.889999	56.932018	102000
1/8/2013	64.650002	64.769997	64.379997	64.580002	56.660034	31600

I have written the following in R in order to read it, convert it to xts and convert daily to monthly:

library(tseries)
library(xts)
library(PerformanceAnalytics)
IOO <- read.csv(file = "IOO.csv", header = TRUE, sep = ",")
IOO <- subset(IOO[, c(1,2,3,4,6)])
colnames(IOO) <- c("Date", "Open", "High", "Low", "Close")
IOO[,"Date"] <- as.Date(IOO[,"Date"], format = "%m / %d / %Y")
IOO <- as.xts(IOO, order.by = as.Date(rownames(IOO), "%Y-%m-%d"), dateFormat = "POSIXct", frequency = NULL, .RECLASS = FALSE)
IOO_monthly <- to.monthly(IOO, indexAt='yearmon', drop.time = TRUE, name = NULL)

Error in to.period(x, "months", indexAt = indexAt, name = name, ...) : 
  unsupported type
IOO_monthly <- to.period(IOO, period = "months", indexAt='yearmon', name = NULL)
Error in to.period(IOO, period = "months", indexAt = "yearmon", name = NULL) : 
  unsupported type
IOO_monthly <- to.period(IOO, period = "months", indexAt= NULL, name = NULL)
Error in to.period(IOO, period = "months", indexAt = NULL, name = NULL) : 
  unsupported type

I have tried many other combinations of arguments in to.period and to.monthly, but it did not work out.

Thank you in advance for your help.

davis · January 3, 2018, 10:39pm

It's difficult to reproduce your problem without access to your csv, but I can give it a shot by just pulling data using quantmod::getSymbols(). I'll also shamelessly show off a way to do this with a package I've been working on for Business Science, tibbletime (version 0.1.0).

The xts code below works immediately, so I would guess that the error you are getting may have to do with whatever ends up inside your IOO object. Again, it would help to have at least a subset of your csv file.

library(xts)
library(quantmod)

IOO <- getSymbols("IOO", auto.assign = FALSE, from = "2012-12-31")
to.monthly(IOO)
#>          IOO.Open IOO.High IOO.Low IOO.Close IOO.Volume IOO.Adjusted
#> Dec 2012    63.00    63.96   63.00     63.96      87900     56.11607
#> Jan 2013    64.94    67.81   64.38     67.23    4730600     58.98505
#> Feb 2013    67.58    67.76   65.33     66.54    3736700     58.37967
#> Mar 2013    66.01    68.60   65.88     67.79     998300     59.47637
#> Apr 2013    67.72    70.39   66.89     70.23    1273200     61.61713
#> May 2013    70.09    73.32   69.64     70.67    1374200     62.00316
#> Jun 2013    70.82    71.67   66.43     67.56    1862400     60.24291
#> Jul 2013    68.05    71.72   67.14     70.87    1452300     63.19442
#> Aug 2013    71.38    71.75   68.74     68.81     796800     61.35752
#> ...

Because xts is finance focused, to.monthly() automatically recalculates the Open/High/Low/Close prices for each month for you. This is different than just taking the first or last row of each month.

Here is another implementation using only tibbles.

First, use tidyquant to get some data.

library(tidyquant)

IOO_tidy <- tq_get("IOO", from = "2012-12-31")

IOO_tidy
#> # A tibble: 1,261 x 7
#>    date        open  high   low close volume adjusted
#>    <date>     <dbl> <dbl> <dbl> <dbl>  <dbl>    <dbl>
#>  1 2012-12-31  63.0  64.0  63.0  64.0  87900     56.1
#>  2 2013-01-02  64.9  65.2  64.6  65.1  77900     57.1
#>  3 2013-01-03  64.9  65.0  64.6  64.7  36000     56.7
#>  4 2013-01-04  64.7  65.1  64.6  65.1  49400     57.1
#>  5 2013-01-07  64.8  64.9  64.6  64.9 102000     56.9
#>  6 2013-01-08  64.7  64.8  64.4  64.6  31600     56.7
#>  7 2013-01-09  64.9  65.1  64.8  65.0  56100     57.0
#>  8 2013-01-10  65.4  65.9  65.3  65.8  81500     57.7
#>  9 2013-01-11  65.9  66.1  65.7  66.1  50200     58.0
#> 10 2013-01-14  65.9  66.1  65.8  66.1  58000     58.0
#> # ... with 1,251 more rows

Next, to do exactly what xts does, convert to a tbl_time object, collapse to monthly and summarise each month:

library(tibbletime)

IOO_tidy %>%
  as_tbl_time(date) %>%
  mutate(date = collapse_index(date, "monthly")) %>%
  group_by(date) %>%
  summarise(
    open   = first(open),
    high   = max(high),
    low    = min(low),
    close  = last(close),
    volume = sum(volume)
  )
#> # A time tibble: 62 x 6
#> # Index: date
#>    date        open  high   low close  volume
#>    <date>     <dbl> <dbl> <dbl> <dbl>   <dbl>
#>  1 2012-12-31  63.0  64.0  63.0  64.0   87900
#>  2 2013-01-31  64.9  67.8  64.4  67.2 4730600
#>  3 2013-02-28  67.6  67.8  65.3  66.5 3736700
#>  4 2013-03-28  66.0  68.6  65.9  67.8  998300
#>  5 2013-04-30  67.7  70.4  66.9  70.2 1273200
#>  6 2013-05-31  70.1  73.3  69.6  70.7 1374200
#>  7 2013-06-28  70.8  71.7  66.4  67.6 1862400
#>  8 2013-07-31  68.1  71.7  67.1  70.9 1452300
#>  9 2013-08-30  71.4  71.8  68.7  68.8  796800
#> 10 2013-09-30  69.6  74.2  69.2  71.9  956800
#> # ... with 52 more rows

Alternatively, if you just wanted to turn it into a monthly series by taking the last row in each month you could do (not a true OHLC calculation):

IOO_tidy %>%
  as_tbl_time(date) %>%
  as_period("monthly", side = "end")
#> # A time tibble: 62 x 7
#> # Index: date
#>    date        open  high   low close volume adjusted
#>    <date>     <dbl> <dbl> <dbl> <dbl>  <dbl>    <dbl>
#>  1 2012-12-31  63.0  64.0  63.0  64.0  87900     56.1
#>  2 2013-01-31  67.3  67.5  67.2  67.2  74000     59.0
#>  3 2013-02-28  66.7  67.0  66.5  66.5  99300     58.4
#>  4 2013-03-28  67.7  68.0  67.6  67.8  40700     59.5
#>  5 2013-04-30  70.2  70.3  69.9  70.2  40900     61.6
#>  6 2013-05-31  71.5  71.6  70.7  70.7  67700     62.0
#>  7 2013-06-28  67.9  68.0  67.5  67.6  97200     60.2
#>  8 2013-07-31  70.8  71.3  70.8  70.9  54000     63.2
#>  9 2013-08-30  69.3  69.3  68.7  68.8  19900     61.4
#> 10 2013-09-30  71.8  72.2  71.8  71.9  43700     64.1
#> # ... with 52 more rows

Kuker4e · January 4, 2018, 4:04pm

Thank you very much, Davis. It works perfectly.
The errors in my codes were most probably due to an old version of R that I used.

Just to ask you which is the data source of the price quotes loaded by 'quantmod' and is that source reliable for most of the international stocks?

Also, may I ask for one more general advise? I want to load in R an investment universe of around 150 international stocks. From them, to find the five best-performing portfolios (everyone containing 20 stocks) in terms of return for the period 2013-2017 under the constraints: a concrete range for the individual weights, a concrete standard deviation of the portfolio, long only, full investment. Is it possible to do this reliably and what approach would you recommend? For example, using for loop or some apply function? I am asking just for a short advice in which direction to head. Thank you.

davis · January 5, 2018, 4:49pm

The source for quantmod in the way I used it is Yahoo Finance. This could render your need to import a Yahoo Finance CSV unnecessary! The data is free, so it isn't perfect, and the API changes a lot, but overall it isn't a bad free source and the quantmod devs work hard to keep things running smoothly.

For your portfolio optimization question, I suggest using PortfolioAnalytics. The package is very smartly designed in that you build up a "portfolio specification" object piece by piece, adding constraints (exactly like the ones you desire) and risk objectives as you go. When you have a full "spec" object, you call optimize.portfolio() to get the optimal portfolio based on your objectives/constraints and data. I've used it before to create an Efficient Frontier, and was able to combine it with the future package + purrr to optimize portfolios in parallel. It can even work with data.frames (although it doesn't play nicely with tibbles directly so there is a bit of overhead if you work in the tidyverse).

Here is a link to the PortfolioAnalytics vignette that should contain everything you need to know to get started. I would read it all the way through to really understand how flexible it is!

Perhaps I'll dig up that Rmd for the Efficient Frontier and turn it into a blog post. Might be useful.

Kuker4e · January 5, 2018, 6:12pm

Thank you again Davis.

I see a possible working plan as follows:

Universe of 150 stocks;
All possible combinations of portfolios consisiting of 20 stocks under the constraints:
Sd of each portfolio = 15 % for example
Individual weight of each stock in each portfolio >= 1 % & <= 10 %
The five portfolios from the above with the highest return

I will start to work on the task soon using PortfolioAnalytics and will ask for help again if I am stuck.