Translate SAS code to R (Roll model from Hasbrouck's book on market microstructure)

I'm not familiar with SAS and am focusing on learning R. I have some SAS code (from Hasbrouck's book on market microstructure). Could someone here be so kind as to help me translate this code into R? This is a SAS macro and I want it to be an R function. Thank you very much!

I originally just tried to run the code directly in SAS, and it worked fine with the included data sets. But when I tried to use my own data sets, I ran into various problems with SAS, so I figured it would be much better to have this program in R, where I'm more comfortable learning.

``````
*___________________________________________________________________________________________________

GRUnivariate(dsIn=, maOrder=, price=)

macro to estimate a univariate generalized Roll model of prices.

Parameters
dsIn        Input dataset
maOrder     Order of moving average estimated
price       name of price variable (default=p)
The price variable is assumed to be in
level (or log) form. The routine computes the first-differences internally.

____________________________________________________________________________________________________;

%macro GRUnivariate(dsIn=, maOrder=5, price=p);
proc arima data=&dsIn;
identify var=&price(1) center nlag=10;
estimate noint p=0 q=&maOrder;
title "Univariate MA analysis of &price Input dataset=&dsIn maOrder=&maOrder";
ods output parameterEstimates=parameterEstimates;
ods output fitStatistics=fitStatistics;
quit;
run;
title "Univariate random-walk analysis";
proc iml;
start main;
use fitStatistics;
read next var {nValue1} into varEpsilon;
print varEpsilon [label="Innovation variance"];
use parameterEstimates;
read all var {estimate} into theta;
theta = -theta;
rn = char(1:&maOrder);
print (t(theta)) [colname=rn label='Thetas'];
sumTheta = 1+sum(theta);
print sumTheta [label="Sum of thetas, including theta(0)=1" f=50.5];
varW = sumTheta##2 * varEpsilon;
print varW [label="Random-walk variance" f=best30.5];
sdW = sqrt(varW);
print sdW [label="Random-walk standard deviation" f=best30.5];

*   Cumulate sums of thetas;
sCoeff = j(1,&maOrder,0);
do i=1 to &maOrder;
do j=i to &maOrder;
sCoeff[i] = sCoeff[i] + theta[j];
end;
end;
print sCoeff [label="Pricing error coefficients" f=12.5];
sVar = sum( sCoeff##2 ) * varEpsilon;
print sVar [label="Pricing error variance (lower bound)" f=50.10];
sSD = sqrt(sVar);
print sSD [label="Pricing error standard deviation (lower bound)" f=50.6];
finish main;
run;
quit;

run;
%mend GRUnivariate;
``````

It looks like the first part of the SAS code is generating an ARIMA model, which you can do in R with the `arima` function. The `arima` function has an `order` argument, which is a vector containing the AR, I, and MA (autoregressive, integration, and moving average) portions of the ARIMA model specification. `arima` also has a `seasonal` argument for specifying the seasonal portion of the model. Run `?arima` to bring up the help file, which has detailed information on using the `arima` function.

It looks like the second part of the code is providing information on the model coefficients and goodness-of-fit statistics. The object returned by the `arima` function will have the model coefficients and some fit statistics as well, but I don't know if it will have the specific output you're looking for. Try fitting an ARIMA model to a built-in data set and you can see what's available. For example, with the built-in `lh` time series data set:

``````
# Create model
x = arima(lh, order = c(3,0,2))

# Print to the console a summary of the model output
x

# Get model coefficients
coef(x)

# Look at structure of model object returned by arima function
str(x)
``````

There are almost certainly additional R functions for generating additional model diagnostics, and you can also run `tsdiag(x)`, which will output some diagnostic plots of the model residuals.
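As a small sketch of what is available on the fitted object, the snippet below pulls out a few fit statistics and runs a Ljung-Box test on the residuals. It uses the built-in `lh` series with an ARMA(1,1) model chosen only for illustration, and the variable names `x` and `lb` are arbitrary.

```r
# Fit a small ARMA(1,1) model to the built-in `lh` series (illustrative choice)
x <- arima(lh, order = c(1, 0, 1))

x$sigma2                  # estimated innovation variance
x$loglik                  # maximized log-likelihood
AIC(x)                    # Akaike information criterion
sqrt(diag(x$var.coef))    # approximate standard errors of the coefficients

# Ljung-Box test on the residuals; fitdf = number of estimated ARMA coefficients
lb <- Box.test(residuals(x), lag = 10, type = "Ljung-Box", fitdf = 2)
lb
```

The innovation variance in `x$sigma2` is probably the closest match to the "Innovation variance" that the SAS code reads from its fit statistics table.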

I'm not familiar with the Roll model and have only basic knowledge of time series analysis, but here and here are an ARIMA tutorial and a free online book about forecasting, respectively, that use R and will hopefully help you get started.


Can you be more specific about the arima model implied by the SAS code? I find it difficult to identify the values of ar, i, ma implied by the SAS code.

I've never used SAS, but looking at the code, the MA order (the `q` of the ARIMA model) is set by `maOrder`, which defaults to 5, and `p`, which normally refers to the AR order, is zero. The help for `proc arima` seems to confirm this. The `I` term (the "integrated" part of the ARIMA acronym) refers to how many times the series is differenced to make it stationary. The `(1)` in `identify var=&price(1)` requests a first difference, which matches the macro's header comment that the routine computes the first differences internally, so that term is 1. So, in R it would be `arima(data, order=c(0,1,5), ...other arguments...)`.
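Putting the pieces together, here is a rough sketch of how the whole macro might look as an R function. The function name `gr_univariate`, its argument names, and the simulated data are all made up for illustration. It assumes that the `(1)` in `identify var=&price(1)` means a first difference and that `center` subtracts the mean of the differenced series. It also assumes the SAS code flips the sign of the MA estimates because SAS writes the MA polynomial with minus signs, whereas R's `arima` uses plus signs, so no flip is applied here. Estimates will probably not match SAS exactly, since the default estimation methods differ.

```r
# Hypothetical R counterpart of the SAS macro GRUnivariate (names are illustrative)
gr_univariate <- function(price, ma_order = 5) {
  # First-difference and center the series (assumed meaning of `var=&price(1) center`)
  dp <- diff(as.numeric(price))
  dp <- dp - mean(dp)

  # Pure MA(q) with no intercept (assumed meaning of `estimate noint p=0 q=&maOrder`)
  fit <- arima(dp, order = c(0, 0, ma_order), include.mean = FALSE)

  # R reports MA coefficients with plus signs, so no sign flip is applied
  theta <- unname(coef(fit))
  var_eps <- fit$sigma2                 # innovation variance

  sum_theta <- 1 + sum(theta)           # includes theta(0) = 1
  var_w <- sum_theta^2 * var_eps        # random-walk variance

  # s_coeff[i] = theta[i] + theta[i + 1] + ... + theta[q]
  s_coeff <- rev(cumsum(rev(theta)))
  s_var <- sum(s_coeff^2) * var_eps     # pricing error variance (lower bound)

  list(theta = theta, var_epsilon = var_eps, sum_theta = sum_theta,
       var_w = var_w, sd_w = sqrt(var_w),
       s_coeff = s_coeff, s_var = s_var, s_sd = sqrt(s_var))
}

# Usage on simulated Roll-type prices (made-up data, not from the book)
set.seed(1)
n <- 2000
m <- cumsum(rnorm(n))                          # efficient price: random walk
q_sim <- sample(c(-1, 1), n, replace = TRUE)   # trade direction indicator
p_sim <- m + 1 * q_sim                         # observed price = efficient price + half-spread * direction
res <- gr_univariate(p_sim, ma_order = 1)
res$theta
res$sd_w
```

The nested `do` loops in the SAS code compute reverse cumulative sums of the thetas, which is what `rev(cumsum(rev(theta)))` does in one step.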
