# Statistical Arbitrage – Trading a cointegrated pair

In my last post (http://gekkoquant.com/2012/12/17/statistical-arbitrage-testing-for-cointegration-augmented-dicky-fuller/) I demonstrated cointegration testing, a statistical test for identifying pairs whose spread is stationary and therefore, by definition, mean reverting.

In this post I intend to show how to trade a cointegrated pair, and will continue analysing Royal Dutch Shell A vs B shares (we know they’re cointegrated from my last post). Trading a cointegrated pair is straightforward: we know the mean and variance of the spread, and we know that those values are constant. The entry point for a stat arb is simply to look for a large deviation away from the mean.

A basic strategy is:

• If spread(t) >= Mean Spread + 2*Standard Deviation then go Short
• If spread(t) <= Mean Spread – 2*Standard Deviation then go Long

There are many variations of this strategy.

Moving average / moving standard deviation (this will be explored later):

• If spread(t) >= nDay Moving Average + 2*nDay Rolling Standard Deviation then go Short
• If spread(t) <= nDay Moving Average – 2*nDay Rolling Standard Deviation then go Long

Wait for mean reversion:

• The advantage is that we only trade once we see the mean reversion, whereas the other models are hoping for mean reversion after a large deviation from the mean (is the spread blowing up?)

All the above strategies look to exit their position when the spread has reverted to the mean. Personally I wouldn’t trade any of the above as they don’t specify an exit strategy for adverse trades, i.e. if there is a 6 standard deviation move in the spread, is this an amazing trading opportunity or, more likely, did the spread just blow up?
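The moving average variant and the exit-at-the-mean rule can be sketched roughly as follows. This is a minimal illustration on a simulated mean-reverting spread, not the backtest from this post: the AR(1) series, the helper name `entryExitPositions`, and the values chosen for `lookback` and `nStd` are all assumptions made for the example.

```r
library(zoo)

set.seed(42)
# Hypothetical spread: an AR(1) series standing in for a real pair's spread
spread <- as.numeric(arima.sim(model = list(ar = 0.9), n = 500))

lookback <- 30  # rolling window length in days (assumed value)
nStd     <- 2   # band width in rolling standard deviations

# Right-aligned windows only use data up to and including the current day
movingAvg <- rollapply(spread, lookback, mean, align = "right", fill = NA)
movingStd <- rollapply(spread, lookback, sd,   align = "right", fill = NA)
upper <- movingAvg + nStd * movingStd
lower <- movingAvg - nStd * movingStd

# Walk through time carrying the position: -1 short, 0 flat, +1 long
entryExitPositions <- function(spread, movingAvg, upper, lower) {
  pos <- numeric(length(spread))  # starts flat everywhere
  for (t in seq_along(spread)) {
    prev <- if (t > 1) pos[t - 1] else 0
    if (is.na(movingAvg[t]) || is.na(upper[t])) {
      pos[t] <- 0  # not enough history yet
    } else if (prev == 0) {
      if (spread[t] >= upper[t]) {
        pos[t] <- -1  # spread is rich: go short
      } else if (spread[t] <= lower[t]) {
        pos[t] <- 1   # spread is cheap: go long
      }
    } else if (prev == -1) {
      # Hold the short until the spread falls back to the moving average
      pos[t] <- if (spread[t] <= movingAvg[t]) 0 else -1
    } else {
      # Hold the long until the spread rises back to the moving average
      pos[t] <- if (spread[t] >= movingAvg[t]) 0 else 1
    }
  }
  pos
}

positions <- entryExitPositions(spread, movingAvg, upper, lower)
print(table(positions))
```

Note that this sketch has no stop for adverse moves either, so it shares the weakness described above. A position decided at the close of day t should also only earn the return of day t+1 when it is backtested.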

This post will look at the moving average and rolling standard deviation model for Royal Dutch Shell A vs B shares, using the hedge ratio found in the last post.

Annualized Sharpe Ratio (Rf=0%):

| Strategy | Annualized Sharpe Ratio |
|---|---|
| Shell A&B Stat Arb | 0.8224211 |
| Shell A | 0.166307 |

The stat arb has a superior Sharpe ratio to simply investing in Shell A. At first glance the Sharpe ratio of 0.8 looks disappointing; however, since the strategy spends most of its time out of the market, it will have a low annualized Sharpe ratio. To increase the Sharpe ratio one can look at trading at higher frequencies, or hold a portfolio of pairs so that more time is spent in the market.
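To make the time-in-market point concrete, here is a rough sketch with made-up returns. If a strategy is only invested a fraction p of the time, its mean daily return shrinks by about p while its standard deviation only shrinks by about sqrt(p), so the Sharpe ratio falls by roughly sqrt(p). The helper `annualizedSharpe` is a hypothetical name and uses simple sqrt(252) scaling; the figures above come from PerformanceAnalytics, which may annualize differently.

```r
# Annualized Sharpe ratio with Rf = 0, using simple (arithmetic) scaling
annualizedSharpe <- function(dailyRet, periodsPerYear = 252) {
  mean(dailyRet) / sd(dailyRet) * sqrt(periodsPerYear)
}

nDays <- 1008  # four years of trading days (made-up sample)

# Hypothetical always-invested strategy: alternating +0.3% / -0.1% days
fullRet <- rep(c(0.003, -0.001), length.out = nDays)

# Same strategy, but only in the market 2 days out of every 8 (25% of the time)
inMarket   <- rep(c(1, 1, 0, 0, 0, 0, 0, 0), length.out = nDays)
partialRet <- fullRet * inMarket

sharpeFull    <- annualizedSharpe(fullRet)
sharpePartial <- annualizedSharpe(partialRet)

# The ratio should land in the neighbourhood of sqrt(0.25) = 0.5
print(c(full = sharpeFull, partial = sharpePartial,
        ratio = sharpePartial / sharpeFull))
```

Running several uncorrelated pairs at once fills in the idle days, which is why a portfolio of pairs tends to lift the combined figure.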

Onto the code:

```
library("quantmod")
library("PerformanceAnalytics")

backtestStartDate = as.Date("2010-01-02") #Starting date for the backtest

symbolLst<-c("RDS-A","RDS-B")
title<-c("Royal Dutch Shell A vs B Shares")

### SECTION 1 - Download Data & Calculate Returns ###
#Download the data
symbolData <- new.env() #Make a new environment for quantmod to store data in
getSymbols(symbolLst, env = symbolData, src = "yahoo", from = backtestStartDate)

#We know this pair is cointegrated from the tutorial
#http://gekkoquant.com/2012/12/17/statistical-arbitrage-testing-for-cointegration-augmented-dicky-fuller/
#The tutorial found the hedge ratio to be 0.9653
stockPair <- list(
   a = coredata(Cl(symbolData[[symbolLst[1]]])) #Stock A
  ,b = coredata(Cl(symbolData[[symbolLst[2]]])) #Stock B
  ,hedgeRatio = 0.9653
  ,name=title)

simulateTrading <- function(stockPair){
  #Generate the spread
  spread <- stockPair$a - stockPair$hedgeRatio*stockPair$b

  #Strategy: if the spread moves more than nStd rolling standard deviations
  #away from its rolling 'lookback' day average, go long or short accordingly
  lookback <- 90 #look back 90 days
  nStd <- 1.5 #Number of standard deviations from the mean to trigger a trade

  #Moving average (note: rollmean defaults to align="center", unlike the right-aligned sd below)
  movingAvg = rollmean(spread,lookback, na.pad=TRUE)
  movingStd = rollapply(spread,lookback,sd,align="right", na.pad=TRUE) #Moving standard deviation / bollinger bands

  upperThreshold = movingAvg + nStd*movingStd
  lowerThreshold = movingAvg - nStd*movingStd

  aboveUpperBand <- spread>upperThreshold
  belowLowerBand <- spread<lowerThreshold

  aboveMAvg <- spread>movingAvg
  belowMAvg <- spread<movingAvg
  #[the listing is cut off at this point]
```

## 22 thoughts on “Statistical Arbitrage – Trading a cointegrated pair”

• Enry Cons said:

Hi Gekko,
does this also mean that, once the maximum divergence has been identified, I can take a position in derivatives like options?
-sell an ATM call option on the first stock
-buy a call option on the second one

or use a BackSpreadCall on the first and a BackSpreadPut on the second, so I can set the protections and roll them if they go out of control…
The short positions should be ATM or slightly OTM in my opinion.
Many thanks

enry

1. Ronaldo Zani said:

Hi,
Did you try using Johansen’s testing approach in order to perform a more rigorous test of cointegration? What do you think about combining Engle-Granger with Johansen?
Best,

2. Emmanuel Armah said:

The spread in the above does not oscillate around its mean. Ideally, a cointegrated pair should trade sideways, not in a trending manner as shown above. Your write-up on proper cointegration was perfect, but this spread is not a perfect spread.

• GekkoQuant said:

I 100% agree with you.

However for practical purposes as long as the mean reversion happens faster than the mean changes then you’ll do well.

I guess that’s something I’ve missed, how to quantify the half life/reversion speed.

Please note that in the above demo the lookback period is 90 days. This is fairly short. Choosing 200 days will result in a mean that is less responsive and changes direction less often. It will most likely increase the size of the standard deviation bands and result in fewer trades per year. This usually results in a lower Sharpe ratio.

3. Vick said:

Very interesting post. Would love to see the implementation on a basket of pairs.

4. Sam said:

Hello Gekko,

I made some changes to your program to calculate the Bollinger bands, and I would like to know why you align the standard deviation to the right? (movingStd = rollapply(spread,lookback,sd,align="right", na.pad=TRUE))

• GekkoQuant said:

Quick example

    library("zoo") #rollapply lives in the zoo package
    dat <- c(1,2,3,4,5,6,7,8,9,10)
    rollapply(dat,3,sum,align='right', na.pad=TRUE)
    # NA NA 6 9 12 15 18 21 24 27
    rollapply(dat,3,sum,align='left', na.pad=TRUE)
    # 6 9 12 15 18 21 24 27 NA NA
    rollapply(dat,3,sum,align='center', na.pad=TRUE)
    # NA 6 9 12 15 18 21 24 27 NA

In rollapply we set a "lookback" window size (in our case 3); the align attribute tells the function which side of the window sits on the current element in the vector.

Say the current element is 5:

right align sets the right of the window on 5

    1,2,[3,4,5],6,7,8,9

left align sets the left of the window on 5

    1,2,3,4,[5,6,7],8,9

center align sets the center of the window on 5

    1,2,3,[4,5,6],7,8,9

We set the align value to right to prevent look-forward in the code.

• Sam said:

Your blog gives me the chance to implement and build my stat arb strategy more quickly.

I am going to test different models for statistical arbitrage. I'll keep all the visitors in the loop!

Thanks again!

• Sam said:

Hello again,

Your program does not include the martingale effect. How can I add it?

I am running my own backtests with different programs (Excel, R and ProRealTime, a French platform), and in order to compare them I need to add the martingale effect.

• Visitor said:

Thanks for the clarification. By the same argument, rollmean has to have the same alignment: rollmean(spread,lookback, na.pad=TRUE, align='right')
With this modification the Sharpe ratio drops dramatically.

5. fader said:

Hi,

Great stuff!! I think there are two bugs in your code, though. The first one is in the calculation of the moving average. You forgot to set the align parameter to "right" (like you do for the standard deviation). The function uses the default "center", so your data (the spread and the moving average) are not aligned. You can see this from the plot as well: the moving average ends 45 days before the spread. The second bug is in the calculation of trading returns. I think you should take the return from the next day, as we enter the position at the closing price.

Regards

6. Visitor said:

Thanks for your elegant code. I noticed that your line of code:
shortPositions <- Reduce(shortPositionFunc,-1*aboveUpperBand+belowMAvg,accumulate=TRUE)
is meant to apply the function shortPositionFunc to (-1*aboveUpperBand+belowMAvg).
However, the function shortPositionFunc takes two arguments x and y.
Is there any typo in the code?

7. Sunny said:

Thanks Gekko for the backtesting code. It is very useful. A couple of comments below:
2) Since we enter trades at the end of the day, the return on the trade date shouldn’t count. We can simply shift every element in the “positions” vector down by using the “shift” function in the taRifx library.
Also, I don’t believe the daily return is (aRet - stockPair\$hedgeRatio*bRet). Imagine you had a large hedge ratio, i.e. if stock A is priced at \$100 and stock B is priced at \$10, then the hedgeRatio would be in the neighborhood of 10. Since aRet and bRet are in % terms, the formula won’t work. The daily return should be aRet - bRet * (hedge ratio divided by the dollar neutral ratio).
See amendments:

    library(taRifx)
    aRet <- Delt(t[,1],k=1,type="arithmetic")
    bRet <- Delt(t[,2],k=1,type="arithmetic")
    dollarNeutralRatio <- stockPair$a/stockPair$b
    hedgeRatioOVERdollarNeutralRatio <- stockPair$hedgeRatio/shift(dollarNeutralRatio,-1)
    dailyRet <- aRet - bRet*hedgeRatioOVERdollarNeutralRatio
    dailyRet[is.na(dailyRet)] <- 0
    }

8. Claudio said:

Hi Gekko,

I am looking for new strategies in equity pair trading that improve on the standard cointegration approach (for instance I started looking into pair trading with copulas, which still seems an “unstable” alternative to cointegration). Do you have any new papers to suggest? Thank you very much and congrats on the great blog.

Claudio

• GekkoQuant said:

Hi Claudio,

The second half of the book goes through lots of more advanced techniques for hedging a portfolio / finding stationary pairs.

9. Prabin said:

Hi,
I am a bit confused by this step.

When I plotted the longPositions and shortPositions along with the spread, bands and moving average lines, I found that there are consecutive long signals and short signals. According to my understanding:

longPostions <- if spread is below lowerband
longExit <- if spread is above movAvg while long

shortPostions <- if spread is above upperband
shortExit <- if spread is below movAvg while short

10. Carlos said:

Hi Gekko, I read the books by EP Chan that talk about this topic and I am a little bit confused about mean reversion. When two assets are cointegrated we are supposing that they will come back to their mean, but is that their moving average or their total mean over a fixed period? I’m getting better results using static parameters than using Bollinger bands. Here is an image showing my doubt: http://prntscr.com/51jofw Could you write another article on mean reversion? Thanks for everything.

11. Mike said:

Hi Gekko. Great code. Could you explain the idea behind the cappedCumSum function more closely? I do not understand why you specify two input variables when the Reduce() function is only given one parameter. Is it because of the 0?
Cheers,
Mike

12. Nikita said:

There is a mistake. Your algorithm looks into the future; the problem is in the rollmean function. The algorithm uses a moving average computed from future days to close the position.
