Investigation into the Power of Co-integration / mean reversion tests

The term statistical arbitrage (stat-arb) encompasses a wide variety of investment strategies that typically aim to exploit a statistical equilibrium relationship between two or more securities. The general principle is that any divergence from the equilibrium is a temporary effect and that bets should be placed on the process reverting to its equilibrium.

The major caveat of stat-arb / pairs-trading strategies is that as the divergence from equilibrium grows the trade becomes more desirable; however, at some point the divergence grows so large that one has to concede that the equilibrium relationship no longer exists and the model is broken. It is therefore desirable to estimate the power of the statistical tools used to detect these relationships, and to assess how long any observed equilibrium persists out of sample.

This post will investigate the power of the following statistical tests in relation to pairs trading: ADF, BVR, HURST, PP, PGFF, JO-T and JO-E.

The general principle is that two stocks X and Y form a stationary, and by definition mean-reverting, pair if the following equations hold:

Y_{t} = \hat{\alpha}+\hat{\beta}X_{t}+R_{t}
R_{t} = \hat{\rho}R_{t-1}+\epsilon_{t}
\epsilon_{t} \sim i.i.d.N(0,\sigma^{2})

If \hat{\rho} is between -1 and 1 then X and Y are co-integrated; \rho is the coefficient of mean reversion. A statistical test must be performed to check whether \rho=1; this is known as a unit root test. If the residual series contains a unit root it isn't suitable for pairs trading. There are multiple unit root tests, each running a different test on the residual process. One might be tempted to estimate the AR(1) residual model and check for \rho=1 using conventional linear regression and the standard t-ratio. However, Dickey and Fuller [1979] showed that under the null of a unit root the t-ratio does not follow the t-distribution, hence non-standard significance tests, known as unit root tests, are needed.
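
To make the two-step idea concrete, here is a minimal sketch (not part of the original study) using tseries::adf.test: regress Y on X, then run a unit root test on the residuals. Note that the critical values for a unit root test applied to estimated regression residuals are not the plain Dickey-Fuller ones, which is one reason to lean on a dedicated package such as egcm, as the code at the end of this post does.

library(tseries) #Assumed available; provides adf.test

set.seed(42)
n <- 250
x <- cumsum(rnorm(n))                          #Random-walk "price" for stock X
y <- 1.5 + 0.8*x + arima.sim(list(ar=0.9), n)  #Y = alpha + beta*X + AR(1) residual with rho = 0.9

fit <- lm(y ~ x)            #Step 1: estimate alpha-hat and beta-hat
adf.test(residuals(fit))    #Step 2: unit root test on the residuals; a small p-value suggests stationarity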

As with every model there are trade-offs when determining the training window size: too long a window and the model may contain irrelevant data and be slow to adjust to recent events; too short a window and the model responds only to recent events and quickly forgets past ones. This trade-off is problematic in co-integration testing. It was demonstrated in Clegg, M. (January 2014), "On the persistence of cointegration in pairs trading", that for a fixed window size the power of most unit root tests decreases as \rho tends to 1 from below; for 250 data points with \rho=0.96 the barrage of co-integration tests detects co-integration less than 25% of the time!

Intuitively this makes sense: the slower the process is to revert, the more data points are needed to see the reversion. It is somewhat undesirable that the power of the unit root tests varies with the properties of the underlying process. However, successful pairs trading does not require that all co-integrated pairs be identified as such, so the varying power of unit root tests is largely irrelevant.

What is more interesting is the false positive rate, i.e. pairs identified as mean reverting when they are not, and how persistent the results are.

Accuracy Test

Generate 1000 co-integrated time series with \alpha and \beta uniformly distributed in the interval [-10,10] and \rho drawn from the set [0.800, 0.825, 0.850, 0.875, 0.900, 0.925, 0.950, 0.975]; according to Clegg this is similar to the types of stock pairs encountered in reality. Repeat this for different lengths of time series and check how many series are correctly classified as co-integrated / mean reverting by the various tests at different p-values.
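
As a single-path preview of this procedure (the full Monte Carlo loop is in the code at the end of the post), one simulated pair can be generated and tested roughly as follows, using rcoint and egcm from the egcm package:

library(egcm)

simPair <- rcoint(250, rho=0.90)               #One simulated co-integrated pair, 250 observations
e <- egcm(simPair, p.value=0.01, urtest="pp")  #Fit the model and run the Phillips-Perron unit root test
is.cointegrated(e)                             #TRUE if the pair is classified as co-integrated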

[Figure: co-integration test accuracy by \rho, p-value and number of data points]

In the majority of tests PP and PGFF outperform the other methods. When the process was strongly reverting, with \rho less than 0.85, the tests PP, PGFF, JO-E and JO-T correctly identified the process as co-integrated / mean reverting more than 75% of the time at a p-value of 0.01. For some of the weaker reverting pairs, with \rho greater than 0.95, the performance of the statistical tests is woeful with just 250 data points.

It is worth bearing in mind that 250 data points is approximately the number of trading days in a year, which perhaps gives an indication of how much historical data is needed in a pairs trading strategy.

False Positive Tests

Follow the same procedure outlined for the accuracy test but choose \rho from the set [1.001, 1.026, 1.051, 1.076, 1.101, 1.126, 1.151, 1.176] to generate time series that are not co-integrated. See what percentage of the paths are falsely reported as co-integrated / mean reverting.

[Figure: co-integration test false positive rate by \rho, p-value and number of data points]

I’ve never seen this chart in a textbook and was surprised at the results: both HURST and BVR report more false positives as \rho increases! The more the process explodes, the more likely the test was to show a false positive!

Thankfully the other tests behave in a reasonable fashion with few false positives.

Onto the code:

library("egcm") #Has lots of co-integration tests
library("ggplot2") #Used for plotting data
 
 
set.seed(1) #Fix the random number generator seed so the simulation results are reproducible
 
assertTrueFunc <- function(testExpression, messageIfNotTrue){
  #Utility method to ensure parameters into other functions are correct
  if(!testExpression){
    cat(paste("Assertion Fail:",messageIfNotTrue,"\n"))
    stop()
  }
}
 
createTimeSeriesForRho <- function(rho,nDataPoint){
	#The library egcm contains a function rcoint that generates co-integrated data:
	# function (n, alpha = runif(1, -10, 10), beta = runif(1, -10, 10), rho = runif(1, 0, 1), sd_eps = 1, sd_delta = 1, X0 = 0, Y0 = 0) 
	return (rcoint(nDataPoint,rho=rho))	
}
 
createTimeSeriesForRhoAndTestForCointegration <- function(rhos, pValues, testMethods, nDataPoints,nMonteCarlo){
	combinations <- length(rhos)*length(pValues)*length(testMethods)*length(nDataPoints)*nMonteCarlo
	results <- data.frame(Rho=numeric(combinations),PValue=numeric(combinations),NDataPoint=numeric(combinations),Test=character(combinations),IsCointegrated=numeric(combinations),stringsAsFactors=F)
	counter <- 1
	for(nDataPoint in nDataPoints){
		for(rho in rhos){			
			print(paste("Progress:", round(100*counter/combinations,2),"%"))
			for(nMc in seq(1,nMonteCarlo)){
				simData <- createTimeSeriesForRho(rho,nDataPoint)			
				for(testMethod in testMethods){
					for(pValue in pValues){
 
 
						isCointegrated <- 0
						try({
							#egcm can fail on some simulated paths; treat a failure as "not co-integrated"
							e <- egcm(simData,p.value=pValue,urtest=testMethod)
							isCointegrated <- is.cointegrated(e)*1
						},silent=TRUE)
						#Use list() rather than c() so the numeric columns keep their types when the row is assigned
						results[counter,] <- list(rho,pValue,nDataPoint,testMethod,isCointegrated)
						counter <- counter + 1
					}
				}
			}
		}	
	}
	return (results)
}
 
testPowerForCointegratedSeries <- function(rhos, pValues, testMethods, nDataPoints,nMonteCarlo){
    assertTrueFunc(max(rhos)<1,"Rho must be less than 1")
	assertTrueFunc(min(rhos)>-1,"Rho must be greater than -1")
	assertTrueFunc(max(pValues)<1,"pValue must be less than 1")	
	assertTrueFunc(min(pValues)>0,"pValue must be greater than 0")
	assertTrueFunc(min(nDataPoints)>0,"nDataPoint must be greater than 0")
	assertTrueFunc(nMonteCarlo>0,"nMonteCarlo must be greater than 0")
 
	results <- createTimeSeriesForRhoAndTestForCointegration(rhos,pValues,testMethods,nDataPoints,nMonteCarlo)
	results <- aggregate(IsCointegrated~Rho+PValue+NDataPoint+Test,data=results,FUN=function(x){100*mean(as.numeric(x))})
	colnames(results) <- c("Rho","PValue","NDataPoint","Test","AccuracyPct")
	return (results)
}
 
testFalsePositiveRateForNoneCointegratedSeries <- function(rhos, pValues, testMethods, nDataPoints,nMonteCarlo){
    assertTrueFunc(min(rhos)>1,"Rho must be greater than 1")	
	assertTrueFunc(max(pValues)<1,"pValue must be less than 1")	
	assertTrueFunc(min(pValues)>0,"pValue must be greater than 0")
	assertTrueFunc(min(nDataPoints)>0,"nDataPoint must be greater than 0")
	assertTrueFunc(nMonteCarlo>0,"nMonteCarlo must be greater than 0")
 
	results <- createTimeSeriesForRhoAndTestForCointegration(rhos,pValues,testMethods,nDataPoints,nMonteCarlo)
	results <- aggregate(IsCointegrated~Rho+PValue+NDataPoint+Test,data=results,FUN=function(x){100*mean(as.numeric(x))})
	colnames(results) <- c("Rho","PValue","NDataPoint","Test","FalsePositiveRatePct")
	return (results)
}
 
nMC <- 1000
 
rhosCoint <- seq(0.8,0.999,0.025)
rhosFalsePositive <- seq(1.001,1.2,0.025)
nDataPoints <- c(250,500,750,1000)
pValues <- c(0.01,0.05, 0.1)
urTests <- c("adf","bvr","hurst","jo-e","jo-t","pp","pgff")
 
tt <- testPowerForCointegratedSeries(rhosCoint,pValues,urTests,nDataPoints,nMC)
dev.new()
tt$NDataPoints <- factor(tt$NDataPoint,levels=nDataPoints) #Creating a factor to get the chart order in same order as nDataPoints variable
ggplot(tt, aes(Rho,AccuracyPct,color=Test))+
		geom_line(aes(group=Test))+
		facet_grid(PValue ~ NDataPoints,labeller = label_both)+
		ggtitle(paste("Co-integration test accuracy for",nMC,"time series simulations"))+ 
		theme(plot.title = element_text(lineheight=.8, face="bold"))
 
qq <- testFalsePositiveRateForNoneCointegratedSeries(rhosFalsePositive,pValues,urTests,nDataPoints,nMC)
 
dev.new()
qq$NDataPoints <- factor(qq$NDataPoint,levels=nDataPoints) #Creating a factor to get the chart order in same order as nDataPoints variable
ggplot(qq, aes(Rho,FalsePositiveRatePct,color=Test))+
		geom_line(aes(group=Test))+
		facet_grid(PValue ~ NDataPoints,labeller = label_both)+
		ggtitle(paste("Co-integration test false positive rate for",nMC,"time series simulations"))+ 
		theme(plot.title = element_text(lineheight=.8, face="bold"))
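
For completeness, here is a hedged sketch (not from the original study) of pointing the same machinery at real data, assuming the quantmod package is available for downloading prices. The tickers EWA and EWC are purely illustrative, not a recommendation of that pair:

library(quantmod) #Assumed available; used only to download example price data
library(egcm)

getSymbols(c("EWA","EWC"), from="2015-01-01")             #Two illustrative ETFs
prices <- na.omit(coredata(merge(Ad(EWA), Ad(EWC))))      #Adjusted closes as a two-column matrix
e <- egcm(prices, urtest="pp", p.value=0.01)              #PP was among the better performing tests above
print(e)
is.cointegrated(e)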
