# Genetic Algorithm in R – Trend Following

This post explains what genetic algorithms are and presents R code for performing genetic optimisation.

A genetic algo consists of three things:

1. A gene
2. A fitness function
3. Methods to breed/mate genes

## The Gene

The gene is typically a binary number, where each bit (or group of bits) controls one part of your trading strategy. The gene below contains four sub-genes: a stock gene that selects which stock to trade, a strategy gene that selects which strategy to use, and paramA and paramB, each of which sets one parameter used by the strategy.

Gene = [StockGene,StrategyGene,ParamA,ParamB]

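Stock Gene

| Bits | Stock |
|------|-------|
| 01 | Facebook |
| 10 | IBM |

Strategy Gene

| Bit | Strategy |
|-----|----------|
| 0 | Simple Moving Average |
| 1 | Exponential Moving Average |

ParamA Gene – Moving Average 1 Lookback

| Bits | Lookback |
|------|----------|
| 00 | 10 |
| 01 | 20 |
| 10 | 30 |
| 11 | 40 |

ParamB Gene – Moving Average 2 Lookback

| Bits | Lookback |
|------|----------|
| 00 | 15 |
| 01 | 25 |
| 10 | 35 |
| 11 | 45 |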

So Gene = [01,1,00,11] would be stock=Facebook, strategy=Exponential Moving Average, paramA=10, paramB=45.
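To make the encoding concrete, here is a minimal sketch of decoding such a gene in base R. `decodeGene` and the lookup tables are hypothetical names introduced for illustration, and the 7-bit layout (2+1+2+2) follows the toy example above rather than the full code later in the post. Stock codes 00 and 11 are not given in the text, so they are left as `NA`.

```r
# Lookup tables taken from the example above; stock codes 00 and 11 are
# not listed in the post, so they are left as NA (unknown).
stockTable    <- c("00" = NA, "01" = "Facebook", "10" = "IBM", "11" = NA)
strategyTable <- c("0" = "Simple Moving Average", "1" = "Exponential Moving Average")
paramATable   <- c("00" = 10, "01" = 20, "10" = 30, "11" = 40)
paramBTable   <- c("00" = 15, "01" = 25, "10" = 35, "11" = 45)

# decodeGene is a hypothetical helper: it splits a 7-bit string into
# [StockGene(2), StrategyGene(1), ParamA(2), ParamB(2)] and looks each part up.
decodeGene <- function(bits) {
  stopifnot(nchar(bits) == 7)
  list(
    stock    = unname(stockTable[substr(bits, 1, 2)]),
    strategy = unname(strategyTable[substr(bits, 3, 3)]),
    paramA   = unname(paramATable[substr(bits, 4, 5)]),
    paramB   = unname(paramBTable[substr(bits, 6, 7)])
  )
}

decoded <- decodeGene("0110011")  # Gene = [01,1,00,11]
```

Keeping the lookups as named vectors means adding a stock or a lookback value only requires a new table entry (and a longer sub-gene once the table outgrows its bit width).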

The strategy rules are simple: if moving average(length=paramA) > moving average(length=paramB) then go long, and vice versa.
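The crossover rule can be sketched in a few lines of base R. This is only an illustration: `simpleMA` and `crossoverSignal` are hypothetical helpers, the price series is made up, only the simple moving average is shown, and returning 0 (no position) while there is not yet enough history is my own assumption.

```r
# simpleMA is a hypothetical helper: trailing simple moving average over n points.
simpleMA <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

# +1 (long) when MA(paramA) > MA(paramB), otherwise -1 (short).
# Bars without enough history for both averages get 0 (assumption: no position).
crossoverSignal <- function(prices, paramA, paramB) {
  maA <- simpleMA(prices, paramA)
  maB <- simpleMA(prices, paramB)
  signal <- ifelse(maA > maB, 1, -1)
  signal[is.na(signal)] <- 0
  signal
}

prices <- c(1:30, 29:1)   # made-up series: rises, then falls
signal <- crossoverSignal(prices, paramA = 3, paramB = 6)
```

On this series the signal is long while prices rise and flips to short some bars after the peak, once the shorter average drops below the longer one.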

## The fitness function

A gene is scored as good or bad using a fitness function. The success of a genetic trading strategy depends heavily upon your choice of fitness function and whether it makes sense with the strategies you intend to use. You will trade each of the strategies outlined by your active genes and then rank them by their fitness. A good starting point would be to use the Sharpe ratio as the fitness function.
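As a rough sketch of what a Sharpe-ratio fitness function might look like in base R: `sharpeFitness` is a hypothetical helper, the returns are made-up numbers, the risk-free rate is taken as 0%, and `periodsPerYear = 252` assumes daily returns.

```r
# sharpeFitness is a hypothetical helper: annualised Sharpe ratio with Rf = 0%.
# periodsPerYear = 252 assumes the inputs are daily returns.
sharpeFitness <- function(returns, periodsPerYear = 252) {
  if (length(returns) < 2 || sd(returns) == 0) return(NA_real_)
  mean(returns) / sd(returns) * sqrt(periodsPerYear)
}

# Made-up daily returns for three candidate genes
candidateReturns <- list(
  geneA = c(0.010, -0.005, 0.007, 0.002, -0.001),
  geneB = c(0.001, 0.001, -0.002, 0.000, 0.001),
  geneC = c(-0.010, 0.004, -0.006, 0.001, -0.003)
)

fitness <- sapply(candidateReturns, sharpeFitness)
ranking <- names(sort(fitness, decreasing = TRUE))  # best gene first
```

Any other score (percentage of profitable trades, negative max drawdown, and so on) could be swapped in, as long as a higher value means a better gene.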

You need to be careful to apply the fitness function to a statistically significant amount of data. For example, if you use a mean reverting strategy that might trade once a month (or whatever your retraining window is), then your fitness is determined by only 1 or 2 data points. This will result in poor genetic optimisation (in my code I've commented out a mean reversion strategy; test it for yourself). Typically what happens is that your Sharpe ratio from 2 data points is very high merely down to luck. You then mark this as a good gene and trade it the next month with terrible results.
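A small simulation illustrates the problem. This is a sketch under stated assumptions: the returns are pure Gaussian noise with zero mean (no edge at all), the 1% volatility and the "looks good" threshold of a per-period Sharpe above 1 are arbitrary, and `sharpeOfNoise` is a hypothetical helper.

```r
set.seed(42)  # arbitrary seed so the run is repeatable

# Per-period Sharpe ratio (mean / sd) of n simulated returns with no edge
sharpeOfNoise <- function(n) {
  r <- rnorm(n, mean = 0, sd = 0.01)
  mean(r) / sd(r)
}

twoPoints  <- replicate(2000, sharpeOfNoise(2))
manyPoints <- replicate(2000, sharpeOfNoise(250))

# Fraction of zero-edge strategies that look good (per-period Sharpe above 1)
lookGoodWith2   <- mean(twoPoints > 1)
lookGoodWith250 <- mean(manyPoints > 1)
```

With only two data points a substantial fraction of these worthless strategies clear the threshold, while with 250 points essentially none should. Those lucky genes are exactly the ones a naive ranking would select.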

## Breeding Genes

With a genetic algo you need to breed genes; for the rest of this post I'll assume you are breeding once a month. During breeding you take all of the genes in your gene pool and rank them according to the fitness function. You then select the top N genes and breed them (discard all the other genes, they're of no use).
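The ranking and selection step might look like the following sketch. The gene pool, its fitness values, and the helper `selectTopN` are all made up for illustration.

```r
# Hypothetical gene pool: one row per gene, with an already-computed fitness
genePool <- data.frame(
  gene    = c("0110011", "1000101", "0101110", "1011000", "0100001"),
  fitness = c(0.8, 1.4, -0.2, 1.1, 0.3),
  stringsAsFactors = FALSE
)

# selectTopN is a hypothetical helper: rank by fitness (best first), keep the top n
selectTopN <- function(pool, n) {
  ranked <- pool[order(pool$fitness, decreasing = TRUE), ]
  head(ranked, n)
}

survivors <- selectTopN(genePool, 3)  # everything else is discarded
```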

Breeding consists of two parts:

Hybridisation – Take a gene, cut a chunk out of it, and swap this chunk with the corresponding chunk from another gene. You can use whatever random number generator you want to determine the cut locations.

E.g.
Old genes: 00110010 and 11100110 (bits 4 to 6, counting from the left, are the randomly selected bits to cut)
New genes: 00100110 and 11110010

You do this for every possible pair of genes in your top N list.
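The hybridisation step can be sketched as below. `swapChunk` is a hypothetical helper operating on bit strings; the first call reproduces the example above, which corresponds to swapping bits 4 to 6 (counting from the left). The third gene in `topGenes` is made up, and drawing two distinct cut points with `sample` is just one possible way to pick the chunk.

```r
# swapChunk is a hypothetical helper: exchange bits from..to between
# two equal-length bit strings and return both children.
swapChunk <- function(geneA, geneB, from, to) {
  a <- strsplit(geneA, "")[[1]]
  b <- strsplit(geneB, "")[[1]]
  idx <- from:to
  tmp <- a[idx]
  a[idx] <- b[idx]
  b[idx] <- tmp
  c(paste(a, collapse = ""), paste(b, collapse = ""))
}

# Reproduces the worked example: swap bits 4 to 6
children <- swapChunk("00110010", "11100110", from = 4, to = 6)

# Every possible pair from a top-N list, with a random cut for each pair
topGenes <- c("00110010", "11100110", "10101010")
pairs <- combn(length(topGenes), 2)  # one column per pair
offspring <- unlist(lapply(seq_len(ncol(pairs)), function(k) {
  cut <- sort(sample(nchar(topGenes[1]), 2))  # random start and end of the chunk
  swapChunk(topGenes[pairs[1, k]], topGenes[pairs[2, k]], cut[1], cut[2])
}))
```

Three parents give three pairs and therefore six children. In general N parents give N(N-1) children, so the gene pool size is controlled by the choice of N.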

Mutation – After hybridisation, go through all your genes and randomly flip each bit with a fixed probability. The mutation prevents your strategy from getting locked into an ever-shrinking gene pool.
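Mutation is a one-liner in spirit. The sketch below uses a hypothetical helper `mutateGene` on bit strings; the 5% probability is only an example value.

```r
# mutateGene is a hypothetical helper: flip each bit independently
# with probability mutationProb.
mutateGene <- function(gene, mutationProb) {
  bits <- as.integer(strsplit(gene, "")[[1]])
  flip <- runif(length(bits)) < mutationProb  # TRUE where a bit mutates
  bits[flip] <- 1L - bits[flip]
  paste(bits, collapse = "")
}

set.seed(1)  # arbitrary seed so the run is repeatable
mutated <- mutateGene("00100110", mutationProb = 0.05)
```

A probability of 0 leaves the gene untouched and a probability of 1 inverts every bit; useful values sit close to 0, so that most offspring survive intact while a few explore new territory.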

For a more detailed explanation with diagrams please see http://blog.equametrics.com/ (scroll down to "Genetic Algorithms and its Application in Trading").

Annualized Sharpe Ratio (Rf=0%) 1.15

On to the code:

```
library("quantmod")
library("PerformanceAnalytics")
library("zoo")

#INPUTS
topNToSelect <- 5 #Top n genes are selected during the mating, these will be mated with each other
mutationProb <- 0.05 #A mutation can occur during the mating, this is the probability of a mutation for individual chromes
symbolLst <- c("^GDAXI","^FTSE","^GSPC","^NDX","AAPL","ARMH","JPM","GS")
#symbolLst <- c("ADN.L","ADM.L","AGK.L","AMEC.L","AAL.L","ANTO.L","ARM.L","ASHM.L","ABF.L","AZN.L","AV.L","BA.L","BARC.L","BG.L","BLT.L","BP.L","BATS.L","BLND.L","BSY.L","BNZL.L","BRBY.L","CSCG.L","CPI.L","CCL.L","CNA.L","CPG.L","CRH.L","CRDA.L","DGE.L","ENRC.L","EXPN.L","FRES.L","GFS.L","GKN.L","GSK.L","HMSO.L","HL.L","HSBA.L","IAP.L","IMI.L","IMT.L","IHG.L","IAG.L","IPR.L","ITRK.L","ITV.L","JMAT.L","KAZ.L","KGF.L","LAND.L","LGEN.L","LLOY.L","EMG.L","MKS.L","MGGT.L","MRW.L","NG.L","NXT.L","OML.L","PSON.L","PFC.L","PRU.L","RRS.L","RB.L","REL.L","RSL.L","REX.L","RIO.L","RR.L","RBS.L","RDSA.L","RSA.L","SAB.L","SGE.L","SBRY.L","SDR.L","SRP.L","SVT.L","SHP.L","SN.L","SMIN.L","SSE.L","STAN.L","SL.L","TATE.L","TSCO.L","TLW.L","ULVR.L","UU.L","VED.L","VOD.L","WEIR.L","WTB.L","WOS.L","WPP.L","XTA.L")
#END INPUTS

#Stock gene
stockGeneLength <- 3 #8stocks
#stockGeneLength<-6 #Allows 2^6 stocks (64)

#Strategy gene
strateyGeneLength<-2

#Paramter lookback gene
parameterLookbackGeneLength<-6

#Calculate the length of our chromozone, chromozone=[gene1,gene2,gene3...]
chromozoneLength <- stockGeneLength+strateyGeneLength+parameterLookbackGeneLength

#TradingStrategies
signalMACross <- function(mktdata, paramA, paramB, avgFunc=SMA){
  signal = avgFunc(mktdata,n=paramA)/avgFunc(mktdata,n=paramB)
  signal[is.na(signal)] <- 0
  signal <- (signal>1)*1 #converts bools into ints
  signal[signal==0] <- (-1)
  return (signal)
}

signalBollingerReversion <- function(mktdata, paramA, paramB){
  avg <- SMA(mktdata,paramB)
  std <- 1*rollapply(mktdata, paramB,sd,align="right")
  shortSignal <- (mktdata > avg+std)*-1
  longSignal <- (mktdata < avg-std)*1
  signal <- shortSignal+longSignal
  signal[is.na(signal)]<-0
  return (signal)
}

signalRSIOverBoughtOrSold <- function(mktdata, paramA, paramB){
  upperLim <- min(60*(1+paramB/100),90)
  lowerLim <- max(40*(1-paramB/100),10)
  rsisignal <- RSI(mktdata,paramB)
  signal <- ((rsisignal>upperLim)*-1)+((rsisignal0)*1)/length(tradingRet) #% of trades profitable
  #tradingFitness <- -1*maxDrawdown(tradingRet)
  return(tradingFitness)
}

#This function performs the mating between two chromozones
genetricMating <- function(chromozoneFitness,useTopNPerformers,mutationProb){
  selectTopNPerformers <- function(chromozoneFitness,useTopNPerformers){
    #Ranks the chromozones by their fitness and select the topNPerformers
    orderedChromozones <- order(chromozoneFitness[,"Fitness"],decreasing=TRUE)
    orderedChromozones <- chromozoneFitness[orderedChromozones,]

    ##Often there are lots of overlapping strategies with the same fitness
    ##We should filter by unique fitness to stop the overweighting of lucky high fitness
    orderedChromozones <- subset(orderedChromozones, !duplicated(Fitness))

    print(orderedChromozones)
    return(orderedChromozones[seq(1,min(nrow(orderedChromozones),useTopNPerformers)),])
  }

  hybridize <- function(topChromozones,mutationProb){
    crossoverFunc <- function(chromeA,chromeB){

      chromeA <- chromeA[,!colnames(chromeA) %in% c("Fitness")]
      chromeB <- chromeB[,!colnames(chromeB) %in% c("Fitness")]

      #Takes a number of chromes from B and swaps them in to A
      nCross <- runif(min=0,max=ncol(chromeA)-1,1) #the number of individual chromes to swap
      swapStartLocation = round(runif(min=1,max=ncol(chromeA),1))
      swapLocations <- seq(swapStartLocation,swapStartLocation+nCross) #Can run over the end of our vector, need to wrap around back to start
      swapLocations <- swapLocations %% ncol(chromeA)+1 #Performs the wrapping
      chromeA[1,swapLocations] <- chromeB[1,swapLocations] #Performs the swap
      return (chromeA)
    }

    mutateFunc <- function(chrome,mutationProb){
      return((round(runif(min=0,max=1,ncol(chrome))