Backtesting high-frequency trading strategies is always a subject of discussion. It differs from other types of backtesting because you are dealing with tick data (every individual price update rather than aggregated bars) and you need to account for problems such as network latency and the behavior of each venue (or ECN).
Backtesting is usually one of the most important steps in developing a trading system. It is achieved by re-creating trades using existing data, which provides statistics that help you make informed decisions. A backtest is the process of testing the trading rules (when to buy and when to exit). In other words, you run the strategy against historical market price data and find out, with some level of certainty, how it would have performed in the past.
The process can be broken down into two steps. The first step is to simulate the trading system and the second step is to compare the simulated trades with the trades that actually happened in reality.
Backtesting vs Walk-Forward
Backtesting your strategy can give you an idea of what happened in the past, but it’s more an understanding of ‘what could have been’. Walk-forward simulation goes a step further: the strategy parameters are optimized on one time period (in-sample), and the model then processes data from outside of that period (out-of-sample) to gain insight into how those optimal settings will perform on data they have not seen.
Why walk forward? To avoid curve-fitting. Overfitting is a frequent problem in trading. It appears when you create a trading system that adapts too closely to the noise in historical data, which makes it ineffective for predictions and decisions about future trades.
So, walk-forward optimization forces us to verify that we are fitting our strategy parameters to real signals in the past rather than to noise, by constantly testing our optimized parameters on out-of-sample data.
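As a minimal sketch of the rolling in-sample/out-of-sample idea, the generator below yields index windows that slide forward in time. The function name `walk_forward_windows` and its parameters are illustrative assumptions, not part of any specific backtesting library.

```python
from typing import Iterator, Tuple


def walk_forward_windows(
    n_samples: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (in_sample, out_of_sample) index ranges that roll forward in time."""
    start = 0
    while start + train_size + test_size <= n_samples:
        in_sample = range(start, start + train_size)
        out_of_sample = range(start + train_size, start + train_size + test_size)
        yield in_sample, out_of_sample
        start += test_size  # slide forward by one out-of-sample block


# Example: 10 observations, optimize on 4, validate on the next 2.
windows = list(walk_forward_windows(n_samples=10, train_size=4, test_size=2))
for in_sample, out_of_sample in windows:
    print(list(in_sample), "->", list(out_of_sample))
```

In each window you would optimize parameters on the in-sample indices only, then record performance on the out-of-sample indices. The out-of-sample results, stitched together, are what you evaluate.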
Challenges of backtesting in the FX market
As most of us know, the FX market is very fragmented, with prices that can vary from venue to venue for the same pair. That is a consequence of it being an OTC (over-the-counter) market with no central exchange. It means that the historical data could (and will) vary depending on the venue we are working on, leading to different results from one venue to another. Results that change with the data source are hard to compare, which is not good when trying to decide between one test and another. And finally, when backtesting with quality data, you will likely be using data from several sources (ECNs), and that increases not only the storage required but also the processing power needed for all that data. All these are well-known challenges when dealing with forex, but the biggest challenge comes when backtesting with HFT market data, and we will discuss it next.
Challenges of backtesting with high-frequency-trading systems
First, let us define what kind of strategies are executed with high-frequency-trading systems.
Every HFT strategy relies heavily on market microstructure, that is, how the limit order book reacts to incoming orders and trades. This is also called LOB dynamics. Most of the time, modeling LOB dynamics is a key part of any HFT strategy, and that means that we need to look at how liquidity takers and the overall market send orders for immediate execution (market orders and matched limit orders). And that is the most complex part of an HFT strategy backtest. You have to simulate how the market changes the dynamics of the LOB, and how “your” order interacts with the full depth of the limit order book.
To be more specific, let’s say we are trying to backtest a market-making strategy. The main challenge with market-making strategies is how to go about testing them. You need to answer the tough question of what happens once you submit your orders to this replayed environment: how will the backtesting system determine whether these orders should be filled or not (and when they should be filled)? As we already know, for resting orders we need to look at the position of our order in the queue of the limit order book (LOB).
And of course, we need to know how and when takers are taking liquidity (so that our order can get executed). Market makers must wait for the market to provide them with opportunities before their orders can be filled. Market takers submit their trades if they find the market maker’s offer attractive enough, and that “taker” order is the one that could interact with our order. This problem is absent at higher timeframes, where backtests rely on OHLC datasets, making the process much easier. But with high-frequency trading strategies, all these are big challenges. Many market-making professionals say that it is impossible to simulate a real scenario and see how the strategy performs. And they are right, to some degree.
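One common simplification for the queue-position problem is sketched below: track the volume resting ahead of our order at the same price, let taker trades consume that volume first, and only then fill us. The class `RestingOrder` and its fields are hypothetical, and the model is deliberately conservative: it assumes cancellations by others never improve our position.

```python
from dataclasses import dataclass


@dataclass
class RestingOrder:
    size: float          # our remaining quantity
    queue_ahead: float   # volume resting ahead of us at the same price
    filled: float = 0.0

    def on_trade(self, traded_qty: float) -> float:
        """Apply a taker trade at our price level; return the quantity we were filled."""
        consumed_ahead = min(self.queue_ahead, traded_qty)
        self.queue_ahead -= consumed_ahead
        remaining = traded_qty - consumed_ahead  # what is left after the queue ahead
        fill = min(self.size, remaining)
        self.size -= fill
        self.filled += fill
        return fill


# We join the queue with 5.0 ahead of us, then three taker trades hit our level.
order = RestingOrder(size=2.0, queue_ahead=5.0)
fills = [order.on_trade(q) for q in (3.0, 3.0, 4.0)]
print(fills)
```

The first trade only eats into the queue ahead, so the order receives nothing. It starts filling once the volume in front of it is exhausted.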
Risk Metrics when backtesting HFT strategies
As strategies in a High Frequency Trading (HFT) firm’s portfolio hold positions for short periods of time, traditional Value at Risk modeling may not fully capture the risks that traders face when the strategy is deployed. Some of the most important risks include: Model Risk, Liquidity Risk, Strategy Risk, Inventory Risk, Operational Risk, Execution Risk and Information Risk. It is also worth highlighting that the risk metrics described here are essential not only for backtesting, but also for “real-time” monitoring. Risk in high-frequency trading operations must be carefully monitored, with systems able to react as quickly as possible.
Model risk is defined as the risk incurred due to incorrect specification of the model parameters, incorrect model application, implementation errors and programming errors. Data deficiencies can also introduce a significant degree of bias and reduce the accuracy of forecasting models.
Model risk shows up in many different applications, for example option pricing models or fixed income derivatives.
Market risk arises because market changes can negatively affect the value of the assets you hold. Tracking these changes in your portfolio is as important as examining and managing P&L. Hence, we measure market risk in terms of the profit and loss (P&L) of a strategy. The risk measures we use are Value-at-Risk (VaR) and expected shortfall (ES), which in some cases is also referred to as Conditional VaR (CVaR) or tail loss.
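As an illustration, VaR and ES can be estimated from a sample of P&L observations by historical simulation. This is only one of several estimation methods, and the function name `historical_var_es` and the sample numbers are assumptions made for this sketch.

```python
import math


def historical_var_es(pnl, confidence=0.95):
    """Return (VaR, ES) as positive loss numbers from a list of P&L observations."""
    losses = sorted((-x for x in pnl), reverse=True)  # largest loss first
    # Number of observations in the tail; rounding guards against float noise.
    n_tail = max(1, math.ceil(round(len(losses) * (1.0 - confidence), 9)))
    tail = losses[:n_tail]
    var = tail[-1]              # smallest loss that is still inside the tail
    es = sum(tail) / len(tail)  # average loss inside the tail
    return var, es


# 20 made-up P&L observations (for example, per-interval strategy P&L).
sample_pnl = [-10.0, -8.0, -3.0, -1.0] + [1.0] * 16
var_90, es_90 = historical_var_es(sample_pnl, confidence=0.90)
print(var_90, es_90)
```

ES is never smaller than VaR at the same confidence level, because it averages the losses at and beyond the VaR threshold.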
Volatility risk: the way we measure volatility is to find out how much the price moves in a given time. And as you know, we are interested in volatility over small periods of time (intraday). A common choice is the GARCH(1,1) model, which combines one ARCH term (the previous squared return) and one GARCH term (the previous variance estimate) to model time-varying variance.
Information risk: we normally hear about how “news moves markets,” since events can introduce unexpected changes in the fundamentals behind asset pricing models, causing a change in relative prices.
Liquidity risk: we need to take both endogenous and exogenous liquidity risk into account and then incorporate them into VaR calculations. The exogenous component is attributed to external, market-wide conditions that are present even during periods of calm, while the endogenous component is due to the behavior of the trader, such as position size. The endogenous component is sometimes described as hidden liquidity risk.
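The GARCH(1,1) variance recursion can be sketched as follows. The parameters `omega`, `alpha` and `beta` are assumed to be given (in practice they are fitted to data, which this sketch does not do), and the values used in the example are purely illustrative.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance path: s2[t] = omega + alpha * r[t-1]**2 + beta * s2[t-1]."""
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be < 1 for a finite long-run variance")
    s2 = [omega / (1.0 - alpha - beta)]  # start at the unconditional variance
    for r in returns[:-1]:
        # ARCH term: last squared return. GARCH term: last variance estimate.
        s2.append(omega + alpha * r * r + beta * s2[-1])
    return s2


# Illustrative intraday returns and hand-picked (not fitted) parameters.
variances = garch11_variance([0.01, -0.02, 0.0], omega=1e-5, alpha=0.1, beta=0.8)
print(variances)
```

A large return (here -0.02) raises the next variance estimate, which then decays back toward the long-run level at a speed set by `beta`.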
Execution risk matters for P&L calculation and risk management in any algorithmic trading, but it becomes even more important when dealing with HFT. You will be dealing with different venues, some of which may apply the controversial “last look”, so your orders could be rejected on a regular basis.
Many factors affect the execution of trades. Volatility is a big one: with changing prices, you need to reassess whether your trade is still worth it. Bid-ask spreads also play a role: if the ask price is substantially higher than the bid price, it may not be worth entering the trade. High-frequency trading is about speed, so it is especially exposed to these factors.
The most common form of execution risk is slippage. Slippage happens when the market price moves between sending an order and executing it. As a result, the order is executed at a price that was not desired. Slippage is influenced by two factors: the liquidity and the volatility of the market. Trade size is another factor to consider: large trades incur more market impact, widening the spread and moving the price, so the execution risk becomes higher.
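A simple way to quantify slippage per order is the signed difference between the price you expected and the price you got. The helper below is a hypothetical sketch with the sign chosen so that positive values are always bad for the trader.

```python
def slippage(side: str, expected_price: float, fill_price: float) -> float:
    """Signed slippage in price units; positive means the fill was worse than expected."""
    if side == "buy":
        return fill_price - expected_price   # paying more is worse
    if side == "sell":
        return expected_price - fill_price   # receiving less is worse
    raise ValueError("side must be 'buy' or 'sell'")


# A buy expected at 1.1000 that fills at 1.1002 slipped by 2 pips on a 4-decimal pair.
buy_slip = slippage("buy", 1.1000, 1.1002)
sell_slip = slippage("sell", 1.1000, 1.1002)  # negative: price improvement
print(round(buy_slip, 6), round(sell_slip, 6))
```

Aggregating this value per venue and per order size is one way to see how liquidity and trade size affect execution quality.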
Inventory risk plays an important role in market-making strategies in HFT. A market-making strategy is one in which the trader stands ready to buy and sell an asset at firm quotes.
The bid-ask spread is the compensation the market maker receives in return for providing liquidity to the market. Market risk occurs when the market maker’s quoted prices do not match the price of the market, and inventory risk arises when order flow is unbalanced (for example, more buying orders than selling ones), leaving the market maker with a net position.
Strategy risk is represented by the maximum drawdown. Drawdown is defined as a drop in the net asset value compared to its previous maximum. Traders can minimize the drawdown by following trading strategies that create risk-averse portfolios. A key method to consider is diversifying across different instruments so that profitability holds up through down spells in the market.
Operational risk can be controlled by putting in place safeguards that address the fragility of information system connectivity and IT infrastructure. A company has to measure operational risk and make adjustments in the following areas:
- Pre-trade and post-trade risk controls
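Following the definition above, maximum drawdown can be computed in one pass over the equity curve. This sketch assumes strictly positive equity values and expresses the drawdown as a fraction of the running peak. The function name is illustrative.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # highest equity seen so far
        worst = max(worst, (peak - value) / peak)  # current drop from that peak
    return worst


curve = [100.0, 120.0, 90.0, 110.0, 80.0, 130.0]
mdd = max_drawdown(curve)
print(f"{mdd:.2%}")
```

Here the worst drop is from the peak of 120 to the trough of 80, one third of the peak value.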
- Volatility monitors
- Maximum Order Size
- Price Limits
- Cancel on Disconnect
- Drop Copies
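Two of the controls in this list, maximum order size and price limits, are simple enough to sketch as a pre-trade check. The `RiskLimits` fields and their values are hypothetical. Real limits depend on the firm, the venue and the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskLimits:
    max_order_size: float       # hypothetical limit, in base-currency units
    max_price_deviation: float  # max distance from the reference price, as a fraction


def pre_trade_check(size, price, reference_price, limits):
    """Return the list of violated limits; an empty list means the order may be sent."""
    violations = []
    if size > limits.max_order_size:
        violations.append("max_order_size")
    if abs(price - reference_price) / reference_price > limits.max_price_deviation:
        violations.append("price_limit")
    return violations


limits = RiskLimits(max_order_size=1_000_000, max_price_deviation=0.001)
ok = pre_trade_check(500_000, 1.1001, 1.1000, limits)
blocked = pre_trade_check(2_000_000, 1.2000, 1.1000, limits)
print(ok, blocked)
```

The same check can run unchanged inside the backtester and in production, which keeps simulated and live behavior aligned.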
Backtesting statistical metrics
The result of any backtesting system is a set of statistics describing the behavior of the portfolio being evaluated. These metrics are used to gauge the effectiveness of the strategy. Most of them are well-known, and we will list them here, but we will make a special mention of those metrics specific to high-frequency trading in the forex market.
Equity Curve: an equity curve is a visual or graphical representation of how your trading account has grown, and the change in the value over a time period. From this representation, you can easily see its performance.
Compound Annual Growth Rate (CAGR): the annualized rate of return of your strategy over the test period, assuming profits are reinvested.
Sharpe Ratio: a measure of how much return the strategy makes per unit of risk taken (the volatility of its returns).
Sortino Ratio: a modified Sharpe ratio that penalizes only downside volatility, that is, losses.
Maximum Drawdown: (we explained this above)
Value-at-Risk (VaR): (we explained this above)
Profit Factor: the gross profit of your winning trades divided by the gross loss of your losing trades (including commissions) for the entire trading period.
Net Profit: you calculate this by subtracting the gross loss of your trades (including commissions) from the gross profit of all winning trades.
Average PNL Trades: the average profit or loss per trade, across both losing and winning trades.
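Several of the metrics listed above can be sketched in a few lines. The function names are illustrative, per-trade P&L is assumed to be net of commissions, the risk-free rate is assumed to be zero, and 252 periods per year is an assumption that suits daily returns only.

```python
import math
import statistics


def trade_stats(trade_pnls):
    """Profit factor, net profit and average P&L from per-trade P&L (net of commissions)."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    return {
        "profit_factor": gross_profit / gross_loss if gross_loss else math.inf,
        "net_profit": gross_profit - gross_loss,
        "avg_pnl": sum(trade_pnls) / len(trade_pnls),
    }


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    return statistics.mean(returns) / statistics.stdev(returns) * math.sqrt(periods_per_year)


def sortino_ratio(returns, periods_per_year=252):
    """Like Sharpe, but only negative returns contribute to the risk term."""
    downside = math.sqrt(sum(min(r, 0.0) ** 2 for r in returns) / len(returns))
    return statistics.mean(returns) / downside * math.sqrt(periods_per_year)


def cagr(start_equity, end_equity, years):
    """Compound annual growth rate between two equity values."""
    return (end_equity / start_equity) ** (1.0 / years) - 1.0


stats = trade_stats([100.0, -50.0, 200.0, -50.0])
daily_returns = [0.01, -0.005, 0.02, -0.01]
print(stats, sharpe_ratio(daily_returns), sortino_ratio(daily_returns), cagr(100.0, 121.0, 2))
```

`sortino_ratio` raises a division error if there are no negative returns. A production version would need to decide how to report that case.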
Here is an example of backtest results for an HFT strategy in the FX market (the values are not real and are used only as an example):
Next, let’s explain the metrics that are specific to high-frequency trading and the forex market, and that will help you understand how the strategies will perform.
Order-to-trade ratio (OTR)
The order-to-trade ratio is the total number of order messages sent to a given venue divided by the number of trades/executions at that venue. It shows how effective the strategy is when sending those orders. The metric can be refined by examining the maximum OTR in 5-minute buckets (or any other interval) or by monitoring for bursts in OTR throughout the trading day.
The cancellation rate tells you what share of the orders your strategy sent have been cancelled. Cancelling or amending an order is expensive: you lose your queue position in the order book, and with it the chance of that order being executed. Even though this could be the desired behavior for your strategy, the metric can give you valuable insight into how the strategy is behaving.
The fill ratio is calculated by dividing executed orders by total orders sent. This metric is especially important in the forex market when sending aggressive orders, because it tells you how many of those orders are being rejected by the venues. It is one of the key metrics to truly validate the execution quality of each venue.
Of course, there could be many more metrics, and the most important thing to keep in mind is that all these metrics “must” also be used in real-time once the strategy is deployed for real trading.
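The bucketed OTR described above can be sketched as follows. The event format, a `(timestamp_seconds, kind)` tuple where `kind` is `"order"` for any order message and `"trade"` for an execution, is an assumption made for this example.

```python
from collections import defaultdict


def otr_by_bucket(events, bucket_seconds=300):
    """Return {bucket_start: orders / trades}; buckets with no trades map to infinity."""
    orders = defaultdict(int)
    trades = defaultdict(int)
    for ts, kind in events:
        bucket = int(ts // bucket_seconds) * bucket_seconds
        if kind == "order":
            orders[bucket] += 1
        elif kind == "trade":
            trades[bucket] += 1
    buckets = sorted(set(orders) | set(trades))
    return {b: (orders[b] / trades[b] if trades[b] else float("inf")) for b in buckets}


events = [
    (0, "order"), (10, "order"), (20, "order"), (30, "order"),
    (40, "trade"), (50, "trade"),
    (310, "order"), (320, "order"), (330, "order"),  # a burst with no executions
]
otr = otr_by_bucket(events)
print(otr)
```

The second bucket has orders but no trades, so its ratio is reported as infinity. A monitoring tool would typically alert on that kind of burst.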
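To compare venues, the cancellation rate and fill ratio can be computed side by side from order outcomes. This sketch assumes each order ends in exactly one of three hypothetical outcome labels and defines every rate relative to total orders sent, which is a common convention. The venue names are made up.

```python
from collections import Counter, defaultdict


def venue_execution_quality(order_outcomes):
    """order_outcomes: iterable of (venue, outcome) with outcome in
    {'filled', 'rejected', 'cancelled'}. Returns per-venue rates over orders sent."""
    counts = defaultdict(Counter)
    for venue, outcome in order_outcomes:
        counts[venue][outcome] += 1
    report = {}
    for venue, c in counts.items():
        sent = sum(c.values())
        report[venue] = {
            "fill_ratio": c["filled"] / sent,
            "rejection_rate": c["rejected"] / sent,
            "cancellation_rate": c["cancelled"] / sent,
        }
    return report


outcomes = (
    [("ECN_A", "filled")] * 6 + [("ECN_A", "rejected")] * 2 + [("ECN_A", "cancelled")] * 2
    + [("ECN_B", "filled")] * 1 + [("ECN_B", "rejected")] * 3
)
quality = venue_execution_quality(outcomes)
print(quality)
```

In this made-up sample, ECN_B rejects three out of four orders, the kind of pattern you might see on a venue with aggressive “last look”.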
Real time monitoring & Surveillance techniques
An increase in high-frequency related incidents has shaken investor confidence and raised global concerns about market stability and integrity. The ‘Flash Crash’ in 2010 showed that high-frequency strategies can have rogue tendencies under certain market conditions. As you may know, it is paramount to react fast when something is going wrong, otherwise it could lead you into a catastrophic scenario. So, real-time monitoring & surveillance is as important as being profitable.
I always suggest having humans in place for as long as the daily operation is running, with special tools and analytics to do this job, because they can have a better understanding of the current scenario, and can keep the “panic” switch at hand for the cases that require it.
These special tools and analytics need to monitor all the metrics we talked about here, in real-time, and be closely integrated with the Risk Management modules.
An exchange simulator is the best way to backtest HFT strategies in forex, and this also applies to any type of market where you use high-frequency data and market microstructure (resting orders, limit order book dynamics, etc). This is even more crucial if you are backtesting market-making strategies. We need to model a matching engine and the Limit Order Book.
Before I continue with this, I want to mention that most people tend to test this with “demo” accounts provided by their ECNs or data providers. I cannot emphasize enough what a rookie mistake this is. Executions on these “demo” accounts are random and usually far different from the real market. Moreover, remember that in FX, we may have liquidity providers that use “last look”. This will not happen in “demo” accounts.
So, in order to build your own exchange simulator, you must fully understand the dynamics of the limit order book, and how different orders affect it. If you send an order and there is no match, how will that order remain (resting) in the limit order book until it can be executed?
Your simulator must model the following:
- Market takers sending market orders (these must execute immediately and have the right market impact)
- Market takers sending limit orders (these may or may not execute immediately)
- Other liquidity providers sending resting orders (these rest in the LOB until a match exists), and how they cancel or replace them
- The ability to identify your own orders and know their position in the queue among all others
Usually we call all these “theoretical” participants Agents. Hence, your simulator must have as many “Agents” as possible to get closer to a real ECN.
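To make the requirements above concrete, here is a toy price-time priority book. It is a minimal sketch, not a production matching engine: the names `Order` and `ToyBook` are hypothetical, and cancels, replaces, latency and “last look” are not modeled.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    side: str    # "buy" or "sell"
    price: float
    qty: float


class ToyBook:
    """Price-time priority book: one FIFO queue per price level."""

    def __init__(self):
        self.levels = {"buy": {}, "sell": {}}  # side -> price -> deque of Orders

    def _best(self, side):
        prices = self.levels[side]
        if not prices:
            return None
        return max(prices) if side == "buy" else min(prices)

    def submit_limit(self, order):
        """Match while prices cross, then rest any remainder at the back of the queue."""
        fills = self._match(order, limit=order.price)
        if order.qty > 0:
            self.levels[order.side].setdefault(order.price, deque()).append(order)
        return fills

    def submit_market(self, order_id, side, qty):
        """Match immediately against resting liquidity; any unfilled rest is dropped."""
        return self._match(Order(order_id, side, 0.0, qty), limit=None)

    def _match(self, taker, limit):
        opposite = "sell" if taker.side == "buy" else "buy"
        fills = []
        while taker.qty > 0:
            best = self._best(opposite)
            if best is None:
                break
            if limit is not None:
                crosses = best <= limit if taker.side == "buy" else best >= limit
                if not crosses:
                    break
            queue = self.levels[opposite][best]
            maker = queue[0]  # oldest order at the best price trades first
            traded = min(taker.qty, maker.qty)
            maker.qty -= traded
            taker.qty -= traded
            fills.append((maker.order_id, best, traded))
            if maker.qty == 0:
                queue.popleft()
                if not queue:
                    del self.levels[opposite][best]
        return fills

    def queue_position(self, order_id):
        """Quantity resting ahead of order_id at its price level, or None if not resting."""
        for side in self.levels.values():
            for queue in side.values():
                ahead = 0.0
                for resting in queue:
                    if resting.order_id == order_id:
                        return ahead
                    ahead += resting.qty
        return None


book = ToyBook()
book.submit_limit(Order("lp1", "sell", 1.1002, 3.0))   # another liquidity provider
book.submit_limit(Order("mine", "sell", 1.1002, 1.0))  # our order joins behind it
ahead_before = book.queue_position("mine")
fills = book.submit_market("taker1", "buy", 3.5)       # a taker lifts the offer
ahead_after = book.queue_position("mine")
print(ahead_before, fills, ahead_after)
```

Each “Agent” would drive this book by calling `submit_limit` and `submit_market` according to its own behavior model. The market order first exhausts the older resting order and only then partially fills ours, which is the queue-priority effect a backtest has to capture.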
As you can see, this can get very complicated. Tons of research papers have been published about limit order book dynamics and its participants, and you will find many different approaches to implementing them. Moreover, even if you can build such a simulator, you will never be able to match the real behaviors exactly.
In practical terms, building it could take so much time and so many specialized resources that firms tend to skip it. And that is the main reason so many firms choose to test their HFT forex strategies on a live account, using real money. Of course, they will use smaller sizes until they get comfortable.
As shown, backtesting HFT strategies in forex is extremely difficult but necessary. There is no workaround to this. The main takeaway is that you need to understand very well what your strategy is doing and how it will interact with each ECN’s limit order book. Understand LOB dynamics and market impact “very well”. And make sure to test your strategies either live or using a simulator. Lastly, consider all the described metrics and risks, making sure they can be used when monitoring in real-time.