Nicholas Hastings

Sharpening the tools of the trade: New developments with FX Strategy back testing platforms

Back testing foreign exchange trading strategies has always been more difficult than back testing equities, futures and options, not only because appropriate historical market data is harder to obtain from the FX market but because of the sheer scale of the data that needs to be processed. Nicholas Hastings investigates.

First Published: e-Forex Magazine 72 / Trading Operations / July 2016

The difficulty of achieving price transparency, given how fragmented FX liquidity is across numerous global sources, and the high cost of the infrastructure needed to optimize trading strategies continue to produce new challenges for trading firms seeking to use back testing as a way of checking whether strategies that produced profits in the past will do so again in the future.
The arrival of new technology, the creation of new platforms that can conduct back testing on a large scale, the use of the cloud as a means of storing data and the launch of new hosted service providers for multiple customers are all helping the industry to advance, ensuring that back testing, as a means of controlling risk, is becoming more successful at producing profits than it was in the past.

Challenges

Perhaps the largest, but by no means the only, challenge for foreign exchange back testing is quality control – the ability to generate clean, reliable, accurate and appropriate data. “Without this, back testing cannot provide the improvement in trade decisions and reduce the fears of an algo ‘going rogue’,” warns Louis Lovas, Director of Solutions at OneMarketData LLC. The root of the problem is fragmented liquidity, with foreign exchange traded in numerous centres across the globe and across time zones. Not only that, but there are multiple sources of liquidity – hundreds of banks, numerous electronic communication networks working across geographic locations, and dealer platforms – all streaming out often client-specific bid and offer spreads. The difficulty for trading firms is then to record the specific market data they receive from their production trading environment.

“Firms need to record the specific market data they receive from their production trading environment and not every firm yet has the ability to do this, so crucial market data is lost,” says Ilya Gorelik, Founder and CEO of Deltix, a leading provider of software and services for quantitative research, algorithmic and automated systematic trading. 
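What that recording involves can be sketched in a few lines of Python. Everything here – the class, the field names, the commented-out `subscribe` call – is a hypothetical stand-in for a firm's actual feed handler and tick store, not any vendor's API.

```python
import csv
import time

class QuoteRecorder:
    """Minimal sketch: persist every quote the production feed delivers,
    so the exact prices the strategy saw are available for back testing."""

    def __init__(self, path):
        self._file = open(path, "a", newline="")
        self._writer = csv.writer(self._file)

    def on_quote(self, venue, pair, bid, ask, bid_size, ask_size):
        # Store the receive timestamp too: replaying events in the right
        # order matters as much as the prices themselves.
        self._writer.writerow(
            [time.time_ns(), venue, pair, bid, ask, bid_size, ask_size]
        )

    def close(self):
        self._file.close()

recorder = QuoteRecorder("eurusd_quotes.csv")
# feed.subscribe("EUR/USD", recorder.on_quote)  # hypothetical feed API
```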

There is also the problem that, because of fragmented liquidity, the venue offering the best bid or offer is constantly changing. According to the experts, each venue carries different levels of bids and offers, so any one firm effectively works from an order book aggregated across all the venues it accesses.
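To make the aggregation concrete, here is a minimal Python sketch of consolidating a best bid and offer from per-venue top-of-book quotes. The venue names and prices are purely illustrative, not taken from any particular platform.

```python
# Each venue supplies its own top of book; the firm's view is the best
# bid and best offer across all of them.
quotes = {
    "ECN-A":  {"bid": 1.10012, "ask": 1.10021},
    "ECN-B":  {"bid": 1.10014, "ask": 1.10024},
    "Bank-C": {"bid": 1.10010, "ask": 1.10019},
}

best_bid_venue, best_bid = max(
    ((v, q["bid"]) for v, q in quotes.items()), key=lambda x: x[1]
)
best_ask_venue, best_ask = min(
    ((v, q["ask"]) for v, q in quotes.items()), key=lambda x: x[1]
)

# The inside market can come from different venues, and it changes quote
# by quote - which is why the aggregated book must be rebuilt continuously
# during both live trading and back testing.
print(f"best bid {best_bid} ({best_bid_venue}), "
      f"best ask {best_ask} ({best_ask_venue})")
```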

“Specific to the FX market is the difficulty in attaining price transparency due to fragmented liquidity and redundant quoting across banks and ECNs and the general lack of trade data,” OneMarketData’s Lovas points out.

Development
•  Programming skills
•  Market knowledge
•  Libraries
•  Versioning

Testing / Back Testing
•  Historical Data
•  Tick vs. Bar Data
•  Processing Time
•  Spreads
•  Executions
•  Commissions
•  Slippage
•  Reporting
•  Validation (walk-forward)
•  Paper Trading

Live Trading
•  Connectivity
•  Data Validation
•  Data Persistence
•  Execution System
•  Reconciliation
•  Fail-over / Redundancy
•  Backups
•  Notification Tools
•  Account Management
•  Reporting

For signal-generation algos this aggregation of multiple liquidity pools is an essential function, but having made a buy or sell decision the algo then needs smart order routing (SOR) to achieve best execution. That routing logic is itself another thing a firm seeking to back test must model. All in all, it adds up to a complex, multi-dimensional problem for traders and quants seeking the clean data and price transparency they need to attain their goals.
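A deliberately simplified sketch of the routing step might look like the following. Real SOR logic also weighs fees, latency, fill probability and information leakage; the venues and sizes below are invented for illustration.

```python
def route_buy(quantity, offers):
    """Split a buy order across venues, taking the cheapest displayed
    liquidity first. offers: list of (venue, ask_price, ask_size)."""
    fills = []
    remaining = quantity
    for venue, price, size in sorted(offers, key=lambda o: o[1]):
        if remaining <= 0:
            break
        take = min(remaining, size)
        fills.append((venue, price, take))
        remaining -= take
    # remaining > 0 means displayed liquidity was insufficient
    return fills, remaining

offers = [("ECN-A", 1.10021, 2_000_000),
          ("Bank-C", 1.10019, 1_000_000),
          ("ECN-B", 1.10024, 5_000_000)]
fills, unfilled = route_buy(4_000_000, offers)
print(fills, unfilled)
```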

New technology makes things easier

New technology is helping to make it easier to build new strategy models in several areas, as well as providing access to large-scale back testing that might not have been available to some firms in the past. The aggregation of personalized market data has in itself brought greater accuracy. But for even more accurate model verification it is often necessary to use not only top-of-book bid/ask data but also the full depth of market data, according to Deltix’s Gorelik. This extraordinary amount of data often runs to multiple gigabytes per day.


“To digest this amount of data, the back testing tools should support cloud-based distributed computing,” he states.
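As a loose local illustration of that distribution idea, the sketch below shards a tick history by trading day and processes the shards in parallel, with Python's multiprocessing standing in for the cloud workers a hosted platform would provide. The file names and contents are invented.

```python
from multiprocessing import Pool

def process_day(path):
    # Stream line by line so a multi-gigabyte day never sits in memory.
    ticks = 0
    with open(path) as f:
        for _ in f:
            ticks += 1
    return path, ticks

if __name__ == "__main__":
    # Hypothetical per-day tick files, created here so the sketch runs.
    days = [f"eurusd_2016-04-{d:02d}.csv" for d in range(1, 6)]
    for path in days:
        with open(path, "w") as f:
            f.write("1.1001\n1.1002\n")
    with Pool() as pool:  # one worker per core; the cloud scales further
        for path, ticks in pool.imap_unordered(process_day, days):
            print(path, ticks)
```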

These affordable cloud-based solutions have already made a visible impact on model creation and back testing. Additional innovation has come in the form of new trading models and execution algorithms that add new data types, such as news, weather, social media, and new mathematical methods and techniques, such as machine learning and optimization.

For many firms, however, perhaps the greatest improvement in access to back testing has come from the creation of hosted infrastructures that can make use of public and/or private cloud to achieve the needed computational elasticity, scaling up from just a few cores to thousands.

“Cloud technologies and on-demand services for both compute power, market history and exchange simulation now allow quants to do the sorts of multi-year back testing needed to ensure confidence in model designs across boom and bust markets,” Lovas pointed out.

This means that firms that previously did not have access to adequate server hardware and deep market history can now use vendors that can amortize the cost among multiple customers. What was once impossible for many in terms of large-scale back testing is now achievable.

Range of functionality

Back testing is an experimental and interactive process with a lot of moving parts, all aimed at measuring robustness and profitability. These include application program interfaces – sets of routines, protocols and tools for building software applications – for varying and optimizing strategy parameters; ways of comparing strategy instances using pre-defined and customized statistical measures for profit and loss as well as risk; and exchange-simulator matching engines that take orders and return simulated trade executions. Customized dashboards, meanwhile, can be used to plot or chart profit and loss, risk and other results, and provide a means to adjust for and measure market impact.
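As a rough illustration of the parameter-variation part of that workflow, the sketch below sweeps a grid of hypothetical moving-average parameters and ranks the strategy instances by a simple Sharpe-style measure. `run_backtest` is a stand-in for whatever replay engine a real platform provides.

```python
import itertools
import random
import statistics

def run_backtest(fast, slow):
    # Placeholder: a real platform would replay market history through
    # the strategy and an exchange simulator, returning daily P&L.
    random.seed(fast * 100 + slow)
    return [random.gauss(0.01 * (slow - fast) / slow, 1.0)
            for _ in range(250)]

results = []
for fast, slow in itertools.product([5, 10, 20], [50, 100, 200]):
    pnl = run_backtest(fast, slow)
    sharpe = statistics.mean(pnl) / statistics.stdev(pnl)
    results.append(((fast, slow), sharpe))

# Rank the instances so the best parameter sets surface first.
for params, sharpe in sorted(results, key=lambda r: r[1], reverse=True):
    print(params, round(sharpe, 3))
```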

Cloud-based solutions have already made a visible impact on model creation and back testing.

“Quantifying and qualifying algo logic is to measure robustness and profitability. Back testing is the software tool to do that,” Lovas said.

According to some experts, the most important and often most overlooked component in functionality is the time-series data warehouse, which tends to be “hidden”. Gorelik explained why the rate of processing is so critical.

“More experimentation and testing is more likely to result in successful trading strategies. As such, the rate of processing is critical in enabling back testing to run sufficiently quickly not to discourage sufficient experimentation.”

As a result, the data-delivery mechanism is central, with delivery rates measured minimally in the several hundreds of thousands of “messages”, including trades, quotes or bars. The success of a platform is also enhanced if it is event-based; that is, strategies should be written at the level of real-world events, such as a new trade, a new best bid or a new level in an order book. This differentiates it from more traditional development environments, which rely on a regularized representation of these real-world events – typically, time-based “bars” of pre-defined duration.
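The difference is easiest to see in code. Below is a minimal, purely illustrative sketch of the event-based style, with the strategy reacting to individual quote and trade events rather than waking once per bar; the class and event names are assumptions, not any vendor's API.

```python
class EventDrivenStrategy:
    def __init__(self):
        self.best_bid = None

    def on_best_bid(self, price, size):
        # Fires only when the inside market actually changes...
        self.best_bid = price

    def on_trade(self, price, size):
        # ...whereas a bar-based framework would wake the strategy once
        # per pre-defined interval, whether or not anything happened.
        if self.best_bid is not None and price > self.best_bid:
            print(f"trade at {price} through the bid {self.best_bid}")

strategy = EventDrivenStrategy()
events = [("best_bid", 1.10012, 1_000_000), ("trade", 1.10014, 250_000)]
for kind, price, size in events:
    getattr(strategy, f"on_{kind}")(price, size)
```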

Another new functionality that boosts the success of a platform is the creation of a rich set of libraries with key mathematical and time-series functions.

An example: it is non-trivial to compute a moving least-squares regression on tick data. But with pre-built functions, a user can leverage such critical, pre-written routines and focus on the key aspects of strategy development.
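For a sense of what such a library primitive does, here is a minimal NumPy sketch of a rolling least-squares slope over tick prices – the sort of function a platform would ship pre-built and heavily optimized. The synthetic tick series is for illustration only.

```python
import numpy as np

def rolling_ls_slope(prices, window):
    """Least-squares slope of price vs. tick index over a sliding window."""
    x = np.arange(window)
    x_centered = x - x.mean()
    denom = (x_centered ** 2).sum()
    slopes = np.empty(len(prices) - window + 1)
    for i in range(len(slopes)):
        y = prices[i:i + window]
        # Classic OLS slope: sum((x - x̄)(y - ȳ)) / sum((x - x̄)²)
        slopes[i] = (x_centered * (y - y.mean())).sum() / denom
    return slopes

# Synthetic random-walk ticks around 1.10
ticks = np.cumsum(np.random.default_rng(0).normal(0, 1e-5, 1_000)) + 1.10
print(rolling_ls_slope(ticks, window=100)[:5])
```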

Finally, key new technology includes comprehensive reports showing performance characteristics such as the Sharpe, Information and Sortino ratios and drawdown measures, as well as trade-based performance measures.
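Two of those ratios, along with maximum drawdown, are straightforward to compute from a daily P&L series. The sketch below shows one common set of definitions (annualized over 252 trading days); conventions vary between platforms, and the sample returns are invented.

```python
import math

def sharpe(returns, rf=0.0):
    excess = [r - rf for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)
    return mean / math.sqrt(var) * math.sqrt(252)

def sortino(returns, rf=0.0):
    # Like Sharpe, but penalizes only downside volatility.
    excess = [r - rf for r in returns]
    mean = sum(excess) / len(excess)
    downside = [min(r, 0.0) ** 2 for r in excess]
    return mean / math.sqrt(sum(downside) / len(downside)) * math.sqrt(252)

def max_drawdown(returns):
    # Largest peak-to-trough fall of the compounded equity curve.
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst

daily = [0.001, -0.002, 0.003, 0.0005, -0.001, 0.002]
print(sharpe(daily), sortino(daily), max_drawdown(daily))
```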

Hadoop

Given the need to employ and process such large quantities of data, back testing specialists often turn to Hadoop. “Hadoop is sometimes used to implement back testing of very large (multiple terabyte) data sets,” Gorelik said.

“There are lots of moving parts in back testing technologies – including the development and design of tools to code trade analytics, track profit and loss, simulate order flow and dashboards to chart and easily interpret results,” says Lovas.

Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware, providing massive storage for any kind of data, enormous processing power and the ability to handle almost limitless concurrent tasks or jobs. It is created and maintained by a network of developers across the globe and is free to download, use and contribute to, even though more and more commercial versions are becoming available.

However, as Hadoop is based on the premise of multiple nodes providing computing and storage services, complications can arise: data distributed across the nodes has to be back tested independently of the data on other nodes. Specialists point out that, depending on the models being back tested, this may or may not be possible.
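A toy illustration of that constraint, with plain Python standing in for a real Hadoop job and invented tick data: per-partition measures parallelize cleanly, while cross-partition ones do not.

```python
# As if each symbol's ticks lived on a different node.
partitions = {
    "EURUSD": [1.1001, 1.1003, 1.1002],
    "USDJPY": [108.42, 108.45, 108.41],
}

def map_partition(symbol, ticks):
    # Runs independently on each node: fine for per-symbol statistics.
    return symbol, max(ticks) - min(ticks)

# Per-symbol tick range, computed with no cross-node traffic.
print(dict(map_partition(s, t) for s, t in partitions.items()))

# A cross-symbol signal - say, a correlation computed on the merged,
# time-ordered stream - would force data back onto one node: exactly the
# case where partition-independent processing may or may not be possible.
```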

In the meantime, OneMarketData, a leader in tick data management and analytics, has launched OneTick Strategy Backtesting, a platform for strategy development and large scale back testing. The platform, which was launched in April this year, utilizes global multi-asset class market history available through the OneTick Cloud service.

“OneTick offers a fully featured strategy development and large-scale back testing platform for the most discerning quant,” according to Lovas.

The platform aims to allow quants to evaluate alpha or execution strategy logic against controlled market replay; vary and optimize strategy parameters; compare strategy instances using pre-defined and customized statistical measures; use exchange-simulator matching engines; ensure the robustness of strategy logic; and customize dashboards to plot and chart results. (A recorded webinar broadcast is available at https://www.onetick.com/backtesting-webinar-request)

Lessons Learned

•   Test and production environments must be as similar as possible
•   Expect issues with brokers and data providers
•   Be prepared to handle all use-cases, especially the unlikely ones
•   Proper user training and user manual documentation
•   Treat the paper trading phase as real money trading
•   Have email alerts in place
•   Handle partial fills, limited liquidity, slippage, etc. appropriately (see the sketch after this list)
•   Expect misunderstandings between traders, quants and developers
•   Expect budget overflows
•   Expect bugs that only show up in live trading
•   Start simple and add complexity only if needed
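On the partial-fill point, the sketch below shows one minimal way a paper-trading fill model can respect displayed size and charge slippage. The numbers and the flat slippage assumption are illustrative only.

```python
def simulate_fill(order_qty, displayed_size, quote_price, slippage=0.00002):
    filled = min(order_qty, displayed_size)  # partial fill if book is thin
    fill_price = quote_price + slippage      # pay a penalty for crossing
    unfilled = order_qty - filled
    return filled, fill_price, unfilled

filled, price, left = simulate_fill(5_000_000, 2_000_000, 1.10021)
print(filled, price, left)  # 2,000,000 filled, 3,000,000 still working
```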

Final part of the puzzle

Choosing a suitable back testing platform provider is very important and the factors that might influence the choice are myriad. For a start, the on-demand service needs to keep the barrier to entry low, given the high costs of acquiring market history and IT resources such as staffing and hardware. It also needs to offer secure hosting of compute power that allows the means to leverage multiple cores – in the hundreds or even thousands – for performing large-scale back testing, so that a firm can easily spot the best-performing strategies.

Lovas explains it thus: A suitable back testing platform provider needs to provide the “means to easily ‘discover the diamond in a mountain of coal’ - sifting through back testing results to find the best performing (optimized) strategies and strategy parameters.”
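As a final illustration, that sifting might be as simple as filtering a large result set on risk constraints before ranking on a performance measure. The records and thresholds below are invented.

```python
results = [
    {"params": (5, 50),   "sharpe": 1.8, "max_dd": 0.06, "trades": 1200},
    {"params": (10, 100), "sharpe": 2.4, "max_dd": 0.19, "trades": 900},
    {"params": (20, 200), "sharpe": 1.2, "max_dd": 0.04, "trades": 40},
]

survivors = [
    r for r in results
    if r["max_dd"] < 0.10   # cap drawdown
    and r["trades"] >= 100  # too few trades is statistically fragile
]

# The "diamonds": surviving strategies, best risk-adjusted return first.
for r in sorted(survivors, key=lambda r: r["sharpe"], reverse=True):
    print(r["params"], r["sharpe"])
```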

Critically, a platform provider not only needs speed but the flexibility and expressive power to enable quants to address the most intricate details during their research and back testing. According to Gorelik, this will give them the ability to work with “multiple data sources, including news, cross-asset classes and custom events, and provide capabilities for visualization and reporting.”