David Murray

Digital intelligence for smarter FX - extracting, decoding and exploiting trading network data

David Murray explores the type of data that FX trading firms and venues can derive from their trading networks, how it can be analysed, deployed and utilised.

The FX landscape, like many other electronic trading environments, increasingly resembles an atom in an excited state. It is characterized by higher frequency of activity, accelerating speed of information flows, increasing data volumes and fragmentation across a wide range of venues and platforms. For the technologically proficient, these trends may offer great opportunity, provided they can quickly make sense of, and adapt to, what is happening as part of their trade execution.

The combined forces of growing competitiveness, proliferating information sources, and algorithmic complexity are making today's largely autonomous trading environments ever more opaque. In this setting, the ability to analyze and react to data is paramount. Technology performance and trading outcomes are intrinsically linked, and the costs of disruption or degradation are high. Those who can harness and effectively analyze data to optimize performance and execution will continue to see a meaningful advantage.

A recent JPMorgan survey on e-trading trends shows that institutional FX traders place Best Execution Requirements and Efficiency of Process behind Availability of Liquidity among their top daily issues. The survey also revealed that the top four “important criteria when selecting a liquidity source” are: Price consistency, Availability during volatile markets, Execution quality, and Response Times.

With macro issues such as global economic and geopolitical uncertainty also fueling concerns, there is a compelling need for faster, more effective data analysis, execution decisioning, and short time scale adaptivity.

Decisions and analysis in motion

Because of the intrinsic link between trading performance and technology efficiency and the need to make faster decisions in a fragmented landscape, correlating trade outcomes with technology performance at increasingly granular levels is critical.  

One of the few sources that provides timely and rich data, considers availability of systems and infrastructure, and allows for correlation of response/performance is the trading network. Traversing the “excited” FX ecosystem is a continuous river of data flowing between investors, banks, venues and other participants, and “there is gold in them there waters!”

Across the network flows every streaming price, every Request for Quote (RFQ) and quote, every interaction with every counterparty for every currency pair as well as the outcome of the interaction. Network data also includes, uniquely, the response times and technology performance across every connection and through every “hop” of the trading, hedging and pricing apparatus.

Incredibly voluminous and historically unwieldy, network data is rich, real-time and immutable, and it enables correlation of latency and payload (i.e., client and order content/context). With precision timestamping and clock synchronization, it also provides a high-fidelity, sequenced data source not only for performance and business context, but also for forensic purposes.

Traditionally, network packets have been used by IT engineers and operators to discover and troubleshoot the health and performance of the network.  However, advances in real-time processing, analytics, and machine learning have established the network as a valuable place to glean business performance insight.

The information inside the machines that now perform the vast majority of FX trading has its own subsets of descriptive, annotative and qualitative data that allow firms to assess its place inside the system. This is not log file data or software agent data (two other “machine data” sources traditionally thought of in an IT context); this is data that lets us know what information is traversing the network, how well it is getting to where it is supposed to be going, and to what the data itself pertains at a higher level.

Unravelling and understanding network fabric data necessitates that we first connect to the network itself to obtain a live, passive “parallel stream” of information as it is traversing the network.  This can be achieved via a variety of functions that are standard practice in today’s IT infrastructure (e.g., optical taps, network switch span ports, and/or lightweight virtual software agents).  A benefit of network data is that it can be harnessed, captured, and analyzed in real-time without impacting the performance of the live trading systems - a particular concern in all high performance electronic markets.

Once the data has been obtained, it has to be sorted. Among more than 100 FX venues and platforms, there are a variety of communications protocols in use, though many are derivatives of various versions of the Financial Information eXchange (FIX) protocol. Today’s technologies allow these different network “languages” to be identified and translated (or “decoded”), trading sessions to be discovered automatically, and the data to be captured, processed and analyzed in real time.
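As a rough illustration of what “decoding” can mean in practice, the following Python sketch splits a classic FIX tag=value message into its fields and labels it by message type. It is a minimal sketch, not a description of any particular product: the sample message is invented and simplified (it omits the body length and checksum fields a real message carries), the function names are hypothetical, and real venue dialects add custom tags and repeating groups that this approach would not handle.

```python
SOH = "\x01"  # standard FIX field delimiter

# A few standard FIX MsgType (tag 35) values relevant to FX workflows.
MSG_TYPES = {
    "R": "QuoteRequest",
    "S": "Quote",
    "D": "NewOrderSingle",
    "8": "ExecutionReport",
}


def decode_fix(raw: str) -> dict:
    """Split a tag=value FIX message into a {tag: value} dict.

    Simplification: a repeated tag (as in repeating groups) overwrites
    the earlier value, so this is only suitable for flat messages.
    """
    fields = {}
    for part in raw.strip(SOH).split(SOH):
        tag, _, value = part.partition("=")
        fields[int(tag)] = value
    return fields


def summarize(fields: dict) -> dict:
    """Pull out the business context carried by a decoded message."""
    return {
        "msg_type": MSG_TYPES.get(fields.get(35), "Other"),
        "sender": fields.get(49),   # SenderCompID
        "target": fields.get(56),   # TargetCompID
        "symbol": fields.get(55),   # Symbol
    }


# Invented sample: an RFQ for EUR/USD from a hypothetical client to a bank.
raw = SOH.join(
    ["8=FIX.4.4", "35=R", "49=CLIENT_A", "56=BANK_X", "131=RFQ-1", "55=EUR/USD"]
) + SOH

summary = summarize(decode_fix(raw))
print(summary)
```

In a live deployment the same step would run against messages reassembled from captured packets, with the capture timestamp carried alongside each decoded message.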

With execution quality a significant concern in faster markets, and with firms trying to balance availability, response time and price consistency, many traditional data sources are problematic.

Variety of information

Across the trade lifecycle, trading firms and venues today can derive a variety of information from their networks such as:

Price data: This covers both inbound and outbound prices. It includes each price received (or distributed), whether there are gaps in pricing streams, and, in many cases, the performance or timeliness of those streams, whether received from the source or distributed onward. If received prices are slow or incomplete, so will follow the pricing, hedging and quoting that is dependent on them.

Pricing (or hedging) performance: This may include the time or latency from receiving a price as a dealer to outputting a quote or offering a price. This may reveal applications or infrastructure that may be optimized by examining the delay in each component of the system. If this process takes too long, there is a high likelihood of offering a stale price or pursuing a suboptimal hedge.

RFQs: This includes all of the details of the RFQs, including, potentially, the client (possibly even trader ID), currency pairs, quote type, etc.

Quotes: This includes the details of the quote being offered (band, instrument, type, etc.) as well as the response time and number of quotes issued against an RFQ. This is important for understanding whether a counterparty is taking prices in an effort to arbitrage them against others.

Transaction(s): This would include the currency pair, price, client, band, venue, etc.  It may also include the quote to order latency to understand how counterparties respond to quotes.
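The response-time and quote-to-order measures described above can be sketched in a few lines of Python. This is an illustrative sketch only: the Event record, its field names and the sample timestamps are assumptions made for the example, and it presumes that capture timestamps are already clock-synchronized and that each quote and order can be tied back to an RFQ identifier.

```python
from dataclasses import dataclass


@dataclass
class Event:
    ts_ns: int    # capture timestamp in nanoseconds (assumed synchronized)
    kind: str     # "rfq", "quote", or "order"
    rfq_id: str   # identifier linking quotes and orders to their RFQ


def latencies_us(events):
    """Per RFQ: time to first quote, and time from first quote to first order.

    Values are in microseconds; None means the step was never observed.
    """
    first_seen = {}
    for e in sorted(events, key=lambda e: e.ts_ns):
        # Keep only the earliest timestamp for each (RFQ, event kind).
        first_seen.setdefault((e.rfq_id, e.kind), e.ts_ns)

    result = {}
    for (rfq_id, kind), rfq_ts in first_seen.items():
        if kind != "rfq":
            continue
        quote_ts = first_seen.get((rfq_id, "quote"))
        order_ts = first_seen.get((rfq_id, "order"))
        result[rfq_id] = {
            "rfq_to_quote_us": (
                (quote_ts - rfq_ts) / 1000 if quote_ts is not None else None
            ),
            "quote_to_order_us": (
                (order_ts - quote_ts) / 1000
                if quote_ts is not None and order_ts is not None
                else None
            ),
        }
    return result


# Invented sample: r1 is quoted and traded, r2 is never answered.
events = [
    Event(1_000_000, "rfq", "r1"),
    Event(1_250_000, "quote", "r1"),
    Event(3_250_000, "order", "r1"),
    Event(2_000_000, "rfq", "r2"),
]

report = latencies_us(events)
print(report)
```

An RFQ with no quote at all, such as r2 here, is itself a useful signal, since it may point to a pricing engine that was unavailable or too slow to respond.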

Obtaining this type of information in real-time, regardless of source, is a challenge for many organizations.  Once the data has been obtained and normalized, the next challenge is extracting actionable insight.  

This can be a big “big data” problem. Advances in big data technologies, streaming analytics, and machine learning are increasingly playing a role in trading decisions, execution analysis and counterparty analysis.

Incorporating this data into a real-time or near-real-time data cube enables advanced multidimensional analyses and trending, such as spreads by venue by order type or hit rates by client by instrument or orders by RFQ response times by client by instrument.  These types of analyses may reveal opportunities for price flexibility, where toxicity may lie, and where technology performance may be impacting execution.  Linking the data cube to full message raw data storage can also provide the capability to drill down into individual orders/transactions to diagnose the source of issues or respond to client or trader inquiries.
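To make the idea of slicing by dimension concrete, the sketch below computes hit rates (trades divided by quotes) over any chosen combination of dimensions. It is a toy stand-in for a data cube, using only the Python standard library; the records, client and venue names, and the hit_rates function are all invented for illustration, and a production system would more likely rely on a streaming or OLAP engine.

```python
from collections import defaultdict

# Invented sample records: (client, instrument, venue, traded?)
quotes = [
    ("ClientA", "EUR/USD", "VenueX", True),
    ("ClientA", "EUR/USD", "VenueX", False),
    ("ClientA", "USD/JPY", "VenueY", False),
    ("ClientB", "EUR/USD", "VenueX", True),
]

COLUMNS = {"client": 0, "instrument": 1, "venue": 2}


def hit_rates(rows, dims):
    """Aggregate hit rate (trades / quotes) over the chosen dimensions."""
    counts = defaultdict(lambda: [0, 0])  # key -> [quotes, trades]
    for row in rows:
        key = tuple(row[COLUMNS[d]] for d in dims)
        counts[key][0] += 1
        counts[key][1] += int(row[3])
    return {key: trades / n for key, (n, trades) in counts.items()}


by_client_instrument = hit_rates(quotes, ["client", "instrument"])
by_venue = hit_rates(quotes, ["venue"])

print(by_client_instrument)
print(by_venue)
```

The same aggregation can be re-cut along any other dimension captured from the network, such as order type or response-time bucket, without returning to the raw messages.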

Rather than teams each drawing conclusions from disparate data sources, network-derived data may provide a common source for use by multiple business stakeholders, such as technology/operations, trade support, ecommerce/execution teams, as well as compliance, improving business and operational alignment.  

This granular, normalized, precision-timestamped data from the network provides a strong foundational dataset for applying machine learning to identify anomalies in the performance of clients and venues/counterparties.  It also can be a path to the digital intelligence needed to improve performance, agility and trade outcomes for today’s FX business.  

Machine Learning

For optimal results with minimal noise, some firms leverage machine learning to correlate multiple vectors of anomalies. Examining anomalies in one dimension (e.g., client notional) may be helpful, but correlating concurrent anomalies in notional, latency, rejects, etc. may be more instructive, providing early warning of relevant changes in behaviors (or systems) and enabling adjustments in venue selection, pricing, or client relationship management.
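The principle of correlating concurrent anomalies can be illustrated with a deliberately simple statistical stand-in. The sketch below is not a machine learning model: it scores the latest observation in each dimension against its own history using a z-score, and raises an alert only when at least two dimensions are unusual at the same time. The metric names, sample values, threshold and function names are assumptions chosen for the example.

```python
from statistics import mean, stdev


def zscore(history, latest):
    """How many standard deviations `latest` sits from the historical mean."""
    spread = stdev(history)
    return 0.0 if spread == 0 else (latest - mean(history)) / spread


def concurrent_anomalies(history, latest, threshold=3.0, min_dims=2):
    """Return the anomalous metrics, but only if several fire together."""
    flagged = sorted(
        metric
        for metric in latest
        if abs(zscore(history[metric], latest[metric])) >= threshold
    )
    return flagged if len(flagged) >= min_dims else []


# Invented per-interval history for one client.
history = {
    "notional_musd": [10, 12, 11, 9, 10, 11],
    "latency_us": [250, 260, 240, 255, 245, 250],
    "reject_rate": [0.01, 0.02, 0.01, 0.02, 0.01, 0.02],
}

# Latency and rejects jump together: worth an early warning.
alert = concurrent_anomalies(
    history, {"notional_musd": 11, "latency_us": 400, "reject_rate": 0.10}
)

# Latency alone is unusual: suppressed as likely noise.
no_alert = concurrent_anomalies(
    history, {"notional_musd": 11, "latency_us": 400, "reject_rate": 0.015}
)

print(alert, no_alert)
```

Requiring agreement across dimensions is one way to trade sensitivity for fewer false alarms; a learned model would typically replace the fixed threshold and the independent per-metric scoring.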

In today’s competitive and volatile FX landscape, gaining insight into execution quality and efficiency, and acting on it rapidly, is key to a trading performance advantage. Harnessing the power of the atom more than 80 years ago opened the door to many new possibilities. Harnessing the power of AI applied to data flowing across the network may provide unique intelligence for navigating and competing in FX and other high performance electronic markets.