We are truly living in the data age. More than 90% of the world’s data was created in the last two years, and current production is somewhere near 2.5 quintillion bytes per day. It is also likely that 2017 will produce more data than the previous 5,000 years.
But too much of anything can be unhelpful, and the sheer volume and variety of data available to trading participants is dizzying. Back in the late 1990s there were serious concerns about data shortage and the capacity to hold data safely. Virtualisation and the cloud then essentially eradicated those concerns, making data an endless and limitless commodity, and this has brought its own challenges.
For example, how do you avoid wasteful and excessive data consumption? How do you ensure you are not bingeing on data? For some data management specialists, this is a genuine concern. Data is increasingly available, the market is increasingly competitive and data is seen as the key to competing, ergo the bingeing will continue.
Excessive consumption can take up both time and money if firms do not have a data management strategy that covers a proper data warehouse, a dedicated data manager and proper communication between the technical experts pulling data from the warehouse, the analysts examining that data for insights, and the portfolio managers who ultimately have to put it into a trading context.
Technology has developed to the point where artificial intelligence, machine learning, virtualisation and predictive analytics can help firms create a continuous loop of data – using pre-trade information to make post-trade decisions.
As ever, though, the potential problems lie with human behaviour: an all-you-can-eat approach to data capture and consumption will leave market participants swamped by their own appetite.
“There are many steps that sell sides and buy sides can take to use their data strategically,” says David Vincent, founder and chief executive of trading software provider smartTrade Technologies, “and guidance is very often necessary to help firms extract the most value from their data”.
“Regulations such as MiFID II were milestones that actually pushed financial institutions to assess their data and start thinking of ways to organize it to extract useful analytics from a compliance point of view. But with the right technology, firms can bring their analytics to the next level with things such as machine learning.”
“The first step is to centralize and normalize all the data. Our clients usually handle multiple sources of data: they receive flow from various liquidity providers and distribute data to all their end-clients. All this data is received and sent to various marketplaces using many different protocols. Therefore, data normalization and centralization is complex and requires permanent dedicated resources. We manage over 100 connections that we normalize onto a single API. And we address the decentralized nature of the marketplace through hosting solutions, colocating our clients in the main FX hubs in New York, London and Tokyo,” says Vincent.
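The normalization step Vincent describes can be pictured with a small sketch. The record schema and the two provider message formats below are invented for illustration – they are not smartTrade’s actual API – but they show the basic pattern of mapping heterogeneous feed messages onto one canonical quote record:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical normalized quote record; field names are illustrative.
@dataclass
class Quote:
    source: str
    pair: str       # e.g. "EUR/USD"
    bid: float
    ask: float
    ts: datetime    # always normalized to UTC

def from_provider_a(msg: dict) -> Quote:
    # Imagined provider A format: {"sym": "EURUSD", "b": ..., "a": ..., "epoch_ms": ...}
    return Quote(
        source="A",
        pair=msg["sym"][:3] + "/" + msg["sym"][3:],
        bid=float(msg["b"]),
        ask=float(msg["a"]),
        ts=datetime.fromtimestamp(msg["epoch_ms"] / 1000, tz=timezone.utc),
    )

def from_provider_b(msg: dict) -> Quote:
    # Imagined provider B format: {"ccy_pair": "EUR/USD", "bid": "...",
    # "offer": "...", "time": ISO-8601 string}
    return Quote(
        source="B",
        pair=msg["ccy_pair"],
        bid=float(msg["bid"]),
        ask=float(msg["offer"]),
        ts=datetime.fromisoformat(msg["time"]),
    )
```

Once every feed is mapped into the same record, downstream storage and analytics only ever deal with one shape of data, which is the point of a single API.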
“The next step is to establish an infrastructure scalable enough to handle this huge volume of data as well as the various levels of security needed to protect that data,” says Vincent. “Also it is not just a case of getting a data warehouse and leaving it to operate on its own. You need to tune the servers, enhance and maintain the infrastructure, store all the data in a continuous flow and to structure it to be able to query it efficiently.”
“One important challenge of data management is security,” says Vincent. “To guarantee our clients that safe, standardized procedures are used when handling their data, we have successfully undergone a SOC 2 Type 2 audit.”
“Technology has become increasingly critical to successful information extraction,” says Vincent. “Now, with data virtualization or Business Intelligence tools, you are able to produce reports based on real-time data. You are no longer restricted to pre-made reports. Data management is more user-friendly, traders themselves can set up their own graphs and change them on the fly.”
New technology designed for big data analytics as well as ‘fast’ data analytics for trading applications has been embraced by providers like smartTrade. “In September last year, we launched smartAnalytics, a cross asset big data analytics solution,” says Vincent. “With smartAnalytics, financial firms will achieve greater control and transparency by leveraging our cutting-edge solution to store, analyze and visualize all the data flowing through their trading infrastructure. Data is stored in a secure high performance, fully hosted and managed environment. smartAnalytics enables users to easily generate graphical reports and perform analysis on demand. The solution gives users a 360-degree view of their data and covers multiple reporting requirements including regulation and compliance such as MiFID II, risk, Transaction Cost Analysis (TCA), performance analysis and much more. For example, with smartAnalytics, buy-side clients can analyze their implicit costs such as liquidity fill ratios, slippage ratios, rejection rates, market impact, and latency, which will give them their real trading costs.”
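The implicit-cost metrics Vincent lists – fill ratios, slippage and rejection rates – are straightforward to compute once trade records are centralised. The record layout and pip convention below are assumptions for illustration, not smartAnalytics’ actual format:

```python
# Illustrative TCA calculations over a list of order records.
# Each record: status ('filled'/'rejected'), side ('buy'/'sell'),
# arrival_px (price when the order was sent), fill_px, qty, filled_qty.
def tca_summary(orders):
    sent = len(orders)
    filled = [o for o in orders if o["status"] == "filled"]
    rejected = sum(1 for o in orders if o["status"] == "rejected")

    # Fill ratio: quantity actually executed vs. quantity requested.
    fill_ratio = sum(o["filled_qty"] for o in filled) / sum(o["qty"] for o in orders)

    # Slippage vs. arrival price in pips (assuming a 4-decimal pair),
    # signed so that a positive number is a cost to the trader.
    def slip_pips(o):
        sign = 1 if o["side"] == "buy" else -1
        return sign * (o["fill_px"] - o["arrival_px"]) * 1e4

    avg_slippage_pips = sum(slip_pips(o) for o in filled) / len(filled)
    return {
        "fill_ratio": fill_ratio,
        "rejection_rate": rejected / sent,
        "avg_slippage_pips": avg_slippage_pips,
    }
```

Run over a day of centralised order records, numbers like these give the “real trading costs” the quote refers to, rather than the headline spread alone.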
“The prevalence of new, more efficient and specific service models such as FX analytics, has made it more likely that trading firms will seek third party help when it comes to finding solutions for trade execution analysis, post trade analysis, liquidity analysis and other tasks,” says Vincent.
“The entry level and maintenance costs for an effective FX analytics platform are high. This is why financial firms decide to outsource rather than develop their own solutions. We have invested heavily in our infrastructure to allow our clients to enter the big data, analytics, and machine learning space without them having to make that investment. And having the tools to analyze massive amounts of data helps our clients spot patterns and correlations in their trading, which can ultimately provide them with a competitive edge,” he says.
Quandl is a financial data platform based in Toronto that differentiates itself in two ways, according to chief executive and founder Tammer Kamel. “We focus on alternative data, the stuff that Wall Street has never seen, and we structure it so it can be consumed by financial analysts. And we employ all the contemporary tools that the next generation of data analysts use, such as Matlab, Python and so on.”
Kamel believes that we are in a new era of data-centricity when it comes to financial trading and recognition of this change is crucial to establishing an effective and enterprise-wide data strategy. “Data is becoming more essential to the buy-side’s success in the markets. There has never been more data and more technology to analyse it – from artificial intelligence to machine learning. At the highest level of management there has to be a recognition of this fact.”
That recognition means investing in the right technology but also the right people, says Kamel. “Data analysis is different today than it was before and you need to hire the staff that have the right skills.”
These skills are a combination of scientific discipline, statistical knowledge and computer programming ability – the type of skills that enable an analyst to take 500 gigabytes of data and turn it into useful information, and to recognise the various pitfalls of data analysis, such as spurious correlations.
Kamel says that the most progressive hedge funds and trading firms are looking primarily for these types of skills rather than experience of financial markets, based on the logic that it is easier to teach a fully skilled data analyst about the financial markets than it is to teach a seasoned trader or financial professional the discipline of data science.
And there is also a new breed of hybrid investors emerging, says Kamel – the so-called quantamental investors who are augmenting their traditional fundamentals-based strategies with quantitative techniques in an attempt to do better than what either approach can do on its own.
For FX the biggest data-related challenge is the opaqueness and general lack of visibility that comes with a decentralised and largely OTC market, says Kamel. Although this could be about to change, he says, courtesy of a new partnership struck between Quandl and CLS Bank, the utility that provides settlement for roughly half of the FX transactions in the market.
Through the arrangement, Quandl will be able to provide trading firms with a reliable source of near real-time transaction volume in the FX market, something that has been absent until now, says Kamel. “Trading volume is taken for granted in other asset classes and if you are trading without that information, you are at a disadvantage.”
Trading volume has a strong relationship with price: the idea is that heavy trading volume lends some stability to currency prices, whereas low volume is an indication of likely volatility. The large market-making banks that still provide much of the market’s liquidity, and are effectively systematic internalisers, have been able to rely on their own internal data as an indication of market volume, but that same data has never been available to the buy-side, says Kamel.
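The volume/volatility relationship described here is easy to test once volume data is available. The sketch below uses synthetic data deliberately constructed so that thin-volume periods carry larger price moves – it illustrates the computation only, not an actual market result:

```python
import random
import statistics

# Synthetic (volume, return) pairs: returns are made noisier when volume
# is thin, mimicking the stated relationship purely for illustration.
random.seed(7)
periods = []
for _ in range(500):
    volume = random.uniform(1, 10)           # arbitrary units
    ret = random.gauss(0, 0.01 / volume)     # bigger moves on thin volume
    periods.append((volume, ret))

# Split periods at the median volume and compare realised volatility.
median_vol = statistics.median(v for v, _ in periods)
thin = [r for v, r in periods if v < median_vol]
thick = [r for v, r in periods if v >= median_vol]
print("stdev of returns, low-volume periods :", statistics.stdev(thin))
print("stdev of returns, high-volume periods:", statistics.stdev(thick))
```

With real CLS-style volume data substituted for the synthetic series, the same bucketing would show whether thin sessions genuinely trade more violently.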
“This arrangement will be an equalising force and give smaller participants a chance to trade on a level playing field with the big banks and sell-side firms.”
Quandl is essentially a data marketplace, says Kamel. “We work with partners to take their data and sell it to someone else for a commission. The value-add is that we do all the tedious chores that need to be done - the capture, cleansing, formatting and documenting – so that it can be used by financial professionals. A growing number of firms have realised that their data is a commodity and if they can find some way to monetise their data, to sell that data to Wall Street, then it is essentially free money for them.”
However, with the data-centric nature of today’s market and a thirst for all things ‘data’, there is a danger of information overload – the idea that market participants are bingeing on data and creating an endless flow of unnecessary data that is greedily consumed.
This is where the discipline of data science will prove so important, says Kamel. “A lot of the data sets we offer are only as powerful as the capabilities of their consumers. Even with all the FX data that we can provide, you still have to devise your trading strategy based on your own risk thresholds.”
The scientific discipline will also help firms to ensure they do not abuse the use of data, says Kamel. “You can get data to tell you anything you want if you torture it enough, so those scientific principles are very important in ensuring you do not use the data to confirm your own bias or prejudice.”
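Kamel’s warning about torturing the data can be demonstrated directly: test enough candidate signals against a target series and some will look strongly correlated by pure chance. The sketch below uses nothing but random noise:

```python
import random

def corr(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

random.seed(42)
# A short "target" series, e.g. 30 periods of returns -- pure noise here.
target = [random.gauss(0, 1) for _ in range(30)]

# Screen 1,000 candidate "signals", all random and unrelated to the target,
# and keep the strongest correlation found.
best = max(
    abs(corr(target, [random.gauss(0, 1) for _ in range(30)]))
    for _ in range(1000)
)
print(f"best |correlation| found in pure noise: {best:.2f}")
```

The best match will typically look impressively strong despite there being no relationship at all – which is exactly why out-of-sample validation and corrections for multiple testing are part of the scientific discipline Kamel describes.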
Equally important is the fact that trading firms still have to draw the insight from the data. For example, data that shows the number of construction projects starting in London every day will not show what the effect will be on the FX market. “So while we and others like us may provide the raw material, there is still a lot of work for firms to do themselves.”
The first step that FX trading firms should take in establishing their data strategy is to identify their data and business objectives, says Shaheen Dil, managing director and global solution leader for Data Management and Advanced Analytics for consulting firm Protiviti.
“Think of the firm’s existing data management and analytics challenges and see how they are affecting its operations and profitability. Think about directions and ways to solve existing data problems, and include all parts of the firm and all stakeholders in the conversation, both business and technical. Then look for data technologies to support the strategies, exploit patterns and reuse common relationships and services, and identify a pipeline of projects that take you to the objectives identified in the first step,” says Dil.
Having a centralised data marketplace would benefit trading strategies by reducing time and cost, but in the absence of such a feature in the FX market, a good solution for FX trading firms would be to design a data collection system, or adopt databases or data warehouses, so that data capture becomes more efficient and effective, says Dil.
Just as data warehouses have become more sophisticated thanks to technology, so has data virtualization and business intelligence made the data extraction task easier to perform. “Massive amounts of data are hard to store, and extracting useful information from such data is even harder,” says Dil. But an effective data management system is critical to successful information extraction and can produce a number of operational benefits.
“It can reduce the risk of data errors and improve data quality, increase the speed and accuracy of access to data, make existing data easier to maintain and update, and show the relationships between multiple data sources more clearly so that information can be extracted more effectively,” says Dil.
The latest battleground in investment and portfolio management is data, says Dil. “The data can range from basic credit card transaction history to satellite data that tracks shipping routes. Portfolio managers and investment professionals can analyse the data to help them develop better trading strategies.
“For example, investment professionals can create data-driven investment models that can objectively evaluate public companies globally. These models can collect data ranging from publicly available data like FX market data to unconventional data like internet web traffic and satellite imagery. Investment professionals can gain an informational advantage and make more informed investment decisions.”
The ‘Big Data as a Service’ model is a tool that has developed to address the demands from portfolio managers and investment professionals for better management and analysis, says Dil. “This can range from the supply of data, to the supply of analytical tools with which to interrogate the data, to carrying out the actual analysis and providing solutions.” According to one estimate from Forbes, the global big data market will be worth $88 billion by 2021, of which the Data as a Service market will account for $30 billion.
The unique structure of the FX market does create a number of challenges in terms of data capture and analysis, says Dil. However, data is only going to become more critical in FX trading so these challenges will have to be overcome.
In addition to the most glaring challenge – the absence of a central store of quote, order and trade data – there is also a very high volume and velocity of quote data, even if it is difficult to ascertain just how much volume there is at any one time. There is also increased scrutiny from regulators, meaning that more data may be required to support trading decisions or to prove best execution.
For many years, the FX market has escaped the kind of regulatory reporting burden that has fallen on other asset classes. However, improvements in transaction cost analysis, along with the fallout from the FX pricing scandals perpetrated by some of the biggest global custodians, have put the FX market in the crosshairs of international regulators and are likely to increase the reporting requirements for firms.
Added to all of this is the growing competition in the FX market, making it harder for sell-side firms to retain their clients and for buy-side firms to maintain yields.
Fortunately, says Dil, the emergence of the FX Analytics as a Service (FXAaaS) model is able to offer some relief from these challenges. “It can help model users to identify those relationships that are providing real value to the business, and it can help liquidity providers to customise their FX services to meet the requirements of every client, to optimise the liquidity provider experience and ultimately to make the FX business more profitable.”
Just as low latency has become a sought-after quality in trade execution, so the need for speed has made its way into the world of big data analytics, says Dil. “Big data analytics is moving beyond the realm of intellectual curiosity and is beginning to tangibly affect business operations, offerings, and outlooks. Real-time analytics has become a requirement for optimising business decisions.”
Similarly, the development of cloud and mobile technologies is providing enterprises of all sizes with opportunities to use big data and analytics to make better, data-driven decisions, says Dil. “Without cloud, collecting data from all internal applications, social networks, devices, and data subscriptions could be expensive. A cloud environment gives a robust data foundation, fast time to value, improved collaboration, quicker adoption, scalability and elasticity and, ultimately, lower Total Cost of Ownership.”
Big data analytics are central to the future of FX execution, and have the potential not only to improve the way that firms trade, but also to go some way to meeting the increasing demands of regulators, says Dil. “Having the tools to analyze the massive amounts of data produced in the currency market can help firms to spot patterns and correlations in their trading style. By incorporating machine learning and algorithmic trading techniques into their FX execution strategy to crunch that data, portfolio managers and traders will be able to expand their execution and trading strategies and make more informed decisions.”
In today’s data-rich environments, too much information is likely to be as big a problem as a lack of information. For example, a number of sell-side firms are finding that too much information is making effective monitoring harder.
“We have metrics overload,” says Alex Ridgers, Director of Analytics and Trading at MahiFX. “So being able to observe which metrics are doing something significant, and which we should actually care about, is becoming the more important question that we need automated analytics to solve.”
For FX brokers, a lot of their analytics work is done by the minute, based on millisecond by millisecond information. “This is not only for the purpose of monitoring speed of infrastructure, but also asking ourselves if we could have reacted faster or better to the available market information,” says Ridgers. “So storing every single tick of every single pricing input is crucial for us or we fall behind. A lot of people talk about risk management as the most important tool a broker has - but the foundation of everything is pricing. The liquidity provider that wins a trade in aggregation is often the one last to react to a price jump - so the broker ends up pricing in line with the worst of the aggregation, not the best. To avoid this, the broker either needs the data themselves and the systems people to analyse it to remain smart, or they need someone like MahiFX to have the data, tools and analytics in place for them.”
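The aggregation dynamic Ridgers describes – the liquidity provider slowest to react is the one still quoting a stale, attractive price, and so wins the trade – is easy to see from tick timestamps. The LP names, timestamps and prices below are invented for illustration:

```python
# Toy illustration of measuring per-LP reaction lag after a price jump,
# using stored tick data. All values are made up.
jump_time_ms = 1_000  # the moment the "true" market price jumps

# Each LP's first quote update at or after the jump: (timestamp ms, new bid).
lp_updates = {
    "LP-A": (1_002, 1.1010),   # reacts in 2 ms
    "LP-B": (1_015, 1.1010),   # reacts in 15 ms
    "LP-C": (1_250, 1.1010),   # stale for a quarter of a second
}

# Reaction lag per LP; the slowest one is quoting the stale pre-jump
# price longest, and in an aggregator that stale quote looks "best".
lags = {lp: ts - jump_time_ms for lp, (ts, _) in lp_updates.items()}
slowest = max(lags, key=lags.get)
print(f"update lags (ms): {lags}")
print(f"slowest to react, likely hit on the stale price: {slowest}")
```

This is why Ridgers calls tick-level storage crucial: without every tick, a broker cannot even measure which of its inputs is dragging its aggregated price toward the worst of the pool.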
The challenge of data capture and analysis in such a data-intensive marketplace also means that the traditional in-house data management systems and stock database solutions like MySQL are no longer adequate, says Ridgers. “In terms of building something yourself - it will usually only fall to companies with relatively deep pockets. The free database solutions such as MySQL just cannot cope with the amount of data throughput required.”
But this does not necessarily mean that only the largest market players will be able to employ a successful data management strategy. There is still room for the smaller companies that can keep up with the speed of data and can correctly and efficiently identify what data is important to them.
“FX moves so fast that the right risk strategy for sell-side firms changes from week to week,” says Ridgers. “Data from two months ago is therefore of no use. This brings hope to the smaller firms on a budget - if you only need the insight of a month’s worth of data then your storage and processing requirements drop considerably. This is where I believe ‘quantamental’ thought processes have arisen from - bulletproof quantitative strategies are impossible to find, so a human brain is needed to gauge the significance of results and apply common sense, which brings us back full circle to requiring greater creativity to make use of less data. Back-testing a strategy over two years is something best left in the 1990s.”