Machine Learning in the Capital Markets 2020: ML Spreads During Pandemic
WatersTechnology looks at how 10 different firms are embedding machine learning algorithms into their platforms and tools.
Machine learning techniques of various kinds are beginning to permeate the capital markets. From trading and portfolio management platforms to reconciliations and surveillance systems, ML algorithms are finding a home within the major investment banks, asset managers, and vendors.
While there are a fair number of snake-oil salesmen when it comes to this form of AI and marketing, we look at some of the more interesting projects involving machine learning from the past year.
IBM & Refinitiv
The market effects of the Covid-19 outbreak have thrown off machine learning models that historically relied on correlations between different types of datasets, correlations that no longer make sense. “All of our tools are fundamentally correlational,” says David Cox, IBM director of the MIT–IBM Watson AI Lab. “[When] people talk about big data, they talk about recognizing patterns in the data; what they are finding is correlations: this variable is correlated with that variable. Now, there are a couple of problems with that.”
To address these kinds of correlational problems, Refinitiv Labs and MIT–IBM Watson AI Lab have been working together on causal inference, an emerging AI discipline.
First, some history. The concept of deep learning, which relies on complex neural networks, has been around for decades, but in 1982, it had its own renaissance. The problem it then ran into was that the hardware wasn’t there to commercialize this highly advanced subset of machine learning. It wasn’t until 2012 that neural networks started to evolve, as GPUs emerged and CPUs grew exponentially faster.
There was finally enough power to make these neural nets run efficiently, and that opened Pandora’s box. Data scientists began looking at other problems that were previously inaccessible around reinforcement learning and more advanced and complicated statistical methods. It was only a matter of time until capital markets firms started experimenting with deep learning.
Causal machine learning is the next link in a long evolutionary chain that started with statistics, which matured into data science, which then became machine learning, then deep learning, then reinforcement learning, and then statistical learning. It’s an advancement built upon the needs of the more modern methods in modeling technologies. At its core, you can crunch more numbers more efficiently.
“Rather than trying to predict something based on correlations, you are really trying to dig in and understand those cause-and-effect relationships both through sophisticated statistical tools, but also in many cases through interventions,” Cox says.
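The distinction Cox draws can be illustrated with a toy simulation. The sketch below is purely illustrative and is not the IBM–Refinitiv method: it assumes a hypothetical hidden driver `z` that moves both `x` and `y`, so the two correlate strongly in observational data even though `x` has no effect on `y`. When `x` is instead set by intervention, independently of `z`, the correlation largely disappears.

```python
import random
import statistics


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n, intervene, seed=0):
    """Hypothetical toy model: hidden z drives y; x only tracks z unless we intervene."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)                # hidden common cause
        if intervene:
            x = rng.gauss(0, 1)            # x is set independently of z
        else:
            x = z + rng.gauss(0, 0.3)      # x is observed and driven by z
        y = 2 * z + rng.gauss(0, 0.3)      # y depends on z only, never on x
        xs.append(x)
        ys.append(y)
    return pearson(xs, ys)


observed = simulate(5000, intervene=False)
intervened = simulate(5000, intervene=True)
print(f"observational correlation:  {observed:.2f}")
print(f"interventional correlation: {intervened:.2f}")
```

A model trained only on the observational data would treat `x` as a strong predictor of `y`; the interventional run shows that changing `x` does nothing to `y`, which is the kind of cause-and-effect question causal inference tries to answer.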
The IBM–Refinitiv project is still in the early stages, and involves testing out different theories. After that, Hanna Helin, head of emerging tech strategy and alliances at Refinitiv Labs, said the companies will “engage with customers” to incorporate what they’ve learned into clients’ internal analytics processes.
Some believe that causal machine learning will represent a major evolution in the field of AI.
Goldman Sachs
The Wall Street giant is in the process of upgrading its Marquee platform, which provides institutional investors with market views, hedging tools, and trade execution across multiple asset classes. Through this upgrade, Goldman Sachs is working to build out its cloud strategy, buying further into the open-source arena and adopting APIs to develop microservices for topics like ESG or the US presidential election. The end result would be the bank, through its Marquee offering, having developed managed services for ingesting data, sourcing data, or maintaining a security master, though it’s still early days.
This move toward cloud-based services will also allow the bank to better embrace machine learning and other analytics tools so that Marquee can be more proactive in delivering information to users.
“As we build out content, we want to consider how we can use machine learning to create a recommendation system for Marquee, similar to what Netflix or Amazon does,” said Anne Marie Darling, a partner at the firm who wears several different hats. “If you like these five things, you should also be thinking about this. And so we are moving to be proactive with clients about delivering and personalizing things that they value. I think that will be the next evolution for us around content as we head into 2021.”
BNP Paribas Asset Management
BNP Paribas Asset Management is in the process of rolling out an NLP-based model that finds sentiment indicators in news reports to forecast company returns.
For the project, the firm is relying on an external data provider with the necessary licenses to scrape reputable online news outlets. The vendor then structures the data, removing unnecessary words, and producing a numerical signal from the material information. BNP Paribas then uses the structured data to construct its models, selecting only the information that is necessary to build long-term investment signals. Within the model, the news is classified by different topics that cover various aspects of a company’s operations and activities.
Raul Leote de Carvalho, deputy head of BNP Paribas Asset Management’s Quantitative Research Group, said the current focus of the project, which BNP began working on in early 2018, is on finding sentiment signals for equities, but the firm plans to also look at corporate bonds in the future. BNP Paribas already uses various machine learning techniques for a variety of applications, including to build risk models.
Leote de Carvalho said machine learning has been particularly useful for conducting and improving upon principal component analysis (PCA), a technique that reduces the number of variables in a dataset to bring out more vivid patterns. This can be useful for constructing covariance matrices, which measure the relationships between assets in a portfolio.
To calculate the expected volatility of a portfolio, a risk manager must first determine expected volatility of each stock and then account for the diversification effect of buying many stocks, which requires a correlation matrix.
“If you put the volatility and correlation matrix together, you get the variance/covariance matrix, and that is a key question of risk modeling in equities: what is the best variance/covariance matrix you can come up with,” Leote de Carvalho said.
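In matrix terms, the combination Leote de Carvalho describes is usually written as Σ = D C D, where D is the diagonal matrix of stock volatilities and C is the correlation matrix, and portfolio variance is w' Σ w for a weight vector w. A minimal sketch follows; the three-stock volatilities, correlations, and weights are made-up values for illustration, not BNP Paribas data.

```python
import math


def covariance_from(vols, corr):
    """Combine volatilities and correlations: cov[i][j] = vol_i * corr_ij * vol_j."""
    n = len(vols)
    return [[vols[i] * corr[i][j] * vols[j] for j in range(n)] for i in range(n)]


def portfolio_volatility(weights, cov):
    """Portfolio volatility is the square root of w' * cov * w."""
    n = len(weights)
    variance = sum(
        weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    return math.sqrt(variance)


# Illustrative inputs (assumptions): annualized volatilities and pairwise correlations.
vols = [0.20, 0.30, 0.25]
corr = [
    [1.0, 0.5, 0.2],
    [0.5, 1.0, 0.4],
    [0.2, 0.4, 1.0],
]
weights = [1 / 3, 1 / 3, 1 / 3]

cov = covariance_from(vols, corr)
port_vol = portfolio_volatility(weights, cov)
weighted_avg_vol = sum(w * v for w, v in zip(weights, vols))

print(f"portfolio volatility:        {port_vol:.4f}")
print(f"weighted average volatility: {weighted_avg_vol:.4f}")
```

Because the assumed correlations are all below 1, the portfolio volatility comes out lower than the weighted average of the individual volatilities. That gap is the diversification effect the correlation matrix is there to capture.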
Then BNP separates the two parts. It uses econometric models for forecasting stock volatility, and for the correlation matrix it uses approaches such as PCA. Leote de Carvalho said that using PCA can reduce the matrix into “a much smaller matrix with just a few risk factors, and then it just tells you how each stock is exposed to the factors, and the number of factors—and how that reduction in dimensionality is done is unsupervised.”
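To make the dimensionality reduction concrete, the sketch below extracts the single largest principal component of a small correlation matrix using power iteration. This is a simplified stand-in chosen to keep the example dependency-free, not BNP Paribas' model; a production system would typically run a full eigen-decomposition and keep several factors. The four-stock correlation matrix is an assumed, illustrative input.

```python
def mat_vec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def top_component(matrix, iterations=500):
    """Power iteration: return (eigenvalue, unit eigenvector) of the dominant component."""
    n = len(matrix)
    vec = [1.0 / n ** 0.5] * n
    for _ in range(iterations):
        product = mat_vec(matrix, vec)
        norm = sum(x * x for x in product) ** 0.5
        vec = [x / norm for x in product]
    eigenvalue = sum(v * p for v, p in zip(vec, mat_vec(matrix, vec)))
    return eigenvalue, vec


# Illustrative correlation matrix for four stocks (assumed values).
corr = [
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.5, 0.4],
    [0.5, 0.5, 1.0, 0.3],
    [0.4, 0.4, 0.3, 1.0],
]

eigenvalue, loadings = top_component(corr)
explained = eigenvalue / len(corr)  # the trace of a correlation matrix equals its dimension

print("loadings on first factor:", [round(x, 3) for x in loadings])
print(f"share of variance explained: {explained:.1%}")
```

The loadings play the role of each stock's exposure to the factor. No labels are supplied at any point, which is the sense in which the reduction is unsupervised.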
Machine Learning Features
- Lighting Up the Black Box: A Must for Investors? More quant funds are using machine learning to help run some part of decision-making, but the workings can be opaque and the route to outcomes is often unknown.
- ‘Data Mining is Bullsh**t’ With the growth of alternative data in the capital markets, firms are struggling to find value, and are disillusioned by the loss of time, human capital, and money. Goldman Sachs’ Matthew Rothman believes this has created a situation where vendors and buy-side firms are promising vast riches, but much of that talk, he says, is BS. As you might expect, not everyone agrees.
- Model Misfires Raise Questions Over Training Data Quants wrestle with how far into the past their machine learning models should peer.
- Quant Funds Look to AI to Master Correlations Machine learning shows promise in grouping assets better and predicting regime shifts, say fund managers.
- Quants Use Nowcasting As Covid Crystal Ball Experts from UBS, Unigestion, MIT and QuantConnect discuss the need for nowcasting, and what the alt data boom has made possible in trying to navigate today’s crisis.
Broadridge
In the first quarter of 2021, Broadridge Financial Solutions will launch its AI-driven corporate bond trading platform, LTX. While the tech stack for LTX is live, the vendor is currently in the process of gathering liquidity on the platform and working out bugs.
The AI engine underpinning the platform was built using the open-source machine learning platform TensorFlow. It incorporates a convolutional neural network, a type of deep neural network used for image analysis.
“It is a neural network with multiple layers of neurons,” said Vijay Mayadas, head of capital markets at Broadridge. “One of the reasons we chose that model was because of the complexity of the corporate bond market. There is so much information in a set of trades that we felt a neural network was the best way to understand how to interpret all of that data in such a way that you can help a dealer come up with actionable insight.”
The AI will look at the dealers’ data and publicly available data to determine which customers to invite to a trade, selecting those most likely to execute a deal. It then puts the chosen customers into a protocol that drives a type of auction mechanism to generate the best pricing for the buy-side customer, and also helps the customers on the other side of the trade with price discovery.
Crédit Agricole
Last year, Crédit Agricole’s Corporate and Investment Bank (CIB) restructured its markets division into two units, one focusing on financing and funding, and another dedicated to hedging and investment solutions, including trading, sales, structuring, and research.
Within the second unit, it also created two specialist technology teams focused on capital markets data and operational transformation. Their role is to ensure the front office can rapidly develop technical solutions to the problems and opportunities it encounters when dealing with clients.
To start with, the team focused its attention on markets with rich datasets. Interest rate swaptions were the perfect testing ground. The bank built a dataset of RFQs and used a machine learning-based model—which relies on straightforward decision tree techniques—to divine patterns within it.
But in an esoteric market, a pure machine learning-based approach can quickly run into problems. An algorithm will struggle to make sense of data unless it understands the context of the requests and what motivated the client to trade: the shape of the yield curve at that point in time or its rolldown on a particular day; the coupon’s z-score (the number of standard deviations it sits from the mean); implied versus realized volatility levels; or a news announcement. This information was manually curated and fed to the model, meaning it operates in a ‘supervised learning’ environment.
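Of the context features listed, the z-score is the simplest to compute: it expresses how many standard deviations a value sits from the mean of its history. A minimal sketch, assuming a hypothetical list of recent coupon levels (the numbers are illustrative and do not come from Crédit Agricole):

```python
import statistics


def z_score(value, history):
    """Number of standard deviations `value` sits from the mean of `history`."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)  # population standard deviation
    if stdev == 0:
        raise ValueError("history has no variation; z-score is undefined")
    return (value - mean) / stdev


# Hypothetical recent coupon levels, in percent (illustrative values only).
coupon_history = [1.0, 1.5, 2.0, 2.5, 3.0]
todays_coupon = 3.0

score = z_score(todays_coupon, coupon_history)
print(f"z-score: {score:.3f}")
```

A value like this would be one column in the curated feature set fed to the model alongside curve shape, rolldown, and volatility measures.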
Though most clients request prices for receiver and payer swaptions simultaneously to avoid showing their hand, 80% of the time a human trader can guess which way they want to trade. But the machines have started to beat them, with an accuracy rate of 85% on average.
Those extra five percentage points have given the bank a crucial edge: the algo’s ability to spot patterns and learn the client’s typical behavior allows traders to pre-empt an inbound request, and be ready to show a highly competitive price at precisely the right time. Since it switched the tool on in April of last year, Crédit Agricole has become the largest swaptions counterparty to many fast-money clients, and the main provider of swaptions to one of Europe’s largest asset managers.
Arabesque Group
Through the asset manager’s Arabesque AI subsidiary, the firm is working to build out its artificial intelligence engine to predict stock price developments, using techniques such as decision trees, neural networks, and natural language processing.
Arabesque’s AI engine is built around five computational layers: 1. data input; 2. representing the pre-processed data into “features,” a term in machine learning that refers to alternative representations of the original data that are more suitable for consumption by machine-learning models; 3. the addition of various supervised machine-learning techniques to generate asset price forecasts from the data; 4. the combination of a battery of models; 5. the representation of the actual asset price forecast.
The company expects the system to go live in Q1 2021. “The proof of concept is there. There are certain components that we want to add to the proof of concept,” said Michael Neumann, head of AI quant investment strategies at Arabesque AI.
UBS
UBS Evidence Lab, which specializes in analyzing alternative datasets for financial services firms, is using machine learning to sift through shipping data.
Some shipping data is input manually by the ship’s captain or operators of the ship and is subject to human error. UBS Evidence Lab uses machine learning to distinguish odd or anomalous data points from human input errors.
A machine is ingesting data and then running it through a system, using various statistical techniques to identify odd behaviors in the data. The Evidence Lab has an anomaly detection layer that sits on top of the whole system and acts as a check for irregular patterns.
Jeremy Brunelli, global head of frameworks at UBS Evidence Lab, gave an example of a simple data chart that appears flat for seven consecutive days, but on the eighth day, the chart jumps and there is a spike in the data. The UBS system will flag that spike and notice that something is wrong, and the information will then be passed on to a data steward who can investigate the exception. The data steward will curate the data and label it “inaccurate,” making an adjustment to ensure the anomaly does not poison the analytics.
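Brunelli's flat-then-spike example can be reproduced with a very simple statistical check. The sketch below is not the UBS system; it assumes a hypothetical rule that compares each reading with the median of the previous seven days and flags large deviations for a human to review. The function name, thresholds, and data are all illustrative.

```python
import statistics


def flag_spikes(series, window=7, threshold=5.0, min_scale=1.0):
    """Return indices whose value deviates sharply from the trailing window.

    Deviation is measured against the trailing median and scaled by the
    median absolute deviation (MAD); `min_scale` keeps the scale from
    collapsing to zero when the window is perfectly flat.
    """
    flagged = []
    for i in range(window, len(series)):
        trailing = series[i - window:i]
        median = statistics.median(trailing)
        mad = statistics.median(abs(x - median) for x in trailing)
        scale = max(mad, min_scale)
        if abs(series[i] - median) / scale > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily readings: flat for seven days, then a jump on day eight.
readings = [100, 100, 100, 100, 100, 100, 100, 180, 101, 100]

exceptions = flag_spikes(readings)
print("flagged day indices:", exceptions)
```

Using the median rather than the mean means a single spike does not distort the baseline for the days that follow, so only the jump itself is queued as an exception.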
“That model can then be trained to see how that data steward reacts. The machine sees the behavior of that data steward every time that reaction happens. They start to look at each of those incidences and segment them into vectors,” Brunelli said.
The machines learn, and the number of exceptions in the data starts to diminish over time as the machine begins to make decisions about the anomalies ahead of the UBS data steward. Brunelli says that with this machine learning model, the data that has to be manually checked can go down from 7% to 2%, saving the team a significant amount of time.
IHS Markit
IHS Markit has released a new product, dubbed Risk Bureau, aimed at helping buy-side firms calculate and model their risk using alternative data, machine learning, and cloud computing.
By leveraging GPUs run on the AWS Cloud and incorporating machine learning, IHS Markit says it has sped up the calculation of valuation adjustments (XVAs) for complex and simple derivatives portfolios by 200% to 250%, compared to traditional Monte Carlo models. XVAs include credit valuation adjustments, funding valuation adjustments, collateral valuation adjustments, and capital valuation adjustments.
Where a Monte Carlo simulation uses a forward-looking stochastic process, Risk Bureau works backward with a regression technique. By pre-computing all the different simulated parts with machine learning, users who are plotting and moving data points around on a graph can cut the lag time associated with calculating paths of single lines down to milliseconds.
Machine learning will guide the further development of the tool, especially as it starts to include oil pricing and inflation. Currently, it shows a correlation between risk factors on-screen in the credit forecasting utility. Eventually, it will allow users to override that with what their firm decides the correlations are, and machine learning will provide the speed needed to accomplish that, so that the real-time component of the risk model stays intact.
Essentia Analytics
Essentia Analytics, which provides behavioral data analytics and consulting for investment firms, has enhanced its Insight Enterprise platform to allow internal peer portfolio managers to benchmark against one another.
The new tool, dubbed Multi-Portfolio Views, gives chief investment officers and heads of equities trading a clearer picture of their portfolio managers’ strengths and weaknesses, which could allow them to structure their teams more efficiently. It can help higher-ups determine which managers on their team are best at, for example, entry timing and exit timing, and which need improvement.
In 2021, Essentia plans to add external peer benchmarking that will show how a given manager’s skill compares with other external managers with similar strategies.
Synechron
Synechron, a New York-based consultancy and tech provider, is in the early stages of building differential privacy solutions in partnership with an unnamed start-up. The solutions will allow teams within a financial firm to analyze and query data from various parts of the business without breaching global data protection regimes such as the EU General Data Protection Regulation.
The tools will aim to automate tasks involved in data sharing where privacy is a concern. Anantha Sharma, senior architect for technology and innovation at Synechron, described one use case where an individual at a bank forwards data to another part of the business. In this instance, the privacy technology will function as an intermediary between the sender and the receiver by first combing through the documents or unstructured data. Then, using machine learning techniques, it will identify any PII and sensitive information and remove, hide, or obfuscate whatever the receiving end-user is not authorized to see.
Differential privacy solutions should function differently depending on the context and the type of data they are applied to, whether it is structured or unstructured. Take chat data, for example. ML privacy tools may find it more challenging to understand a conversation between, say, a portfolio manager and their client if a lot of the context of the conversation is inherently understood by both parties. Other issues arise if some of the conversations have taken place outside of typical work channels, leaving gaps in the chat data and making it difficult for the ML to interpret the context accurately.
“The biggest problem, in this case, would be to infer the context correctly, and there are a lot of language models and techniques that need to be applied to understand and realize these contexts so that the [privacy] system will do a better job [at interpreting it],” says Sharma.
Differential privacy solutions only apply to numeric data. Therefore, once the ML accurately interprets the context, it can decide what parts of the PII or sensitive data—such as postcodes, bank details, dates, or phone numbers—need to be removed, hidden, or obfuscated before it reaches the receiver.
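The final remove-hide-or-obfuscate step can be sketched with simple pattern matching. The example below is a rule-based stand-in for the ML-driven detection Sharma describes, not Synechron's implementation. The patterns are deliberately simplified assumptions (one phone format, one date format, UK-style postcodes) and would miss many real-world cases, especially the context-dependent ones discussed above.

```python
import re

# Simplified, illustrative patterns (assumptions); real PII detection needs far broader coverage.
PII_PATTERNS = {
    "PHONE": re.compile(r"\+?\d[\d\s-]{8,}\d"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "UK_POSTCODE": re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b"),
}


def redact(text):
    """Replace each match with a bracketed label so the receiver never sees the raw value."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


message = "Call me on +44 20 7946 0958 before 12/03/2021; post it to SW1A 1AA."
print(redact(message))
```

Replacing values with typed labels, rather than deleting them outright, keeps the message readable for the receiver while hiding the sensitive details.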
Copyright Infopro Digital Limited. All rights reserved.