Deep XVAs and the promise of super-fast pricing

Machine learning models can value complex derivatives in minutes rather than hours

  • Quants say the Covid-19 crisis last year, when markets gyrated, highlighted the need to compute derivatives valuation adjustments (XVAs) in closer to real time.
  • Banks are now combining neural networks and modern mathematical techniques to try to speed up calculations.
  • Machine learning allows banks and tech vendors to approximate the outputs of existing pricing models, dispensing with the need to run millions of simulations every time a derivative is priced.
  • Some banks are sceptical about the amount of retraining neural networks might need to adjust to changing market conditions, while others believe current technologies are sufficient for calculating XVAs.
  • Banks with large trading books and complex products also face barriers to implementation.

Monte Carlo is synonymous with fast cars, fast money and—in financial circles—a painfully slow way of valuing derivatives.

Now, banks are turning to machine learning in a bid to give their pricing models a turbo-boost.

The technique involves training so-called deep neural networks to approximate the results of Monte Carlo models without having to run millions of simulations.

“The neural network approximates the price of your portfolio when you’re running gigantic, complicated XVA calculations. It’s way faster than doing brute force pricing,” says Mark Higgins, co-founder of quant platform Beacon.

At least three banks are known to be using neural networks to generate derivatives valuation adjustments (XVAs). Others are considering it. Technology vendors, too, are looking to integrate the new method into their products.

The appeal of faster pricing is clear: banks that can quote quickly while keeping risk under control are more likely to win trades than those that must run slow, computationally intensive simulations before quoting for business. The limitations of existing models were felt especially keenly during the Covid-19 crisis last year, when some systems struggled to keep up with the breakneck pace at which markets were moving.

“From an information system point of view, on post-trade there was no problem—we had all our sensitivities and valuation on time,” says an XVA head at a European bank. “But on pre-trade, not having real-time prices was an issue.”

In maths terms, derivatives pricing and hedging is a high-dimensional problem, with numerous input variables. When pricing a trade, banks must adjust the fair value of a derivative to account for counterparty credit, funding, and regulatory capital costs. Computing these valuation adjustments usually involves running a large number of Monte Carlo simulations to work out the paths a trade in a portfolio may take over its life. On top of this, banks must adjust for changes in risk factors such as the underlying’s price, interest rates, or volatility using a process known as ‘bump and revalue’.

The result is a mind-boggling number of calculations, usually taking several hours or even days. The process requires an array of CPUs, or central processing units, and must be repeated every time market conditions change materially.
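The cost of bump-and-revalue is easy to see in a toy example. The sketch below, with illustrative parameters only (none drawn from any bank's setup), prices a European call by Monte Carlo and then reruns the entire simulation with a bumped spot to extract a single delta; every additional risk factor means another full rerun.

```python
import math
import random

def mc_call_price(spot, strike, rate, vol, expiry, n_paths, seed=42):
    """Monte Carlo price of a European call under lognormal dynamics."""
    rng = random.Random(seed)  # same seed each call: common random numbers
    drift = (rate - 0.5 * vol ** 2) * expiry
    diffusion = vol * math.sqrt(expiry)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        terminal = spot * math.exp(drift + diffusion * z)
        total += max(terminal - strike, 0.0)
    return math.exp(-rate * expiry) * total / n_paths

# One sensitivity = one full extra simulation: bump, revalue, difference.
bump = 0.01
base = mc_call_price(100.0, 100.0, 0.02, 0.2, 1.0, 50_000)
bumped = mc_call_price(100.0 + bump, 100.0, 0.02, 0.2, 1.0, 50_000)
delta = (bumped - base) / bump
print(f"price ~ {base:.4f}, delta ~ {delta:.4f}")
```

Repeating this bump for every risk factor across a large portfolio is what makes the full calculation run to hours or days.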

Nowadays, most banks have ditched the bump-and-revalue method in favour of a mathematical technique called adjoint algorithmic differentiation (AAD), which computes all the sensitivities together in a single sweep, completing this step in a matter of minutes. This can be further accelerated with the use of GPUs, or graphics processing units, which allow for parallel processing.
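The adjoint idea can be illustrated with a toy reverse-mode tape, a bare-bones stand-in for a production AAD library: one forward pass records the operations, and one backward sweep delivers sensitivities to every input at once. The `Var` class and all parameters here are invented for illustration; here delta and vega fall out of a single sweep over one set of paths, rather than one full revaluation per bump.

```python
import math
import random

class Var:
    """A scalar that records the operations applied to it (the 'tape')."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (parent Var, local derivative)
        self.adjoint = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def exp(self):
        v = math.exp(self.value)
        return Var(v, [(self, v)])

    def positive_part(self):   # max(x, 0): the option payoff kink
        return Var(max(self.value, 0.0),
                   [(self, 1.0 if self.value > 0.0 else 0.0)])

def backward(output):
    """Propagate adjoints from the output back to every input.

    The naive sweep is exact here because each intermediate node feeds
    exactly one downstream node; real AAD tools handle general graphs.
    """
    output.adjoint = 1.0
    stack = [output]
    while stack:
        node = stack.pop()
        for parent, local in node.parents:
            parent.adjoint += local * node.adjoint
            stack.append(parent)

# Price a call on simulated paths, then read delta AND vega off one sweep.
rng = random.Random(1)
spot, vol = Var(100.0), Var(0.2)
strike, rate, expiry, n = 100.0, 0.02, 1.0, 2000
acc = Var(0.0)
for _ in range(n):
    z = rng.gauss(0.0, 1.0)
    growth = (vol * vol * (-0.5) + rate) * expiry + vol * (math.sqrt(expiry) * z)
    acc = acc + (spot * growth.exp() + (-strike)).positive_part()
price = acc * (math.exp(-rate * expiry) / n)
backward(price)
print(f"price ~ {price.value:.3f}, "
      f"delta ~ {spot.adjoint:.3f}, vega ~ {vol.adjoint:.3f}")
```

The backward sweep costs a small constant multiple of the forward pass, regardless of how many inputs there are, which is the source of AAD's speed-up over bumping each risk factor in turn.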

Deep neural networks approach the problem from a different angle. A neural network consists of a series of nodes that transform inputs according to adjustable weights, with a structure loosely designed to mimic the activity of neurons in the human brain. Deep neural networks have additional layers of nodes, enabling them to recognise more complex patterns in large datasets.

For derivatives pricing, the deep neural network is fed with the inputs and outputs of a bank’s existing pricing models and then trained to learn the relationships between them. Once the training is complete, the network should be able to approximate the outputs in real time, without the need to calculate each successive step along the way.
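As a rough sketch of that training loop, the toy example below fits a one-hidden-layer network, by plain gradient descent, to price samples from a closed-form Black-Scholes call standing in for the bank's existing pricer. Everything here, from the network size to the learning rate and data ranges, is an illustrative assumption, not any bank's setup.

```python
import math
import random

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(spot, strike=100.0, rate=0.02, vol=0.2, expiry=1.0):
    """Closed-form Black-Scholes call, standing in for the bank's pricer."""
    sd = vol * math.sqrt(expiry)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * expiry) / sd
    return spot * norm_cdf(d1) - strike * math.exp(-rate * expiry) * norm_cdf(d1 - sd)

# Training set from the existing model: scaled spot in, scaled price out.
data = [(s / 100.0 - 1.0, bs_call(s) / 100.0) for s in range(50, 151, 5)]

# One-hidden-layer network: y = sum_j v[j] * tanh(w[j] * x + b[j]) + c
rng = random.Random(0)
H = 8
w = [rng.uniform(-1.0, 1.0) for _ in range(H)]
b = [rng.uniform(-1.0, 1.0) for _ in range(H)]
v = [rng.uniform(-1.0, 1.0) for _ in range(H)]
c = 0.0

def predict(x):
    return sum(v[j] * math.tanh(w[j] * x + b[j]) for j in range(H)) + c

def mse():
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

initial_loss = mse()
lr, n = 0.05, len(data)
for _ in range(3000):                    # plain full-batch gradient descent
    gw, gb, gv, gc = [0.0] * H, [0.0] * H, [0.0] * H, 0.0
    for x, y in data:
        hs = [math.tanh(w[j] * x + b[j]) for j in range(H)]
        err = sum(v[j] * hs[j] for j in range(H)) + c - y
        gc += 2.0 * err
        for j in range(H):
            gv[j] += 2.0 * err * hs[j]
            d = 2.0 * err * v[j] * (1.0 - hs[j] ** 2)   # tanh' = 1 - tanh^2
            gw[j] += d * x
            gb[j] += d
    c -= lr * gc / n
    for j in range(H):
        v[j] -= lr * gv[j] / n
        w[j] -= lr * gw[j] / n
        b[j] -= lr * gb[j] / n
final_loss = mse()
print(f"training loss: {initial_loss:.5f} -> {final_loss:.5f}")
print(f"network price at spot 100 ~ {100.0 * predict(0.0):.2f} "
      f"(model price: {bs_call(100.0):.2f})")
```

Once trained, evaluating `predict` costs a handful of arithmetic operations, however long the original model took to generate the labels, which is the whole appeal of the surrogate approach.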

Antoine Savine, chief analyst in the quant research group at Danske Bank, says deep neural networks are especially well suited to replicating derivatives pricing functions, as they excel at high-dimensional problems. “They learn to meaningfully reduce dimension from data. Out of 500 variables, they will find the 15 most meaningful nonlinear features of the state,” he says.

Taking the plunge

Scotiabank is among the institutions already using the technology for XVAs. Last year, the Canadian bank began using a deep neural network developed by Riskfuel, a fintech startup, to approximate pricing models. Riskfuel took Scotiabank’s Monte Carlo models and trained its neural nets to replicate them. The outputs are then fed into AAD models to extract the sensitivities and valuation adjustments.

But the technique is not limited to replicating the work of the Monte Carlo simulations. Danske Bank has extended the use of deep neural networks to approximating the XVA sensitivities generated by the AAD models, too. By adding the sensitivities to the training set, the network learns from sensitivities and paths at the same time. This improves the informative power of the training set, without compromising calculation speeds. The new system, which was developed in-house, is already being used by traders and risk managers, and is expected to report official numbers to regulators later this year.

More broadly, the use of deep neural networks does not mean that the old-school Monte Carlo simulations or AAD models are rendered obsolete. Dealers will still need these models to churn out the training set for their neural networks. But the advantage lies in the time saved in making the pricing calculations.

Youssef Elouerkhaoui, global head of markets quantitative analysis at Citi, explains that deep neural networks remove the need to construct regressors for the present value functions produced by the Monte Carlo approach.

“You have a bunch of Monte Carlo simulations on which you’re computing the actual present value function. And then you do a regression of your deep learning approximation on those paths. The benefit is that it is generic in a sense, because it just learns the actual functional form of the present value function, as opposed to having to craft a very specific polynomial function format,” he says.
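The regression he describes can be sketched with the hand-crafted polynomial variant that a neural network would replace: each simulated path contributes one noisy (state, discounted payoff) sample, and a least-squares fit across all the paths recovers an approximation of the present-value function. All parameters below are illustrative assumptions.

```python
import math
import random

rng = random.Random(7)
strike, rate, vol, expiry = 100.0, 0.02, 0.2, 1.0
n_paths = 20_000

# Each path: a random initial state plus one noisy discounted payoff label.
samples = []
for _ in range(n_paths):
    spot0 = rng.uniform(80.0, 120.0)
    z = rng.gauss(0.0, 1.0)
    terminal = spot0 * math.exp((rate - 0.5 * vol ** 2) * expiry
                                + vol * math.sqrt(expiry) * z)
    payoff = math.exp(-rate * expiry) * max(terminal - strike, 0.0)
    samples.append((spot0 / 100.0, payoff))     # scaled state, label

# Least squares on the hand-crafted basis [1, u, u^2]: normal equations.
A = [[0.0] * 3 for _ in range(3)]
rhs = [0.0] * 3
for u, y in samples:
    phi = [1.0, u, u * u]
    for i in range(3):
        rhs[i] += phi[i] * y
        for j in range(3):
            A[i][j] += phi[i] * phi[j]

def solve3(A, b):
    """Gaussian elimination with partial pivoting for a 3x3 system."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= f * M[col][k]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))) / M[r][r]
    return x

coef = solve3(A, rhs)
pv_at_100 = coef[0] + coef[1] + coef[2]      # evaluate the fit at u = 1.0
print(f"regressed present value at spot 100 ~ {pv_at_100:.2f}")
```

The quadratic basis works here because the state is one-dimensional; in many dimensions, choosing the right polynomial terms becomes the craft problem a deep network sidesteps by learning the functional form directly.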

Yes, we are exploring some of those techniques, but at the same time we do value the flexibility and accuracy we get from more traditional approaches

Issam Lagbouri, JP Morgan

Elouerkhaoui would not confirm whether Citi is using deep neural networks for derivatives pricing. But he believes they will soon become a “standard” part of the XVA pricing toolkit, alongside AAD and GPUs. “If an institution started from scratch, you would want to implement all three,” he says, adding that such a system, built from the ground up, should be able to generate pre-trade valuations and sensitivities “on the fly”.

A director of portfolio analytics at a large European bank is experimenting with deep neural networks for XVA calculations. The first step has been to apply machine learning to calculate margin valuation adjustment (MVA), which accounts for the cost of funding initial margin requirements. The project is a work in progress. “We took the view that we should consolidate and redevelop XVA analytics in a modern way, so GPUs, AAD and machine learning are all taken into account,” says the director.

Jean Jacques Kamdem, global head of traded credit analytics at HSBC, says the growing complexity of XVA calculation is pushing the limits of existing technologies. When the pandemic roiled markets, the bank was forced to adjust the risk factors in its pricing models to “embed the dimensionality required for the sensitivity of a simple valuation”, he says. That led the bank to rethink the way it captures data dependencies in its models. “And when you think about that, you start thinking of a neural network,” says Kamdem. “It has led to more consideration of neural networks in general in the computation of XVAs.”

Fast, but not that fast

The idea of using neural nets to speed up derivatives pricing was met with scepticism in some quarters at first.

When Microsoft claimed in a blog post last year that Riskfuel’s deep neural networks, running on its Azure cloud platform, could perform derivatives valuations 20 million times faster than existing models, several leading quants rubbished the claims in a LinkedIn discussion.

Jesper Andreasen, head of quant research at Saxo Bank and former head of quant research at Danske Bank, scoffed that the idea only made sense if you had “more computers than brain cells”. The length of time it would take to train a neural network to approximate the pricing functions of derivatives—six CPU years, by his estimation—made the whole thing impractical, he said.

Henrik Rasmussen, the former global head of quantitative analytics at Standard Chartered, also chimed in to say that if the speed-ups came at the expense of accuracy, the approach “wouldn’t fly on the trading floor”.

Ryan Ferguson, founder and CEO of Riskfuel, responded that its deep neural networks could be trained in a day on Microsoft’s cloud platform, which gives users on-demand access to 5,000 GPU cores, and that the results were as reliable as traditional Monte Carlo simulations.

The discussion was left hanging on LinkedIn. But two doubts remain the main barriers to adoption: the accuracy of the outputs generated by deep neural networks, and the time required to train them, and then retrain them as market conditions change.

Ferguson says deep neural networks can be trained overnight with the proper hardware and should not need frequent retraining. “If you’re cautious about your assumptions, then you will likely never need to retrain,” he says.

He adds that Riskfuel’s models proved resilient to the Covid-related disruption in markets last year because the company made conservative assumptions about the future level of interest rates, volatilities and other market conditions.

The volatility we saw [last year] definitely makes you think, if my risk calculation was taking four hours to run, I really would rather it takes four minutes

Jon Gregory, XVA expert

Not everyone is convinced that deep neural networks won’t require frequent retraining. The head quant at a bank that is considering licensing Riskfuel’s models to approximate the prices of derivatives says: “I think the premise is that you don’t need to retrain it every day, but I don’t think that means it is good for ever. It sounds far-fetched.”

Quants have found ways to reduce training times and costs. Most deep pricing implementations train models on examples of prices, which are expensive to generate. In the spirit of least-squares Monte Carlo, Danske’s quants train deep neural networks on sampled payoffs and their derivatives—or pathwise differentials—so an entire training set is produced for roughly the cost of a single Monte Carlo pricing.
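The economics of that idea can be sketched as follows: the same paths that would normally be averaged into a single price instead become a labelled training set, with each path supplying a payoff sample and, nearly for free, its pathwise derivative. All parameters are illustrative assumptions.

```python
import math
import random

rng = random.Random(3)
spot0, strike, rate, vol, expiry = 100.0, 100.0, 0.02, 0.2, 1.0
n = 50_000
disc = math.exp(-rate * expiry)

# One simulation, many labels: every path contributes a payoff sample and
# its pathwise derivative with respect to spot, at almost no extra cost.
dataset = []
for _ in range(n):
    z = rng.gauss(0.0, 1.0)
    terminal = spot0 * math.exp((rate - 0.5 * vol ** 2) * expiry
                                + vol * math.sqrt(expiry) * z)
    payoff = disc * max(terminal - strike, 0.0)
    pathwise_delta = disc * (terminal / spot0) if terminal > strike else 0.0
    dataset.append((spot0, payoff, pathwise_delta))

# Sanity check: the labels average to the usual Monte Carlo price and delta.
price = sum(p for _, p, _ in dataset) / n
delta = sum(d for _, _, d in dataset) / n
print(f"{n} training samples; label averages: "
      f"price ~ {price:.3f}, delta ~ {delta:.3f}")
```

A network trained on both the payoff and derivative labels sees far more information per simulated path than one trained on prices alone, without any extra pricing runs.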

“Our research has resolved some key problems in the application of machine learning to risk,” says Brian Huge, chief quant analyst at Danske Bank and colleague of Savine. Huge says quants from several banks have been in touch to find out more about the technique. However, this approach only works if the bank has a system in place that represents the cashflows of all transactions in a consistent manner, which is not always the case.

The training may also be particularly taxing for banks with large trading books comprising a variety of more complex products.

This view is borne out by JP Morgan, which has one of the largest trading books on the Street, with millions of trades priced daily across tens of thousands of counterparties. To date, the bank’s technology and quant group has employed machine learning in a limited capacity, to reduce the need for coding pricers and analytics to deal with new products, but not yet to speed up calculations.

Issam Lagbouri, global head of derivatives quant research for the credit portfolio group at JP Morgan, says this is because the models the bank uses to price derivatives are “constantly evolving”. That means a deep neural network would need regular retraining to stay current.

“These techniques can make pricing quicker, but they require a more stable regime where there’s no need to change the inputs to your model—the risk factors and various things that you need to price trades. Yes, we are exploring some of those techniques, but at the same time we do value the flexibility and accuracy we get from more traditional approaches,” Lagbouri says.

JP Morgan retrofitted AAD into its vast libraries of pricing models over the course of three years. Its XVA system uses large data centres running CPU and GPU cores to speed up pricing. This hardware currently sits in-house, but the bank is in the process of moving the majority to public cloud this year and next.

Graeme Keith, global head of credit portfolio group derivatives at JP Morgan, says the combination of AAD and cloud creates a cost-effective way for XVA traders “to do more things on demand, especially useful when dealing with large market moves like those we experienced in the Covid crisis.” He adds: “We can enable traders with a few button clicks to select their counterparties, their hedges, build a scenario and run risk on it.”

The head quant at a second large bank also believes current technologies are sufficient. The bank has implemented AAD and moved its processing to the cloud, but “hasn’t had any reason to explore other approaches”, says the quant. “What we do is already very performant, so we don’t have a pressing need for speed-ups.”

Network effect

Questions about training and accuracy aside, most quants are optimistic about the role that deep neural networks can play in valuing derivatives books.

Riskfuel’s Ferguson says that of the roughly two dozen banks he has spoken with, around half are already exploring machine learning for XVAs.

Jacques Du Toit, product manager at vendor Numerical Algorithms Group, predicts that most banks that have implemented AAD for pricing will end up combining that with machine learning.

The pairing also makes sense to Jon Gregory, an independent expert specialising in counterparty risk and XVA. He says that while AAD and machine learning don’t have to go together, most banks want to speed up pricing calculations and are prepared to “throw multiple tricks at the problem. Machine learning applied to pricing libraries, because that’s generally the bottleneck in an XVA calculation. And AAD for sensitivities because they represent a problem when done through bump and revalue. I think those two together could become a standard.”

But he acknowledges new implementations may be tricky for banks with relatively small resources or that have complex portfolios.

But what does it mean for banks that are left behind?

In seesawing markets, says Gregory, “they may have some serious issues, which is what was discovered last year. The volatility we saw definitely makes you think, if my risk calculation was taking four hours to run, I really would rather it takes four minutes.”

And even those who were initially sceptical are coming round to new ways of thinking. Saxo’s Andreasen is considering applying machine learning to pathwise differentials. Along the same lines as Danske Bank, he expects that a neural network can learn the structure of a portfolio’s value as a function of underlying risk factors, as well as model the co-movement of volatility surfaces.

For many dealers, the holy grail of real-time pricing of derivatives may just have moved a little closer.

What vendors are doing

Technology vendors are eager to ensure they don’t miss out on a potentially important development in the lucrative area of derivatives pricing. Bloomberg, FIS and IHS Markit are exploring the use of deep neural networks for XVA calculation.

The offerings are largely being marketed to lower-tier or regional banks, which are deemed less likely to develop their own machine learning solutions in-house.

Bloomberg is researching combinations of AAD, machine learning and the cloud for its XVA analytics products. Gerry Frewen, a product manager for XVA at Bloomberg, says the vendor is seeing demand from clients for better quality sensitivities, driven both by increased hedging of XVAs and by the forthcoming finalisation of Basel’s standardised approach to credit valuation adjustment (SA-CVA). The new rule, due to go live in January 2023, requires banks to hold capital against variability in the adjustment made to the fair value of derivatives to account for counterparty credit risk.

Frewen says that, unlike the current Basel III CVA capital framework, SA-CVA requires large numbers of CVA sensitivities to be calculated. Banks then face a difficult choice between the large cost of that computation and a more punitive capital treatment if they resort to the simpler, basic approach to CVA.

A second vendor, FIS, is modernising its XVA platform to incorporate AAD. Due for roll-out later this year, the offering aims to help clients actively manage XVA in the front office as well as calculate CVA capital.

Markus Seiser, division executive, cross-asset trading and risk at FIS, says that machine learning is the firm’s next focal point in research—to accelerate XVA pricing while maintaining standards of accuracy.

IHS Markit has also used neural networks to solve XVA pricing challenges over the past year—in its case for margin valuation adjustment—but has not implemented AAD.

Allan Cowan, who heads financial engineering at the firm, says pricing MVA is particularly challenging when using the Isda Simm approach because portfolio sensitivities must be projected into the future, with an order of magnitude increase in the number of valuations required. “Now you need your sensitivities per time-step and per Monte Carlo path when you’re trying to project initial margin,” he says. “We have trained neural networks to learn the pricing functions in order to get the sensitivities in a very efficient way.”

The result is an XVA engine that can price products 1,000 times faster than conventional bumping methods, Cowan says.
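The arithmetic behind a speed-up factor of that size can be sketched with back-of-envelope counts; the numbers below are illustrative assumptions, not IHS Markit's figures.

```python
# Illustrative counts only: none of these numbers come from any vendor.
paths = 10_000                 # Monte Carlo paths in the MVA simulation
time_steps = 50                # future dates at which margin is projected
sensitivities = 20             # Simm risk-factor sensitivities per valuation

plain_pricing = paths          # one valuation per path
mva_with_bumping = paths * time_steps * (1 + sensitivities)  # base + bumps
blow_up = mva_with_bumping // plain_pricing
print(f"{mva_with_bumping:,} valuations vs {plain_pricing:,}: {blow_up}x more")
```

With these assumed counts, bumping inside the margin projection multiplies the workload by roughly three orders of magnitude, which is the scale of speed-up a trained surrogate has to deliver.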

A fourth vendor, meanwhile, says XVA is too complex a calculation, and dependent on too many factors, to be handled by machine learning, although AAD can be helpful for XVA sensitivities calculations.

Dmitry Pugachevsky, head of research at Quantifi, says the firm has instead worked with chipmaker Intel to identify opportunities to improve XVA calculation performance through better data transfer mechanisms in its processors. Upgrades have reduced the input and output time required for data transfers from 23% to 8% of the overall total, Pugachevsky says.
