AllianceBernstein: fine-tuning shrinks gen AI ‘hallucinations’

Asset manager says its tweaks have improved the accuracy of its large language models.

AllianceBernstein, the investment firm with $646 billion of assets under management, says it has boosted the accuracy of its generative artificial intelligence models to 95%—far outstripping the 60–70% achievable by ChatGPT.

In the year since OpenAI launched ChatGPT, AB has been tinkering with the generative AI models it uses to reduce the risk of ‘hallucinations’, where the underlying technology creates outputs that are nonsensical or flat-out wrong.

Andrew Chin, AB’s chief risk officer and head of quantitative research, says the firm has deployed a range of techniques to improve its models. The asset manager uses OpenAI LLMs such as GPT-3.5, GPT-4 Turbo and DaVinci, as well as Meta’s Llama.

“When we start fine-tuning, we can easily get to 85% accuracy or so,” he says. “Then when we improve on some of the prompts we put in, we can get it to 95%.”

Generative AI is designed to produce content, such as images and text, that can be indistinguishable from content created by humans. AB is using the technology in areas where the cost of a mistake is low: processes such as sales training and the reading of documents in its operational flow, including prospectuses, legal reports and regulatory filings.


Following ChatGPT’s launch last November, AB began experimenting with AI in the writing of commentaries and market outlooks for clients. The most prominent risk in using AI for such tasks is the generation of incorrect statements. AB has attempted to mitigate that risk through a technique known as prompt engineering: by phrasing queries in a specific way, the models can be made to give answers that are more relevant to the context. AB says prompt engineering has boosted accuracy by 10–15%.

“We start all our prompts with, ‘pretend you’re a financial analyst, studying x, y, z’,” says Chin. “It sounds kind of obvious, but if you put that there it really focuses the type of response that the models have.”
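
For illustration, a minimal sketch of the kind of role-setting prompt Chin describes might look like the following, assuming the OpenAI Python client; the model name, question and temperature setting are placeholders rather than AB’s actual configuration.

```python
# Illustrative sketch only, not AB's actual code. Assumes the OpenAI Python
# client (v1) with an API key set in the environment; the model name and the
# question are placeholders.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",  # placeholder model name
    messages=[
        # The role-setting instruction Chin describes: it narrows the style and
        # domain of the response before the real question is asked.
        {"role": "system",
         "content": "Pretend you are a financial analyst studying fixed-income markets."},
        {"role": "user",
         "content": "Summarise the main risks discussed in the prospectus excerpt below.\n\n"
                    "<prospectus excerpt goes here>"},
    ],
    temperature=0,  # low temperature keeps the output close to deterministic
)

print(response.choices[0].message.content)
```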

Calling the shots

Another technique employed by AB is few-shot learning, which enables the AI models to make predictions based on a limited number of samples. The approach works by feeding a pre-trained language model with an additional, limited sample of new data from which it learns to produce similar text. The model hones its language so that it is suited to a specific task.
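
In practice, few-shot learning of this kind is often implemented as in-context examples rather than retraining: the human-written sample is passed to the model alongside the request. The sketch below assumes that approach and the OpenAI Python client; the example texts and model name are placeholders, not AB’s material.

```python
# Illustrative few-shot sketch, not AB's actual code. A previous human-written
# commentary is supplied as an in-context example so the model imitates its
# structure and tone. Texts and model name are placeholders.
from openai import OpenAI

client = OpenAI()

EXAMPLE_COMMENTARY = (
    "Example quarterly commentary (human-written):\n"
    "Equity markets rallied as inflation data softened, while credit spreads "
    "tightened and the yield curve flattened..."
)

NEW_QUARTER_NOTES = (
    "This quarter: inflation re-accelerated, the curve steepened and credit "
    "spreads widened modestly."
)

response = client.chat.completions.create(
    model="gpt-4-turbo",  # placeholder
    messages=[
        {"role": "system",
         "content": "Pretend you are a financial analyst writing a quarterly market commentary."},
        # The limited sample from which the model learns the desired format and tone.
        {"role": "user",
         "content": "Here is an example of a previous commentary:\n\n" + EXAMPLE_COMMENTARY},
        {"role": "user",
         "content": "Write a similar commentary based on these notes:\n\n" + NEW_QUARTER_NOTES},
    ],
)

print(response.choices[0].message.content)
```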

One of the common hurdles encountered by firms applying machine learning to tasks such as alpha generation or predicting recessions is that the technology performs best when it has many examples to learn from. This is challenging because the signal-to-noise ratio in financial markets is so low and much of the risk is driven by idiosyncratic factors. AI hedge fund Duality Group is among the companies to have developed machine learning models that, like humans, are able to learn broad ideas from small amounts of data.

AB originally set out to use machine learning on the investment side to achieve outperformance. However, over the last couple of years, the asset manager has focused its efforts on improving productivity on the operational side.

In the case of writing market commentaries, AB fed its generative AI models an example of a previous commentary written by humans.

“It learns from this example to write something that’s similar to what you’ve written,” Chin says. “Again, it helps it direct what type of response you’re looking for.”

Golden retrieval

A third technique that AB has applied is retrieval augmented generation. This involves retrieving data from the specific documents a model is meant to learn from and then directing the model to draw on this data, rather than its broader training, to generate responses. The approach can be compared to answering questions from a reference text rather than from memory. 

Imagine using an AI model to scan a 300-page document for mentions of inflation or rates. The first part of the process—the retrieval—would involve the model going through all the pages and pulling out instances where inflation is being discussed. The next step would be to feed this data back into the model and teach it to only generate responses from the information thus provided. 
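
A minimal sketch of that retrieve-then-constrain pattern is below, assuming the OpenAI Python client and a naive keyword scan in place of the embedding-based search a production system would typically use; the keywords, model name and helper names are illustrative.

```python
# Illustrative retrieval augmented generation sketch, not AB's pipeline.
# Retrieval here is a naive keyword scan over pre-split pages; production
# systems typically use embedding-based search over a vector index.
from openai import OpenAI

client = OpenAI()

def retrieve(pages: list[str], keywords=("inflation", "rates")) -> list[str]:
    """Step one: pull out the pages that mention the topics of interest."""
    return [page for page in pages if any(k in page.lower() for k in keywords)]

def answer_from_document(pages: list[str], question: str) -> str:
    """Step two: feed the retrieved excerpts back and constrain the answer to them."""
    context = "\n\n".join(retrieve(pages))
    response = client.chat.completions.create(
        model="gpt-4-turbo",  # placeholder
        messages=[
            {"role": "system",
             "content": ("Pretend you are a financial analyst. Answer using only the "
                         "excerpts provided. If the answer is not in the excerpts, say so.")},
            {"role": "user",
             "content": f"Excerpts:\n{context}\n\nQuestion: {question}"},
        ],
        temperature=0,
    )
    return response.choices[0].message.content

# Hypothetical usage, assuming the 300-page document has already been split into pages:
# print(answer_from_document(pages, "What does the document say about inflation?"))
```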

“That’s actually becoming a very popular method because it forces generative AI to answer using only the data available to it in the original document,” Chin says.

The AI use case generating the most excitement is the summarising of documents. On the investment side, the technology might be applied to reading memorandums that run to hundreds of pages and producing short summaries.
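
Documents of that length exceed a single model context window, so one common approach, assumed here purely for illustration rather than as a description of AB’s process, is to summarise the document in chunks and then summarise the summaries; the model name and instructions below are placeholders.

```python
# Illustrative chunk-then-combine summarisation sketch, not AB's method.
from openai import OpenAI

client = OpenAI()

def ask(text: str, instruction: str) -> str:
    """Send one instruction plus text to the model and return its reply."""
    response = client.chat.completions.create(
        model="gpt-4-turbo",  # placeholder
        messages=[
            {"role": "system", "content": "Pretend you are a financial analyst."},
            {"role": "user", "content": f"{instruction}\n\n{text}"},
        ],
        temperature=0,
    )
    return response.choices[0].message.content

def summarise_memorandum(chunks: list[str]) -> str:
    # First pass: summarise each section of the memorandum separately.
    partials = [ask(c, "Summarise this section in three bullet points.") for c in chunks]
    # Second pass: combine the section summaries into one short overview.
    return ask("\n\n".join(partials), "Combine these section summaries into a one-page summary.")
```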

Asset managers, Chin says, are trying to figure out how they can use AI on their own data.

“For us, the application will be on research reports, internal documents,” he adds. “It could also be on compliance.”

Editing by Daniel Blackburn
