IBM report finds ‘shadow’ data significant contributor to data breaches
As AI and cloud take on greater importance in the capital markets, firms need to consider their threat impact zones.
A new report released by IBM finds that as the average cost of data breaches has increased, a significant portion of that spike can be attributed to the risk of shadow data.
Shadow data consists of disparate data sources spread across different environments that no single entity or system controls. The report looked at data breaches that occurred across multiple industry sectors, including financial services, between March 2023 and February 2024. It found that the average cost of a data breach rose 10% over the previous year (to $4.88 million from $4.45 million). In 35% of the cases researched (see box), the breach involved shadow data, and those breaches were 16% more expensive because they took longer to identify and contain.
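As a quick check on the headline figure, the year-over-year change can be recomputed from the two averages quoted above. This is only an illustrative sketch: the function name `percent_change` is an assumption for the example and does not come from the report.

```python
# Average breach costs cited in the article, in millions of US dollars.
previous_cost = 4.45
current_cost = 4.88


def percent_change(old: float, new: float) -> float:
    """Return the relative change from old to new, as a percentage."""
    return (new - old) / old * 100


increase = percent_change(previous_cost, current_cost)
print(f"Year-over-year increase: {increase:.1f}%")  # prints 9.7%
```

The result is about 9.7%, which is consistent with the roughly 10% increase cited.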
“When we see the proliferation of AI systems—and we’re building in more large language models across different AI platforms—the whole notion of this data sprawl has exploded,” Stephen Coraggio, senior partner of financial services security for IBM Consulting, tells WatersTechnology. “It’s been a massive challenge for organizations because different business units are deploying their own LLMs across different AI models, and a lot of that sits outside of the CISO’s purview.”
From the report
- The research was conducted by the Ponemon Institute. It was sponsored, analyzed, and published by IBM. This is the 19th iteration of the survey.
- It looked at 604 organizations impacted by data breaches between March 2023 and February 2024.
- Breaches observed ranged from 2,100 to 113,000 compromised records.
- 3,556 security and C-suite executives were interviewed across 17 industries, including finance, healthcare, industrial, technology, and energy.
Another issue is that standard data storage methods are changing. The shift is ultimately for the better, but it creates new considerations. Nearly every large bank, asset manager, exchange, or vendor uses a combination of public cloud data storage, private cloud, and in-house databases. According to the report, this can lead to data being “invisible” to IT, as employees share data through unauthorized applications or upload data to “unofficial cloud buckets.”
Coraggio says that firms need a solid structure for managing data, including well-defined methodologies and frameworks to prioritize and protect data and to prevent sprawl. This means having structures in place to identify and articulate who has access to which datasets, how that data is being leveraged for “business contextual purposes,” and how the data is being structured, bundled, managed, and operated.
“There’s no question that data has been the biggest challenge of our clients in terms of balancing the risk of growth with the requirement to manage security. It’s impossible to protect all your data in an environment, so taking a risk-based approach to data and saying, ‘Where is our most critical data? Where does it lie? How would we put the most protective controls around that data?’” Coraggio says. “Then lower your requirements as you move down the criticality of that data is the way that clients are handling it. You can’t encrypt everything. You can’t tokenize everything in an environment. Prioritizing that based on criticality and impact of that data to the organization is exactly what top organizations are doing.”
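The risk-based approach Coraggio describes could be sketched roughly as follows. Everything here is hypothetical and for illustration only: the dataset names, the 1-to-5 scoring scale, the tier thresholds, and the control lists are assumptions, not IBM's methodology or anything taken from the report.

```python
from dataclasses import dataclass

# Hypothetical control tiers, strongest first; not taken from the IBM report.
CONTROLS_BY_TIER = {
    "critical": ["encryption", "tokenization", "access review"],
    "important": ["encryption", "access review"],
    "routine": ["access review"],
}


@dataclass
class Dataset:
    name: str
    criticality: int      # 1 (low) to 5 (high), an assumed scoring scale
    business_impact: int  # 1 (low) to 5 (high), an assumed scoring scale


def risk_score(dataset: Dataset) -> int:
    """Combine the two assumed scores into a single ranking value."""
    return dataset.criticality * dataset.business_impact


def tier_for(dataset: Dataset) -> str:
    """Map a dataset to a control tier using assumed thresholds."""
    score = risk_score(dataset)
    if score >= 16:
        return "critical"
    if score >= 6:
        return "important"
    return "routine"


def plan_controls(datasets):
    """Rank datasets by risk, highest first, and attach each tier's controls."""
    ranked = sorted(datasets, key=risk_score, reverse=True)
    return [(d.name, tier_for(d), CONTROLS_BY_TIER[tier_for(d)]) for d in ranked]


# Hypothetical inventory for illustration.
inventory = [
    Dataset("marketing_newsletter_list", 1, 2),
    Dataset("client_account_records", 5, 5),
    Dataset("internal_research_notes", 3, 3),
]

for name, tier, controls in plan_controls(inventory):
    print(f"{name}: {tier} -> {', '.join(controls)}")
```

The point of the sketch is only that the heaviest controls go to the highest-scoring data, with requirements relaxing as criticality drops, rather than encrypting or tokenizing everything.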
Fernando Montenegro, principal analyst for cybersecurity infrastructure at Omdia, says that the company’s own annual survey points in a similar direction to IBM’s. The majority of respondents said that they felt data stored in cloud environments was “outside of [their] control.” This was especially true for companies with 10,000 or more employees.
“The advice I’d give is that it’s critical for organizations to ‘breach—pun intended—the divide’ in terms of clearly aligning IT efforts to business criticality and sensitivity in terms of data used, created, or shared,” he says.
White hat AI?
While AI and LLMs can introduce new complexities—both in terms of data sprawl and defending against ever-more-sophisticated bad actors—they’re also valuable defense mechanisms.
“In financial services specifically, organizations that are using AI and automation extensively—so there’s a caveat there—can identify and contain breaches 100 days faster than organizations that don’t leverage those technologies,” Coraggio says. “To me, that’s an extremely powerful statement.”
It may seem obvious, but properly deployed AI can detect anomalies and respond to breaches far faster than manual processes. In theory, AI models can be used to identify a specific threat and create a ticket, says Coraggio. Depending on the use case, the AI then automatically deploys a patch, a remediation, or a “contain” model. This will either prevent the attack or block it from spreading, whether at the endpoint, the server, or some other environment, Coraggio says.
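The flow Coraggio outlines (identify a threat, create a ticket, pick a response) might look something like the sketch below. It is a toy illustration under stated assumptions: the threat categories, the `RESPONSE_BY_THREAT` mapping, and the fallback to a human analyst are all hypothetical, and no real security product's API is being modeled.

```python
from dataclasses import dataclass
from itertools import count

# Simple in-memory ticket counter, for illustration only.
_ticket_ids = count(1)

# Hypothetical mapping from threat type to automated response.
RESPONSE_BY_THREAT = {
    "unpatched_vulnerability": "patch",
    "misconfiguration": "remediate",
    "active_intrusion": "contain",
}


@dataclass
class Ticket:
    ticket_id: int
    threat: str
    location: str  # e.g. "endpoint" or "server"
    action: str


def open_ticket(threat: str, location: str) -> Ticket:
    """Create a ticket and choose a response; unknown threats go to a human."""
    action = RESPONSE_BY_THREAT.get(threat, "escalate_to_analyst")
    return Ticket(next(_ticket_ids), threat, location, action)


ticket = open_ticket("active_intrusion", "endpoint")
print(f"Ticket {ticket.ticket_id}: {ticket.action} at {ticket.location}")
```

In this sketch, anything the mapping does not recognize is escalated rather than handled automatically, which is one cautious design choice among several possible ones.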
The report states that the return on investment is tangible. Organizations not using AI and automation reported average breach costs of $5.72 million, while those making “extensive”—which the report doesn’t define—use of AI and automation saw breach costs of $3.84 million.
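To put the two averages side by side, the gap can be worked out directly from the figures quoted above. This is a simple arithmetic sketch; the variable names are assumptions for the example, and the comparison says nothing about what any individual firm would save.

```python
# Average breach costs cited in the article, in millions of US dollars.
cost_without_ai = 5.72
cost_with_extensive_ai = 3.84

savings = cost_without_ai - cost_with_extensive_ai
savings_pct = savings / cost_without_ai * 100

print(f"Difference: ${savings:.2f} million ({savings_pct:.1f}% lower)")
# prints: Difference: $1.88 million (32.9% lower)
```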
If AI provides such clear benefits, why aren’t more firms “extensive” users of AI? Coraggio points to cost, both in terms of talent acquisition and building systems internally. While it should be noted that IBM is a provider of cybersecurity tools and a managed services provider, the report does state that the share of organizations facing a “critical lack of skilled security workers rose dramatically to 53% in 2024 compared to 42% last year.”
Omdia’s Montenegro notes, though, that he’s seeing firms take an intentionally cautious approach to AI. Whether using Bayesian models for spam filtering or using machine learning for anti-fraud and anomaly detection, AI has been used in cyber defense for a long time in finance.
Though advancements in deep-learning models and generative AI are re-shaping this long history, they bring greater concerns around the cost of entry and potential regulatory oversight than before.
“There’s some promising work about enhancing the performance of security operations teams,” Montenegro says, “but the reality is that implementations are always more complex than demos, and teams need time to find how to make it work for them.”
Copyright Infopro Digital Limited. All rights reserved.