IBM report finds ‘shadow’ data significant contributor to data breaches

As AI and cloud take on greater importance in the capital markets, firms need to consider their threat impact zones.

A new report released by IBM finds that as the average cost of a data breach has increased, a significant portion of that spike can be attributed to breaches involving shadow data.

Shadow data comprises disparate data sources that reside across different environments and aren’t controlled by a single entity. The report looked at data breaches that occurred across multiple industry sectors, including financial services, between March 2023 and February 2024. It found that the average cost of a data breach increased 10% over the previous year, to $4.88 million from $4.45 million. Of the breaches researched (see box), 35% involved shadow data, and those breaches were 16% more expensive because they took longer to identify and contain.

“When we see the proliferation of AI systems—and we’re building in more large language models across different AI platforms—the whole notion of this data sprawl has exploded,” Stephen Coraggio, senior partner of financial services security for IBM Consulting, tells WatersTechnology. “It’s been a massive challenge for organizations because different business units are deploying their own LLMs across different AI models, and a lot of that sits outside of the CISO’s purview.”

From the report

  • The research was conducted by the Ponemon Institute. It was sponsored, analyzed, and published by IBM. This is the 19th iteration of the survey.
  • It looked at 604 organizations impacted by data breaches between March 2023 and February 2024.
  • Breaches observed ranged from 2,100 to 113,000 compromised records.
  • 3,556 security and C-suite executives were interviewed across 17 industries, including finance, healthcare, industrial, technology, and energy.

Another issue is that standard data storage methods are changing, ultimately for the better, but the shift creates new considerations. Nearly every large bank, asset manager, exchange, or vendor uses a combination of public cloud storage, private cloud, and in-house databases. According to the report, this can lead to data being “invisible” to IT, as employees share data through unauthorized applications or upload it to “unofficial cloud buckets.”

Coraggio says that firms need a solid structure for managing data, including well-defined methodologies and frameworks to prioritize and protect it and prevent sprawl. This means having structures in place to identify and articulate who has access to which datasets, how that data is being leveraged for “business contextual purposes,” and how the data is being structured, bundled, managed, and operated.
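
Coraggio doesn’t point to specific tooling, but the kind of inventory he describes can be pictured as a registry keyed by dataset, recording owner, criticality, business purpose, and authorized groups. The Python sketch below is purely illustrative; the class, field names, and example dataset are assumptions, not anything from IBM’s report.

```python
# Hypothetical sketch only: the class, fields, and dataset names are
# illustrative and not drawn from IBM's report or any specific product.
from dataclasses import dataclass, field

@dataclass
class DatasetRecord:
    name: str
    owner: str              # accountable business unit
    criticality: str        # e.g. "critical", "high", "medium", "low"
    business_purpose: str   # the "business contextual purpose" for holding the data
    authorized_groups: set = field(default_factory=set)

    def grant(self, group: str) -> None:
        self.authorized_groups.add(group)

    def can_access(self, group: str) -> bool:
        return group in self.authorized_groups

# An inventory like this makes "who has access to what, and why" answerable;
# data held outside it is, by definition, shadow data.
inventory = {
    "trade_history": DatasetRecord(
        name="trade_history",
        owner="equities-desk",
        criticality="critical",
        business_purpose="post-trade analytics and surveillance",
        authorized_groups={"surveillance", "risk"},
    ),
}

assert inventory["trade_history"].can_access("risk")
assert not inventory["trade_history"].can_access("marketing")
```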

“There’s no question that data has been the biggest challenge of our clients in terms of balancing the risk of growth with the requirement to manage security. It’s impossible to protect all your data in an environment, so taking a risk-based approach to data and saying, ‘Where is our most critical data? Where does it lie? How would we put the most protective controls around that data?’” Coraggio says. “Then lower your requirements as you move down the criticality of that data is the way that clients are handling it. You can’t encrypt everything. You can’t tokenize everything in an environment. Prioritizing that based on criticality and impact of that data to the organization is exactly what top organizations are doing.”
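
One way to picture that tiering is a simple mapping from criticality to controls, with the strictest (and costliest) protections reserved for the top tier. The tiers, control names, and fallback rule below are hypothetical, a sketch of the approach rather than any firm’s actual policy.

```python
# Hypothetical control tiers: the strongest (and costliest) protections at
# the top, lighter ones further down. None of these names come from the report.
CONTROLS_BY_TIER = {
    "critical": ["encrypt_at_rest", "tokenize", "continuous_access_review"],
    "high":     ["encrypt_at_rest", "quarterly_access_review"],
    "medium":   ["access_logging"],
    "low":      ["baseline_hygiene"],
}

def controls_for(criticality: str) -> list:
    # Design choice (an assumption, not from the report): unclassified data
    # defaults to the strictest tier, since unclassified data is exactly
    # where shadow-data risk tends to accumulate.
    return CONTROLS_BY_TIER.get(criticality, CONTROLS_BY_TIER["critical"])

print(controls_for("medium"))   # ['access_logging']
print(controls_for("unknown"))  # falls back to the strictest tier
```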

Fernando Montenegro, principal analyst for cybersecurity infrastructure at Omdia, says that his firm’s own annual survey points in a similar direction to IBM’s. The majority of respondents said they felt data stored in cloud environments was “outside of [their] control.” That was especially true for companies with 10,000 or more employees.

“The advice I’d give is that it’s critical for organizations to ‘breach—pun intended—the divide’ in terms of clearly aligning IT efforts to business criticality and sensitivity in terms of data used, created, or shared,” he says.

White hat AI?

While AI and LLMs can introduce new complexities—both in terms of data sprawl and defending against ever-more-sophisticated bad actors—they’re also valuable defense mechanisms.

“In financial services specifically, organizations that are using AI and automation extensively—so there’s a caveat there—can identify and contain breaches 100 days faster than organizations that don’t leverage those technologies,” Coraggio says. “To me, that’s an extremely powerful statement.”

Though the point may seem obvious at face value, properly deployed AI can detect anomalies and respond to breaches far faster than manual processes. In theory, AI models can be used to identify a specific threat and create a ticket, says Coraggio. Depending on the use case, the AI then automatically deploys a patch, a remediation, or a “contain” model, which either prevents the attack or blocks it from spreading at the endpoint, server, or some other environment, Coraggio says.
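
As a rough illustration of that detect-ticket-respond loop, here is a minimal sketch. The threshold, event fields, and the two stub functions are stand-ins for whatever detection models and response tooling a firm actually runs; none of it describes IBM’s products.

```python
import uuid

def open_ticket(event: dict) -> str:
    """Stand-in for a real ticketing integration (e.g. an ITSM API)."""
    return f"TICKET-{uuid.uuid4().hex[:8]}"

def run_playbook(action: str, event: dict, ticket_id: str) -> None:
    """Stand-in for a SOAR playbook runner; a real system would act here."""
    print(f"{ticket_id}: {action} on {event.get('host', 'unknown-host')}")

def handle_alert(event: dict) -> str:
    score = event.get("anomaly_score", 0.0)  # produced upstream by an ML model
    if score < 0.8:                          # illustrative threshold
        return "no_action"

    ticket_id = open_ticket(event)

    # Pick a response by use case: patch a known vulnerability, remediate a
    # misconfiguration, or contain the host to stop the attack spreading.
    if event.get("known_cve"):
        action = "deploy_patch"
    elif event.get("misconfiguration"):
        action = "remediate"
    else:
        action = "contain_endpoint"

    run_playbook(action, event, ticket_id)
    return action

# Example: an anomalous login on a host with no known CVE gets contained.
print(handle_alert({"anomaly_score": 0.93, "host": "srv-042"}))
```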

The report states that the return on investment is tangible. Organizations not using AI and automation reported average breach costs of $5.72 million, while those making “extensive” use of the technologies (a term the report doesn’t define) saw average breach costs of $3.84 million, a difference of $1.88 million, or roughly a third.

If AI provides such clear benefits, why aren’t more firms “extensive” users of it? Well, Coraggio points to cost, both in terms of acquiring talent and building systems internally. While it should be noted that IBM sells cybersecurity tools and managed services, the report does state that the number of organizations facing a “critical lack of skilled security workers rose dramatically to 53% in 2024 compared to 42% last year.”

Omdia’s Montenegro notes, though, that he’s seeing firms take an intentionally cautious approach to AI. From Bayesian models for spam filtering to machine learning for anti-fraud and anomaly detection, AI has long been part of cyber defense in finance.

Advances in deep-learning models and generative AI are reshaping that long history, but they also raise fresh concerns around cost of entry and potential regulatory oversight.

“There’s some promising work about enhancing the performance of security operations teams,” Montenegro says, “but the reality is that implementations are always more complex than demos, and teams need time to find how to make it work for them.” 
