As data volumes explode, expect more outages

Waters Wrap: At least for those unprepared—though preparation is no easy task—says Anthony.

On Friday, August 2nd, CJC’s Steve Moreton logged onto his company’s MosaicOA analytics platform to see how the market was responding to a massive sell-off.

“It was unprecedented,” he told me last week. “I just said, ‘Wow, I have never seen a number that high—ever.’”

The number he was talking about was the market data update rate. And “unprecedented” spikes tend to lead to system outages.

By the following Monday, with the sell-off still in full swing, multiple online brokerages—including Charles Schwab, Fidelity, and Vanguard—suffered outages and, as a result, investors were unable to trade during a tumultuous period. The New York Times, sourcing the website Downdetector, wrote that “more than 15,000 Schwab clients, as well as 3,700 users of Fidelity and 2,800 of Vanguard, had complained about access issues by midmorning” on the 5th.

On social media site X (formerly known as Twitter…ugh), Schwab said that “a technical issue” was to blame. Schwab would later tell Fast Company that “a combination of higher volumes and a technical issue with a key vendor” led to its outage. As best I can tell, everyone else has been relatively mum as to the reasons for their outages. If higher volumes were a factor for Schwab, I think it’s reasonable to deduce that they at least played a role in those other outages, whether in internal systems or third-party systems.

Automation and innovation—such as AI and APIs—have led to the generation of more market data than ever. It’s an inexorable climb that will not stop any time soon, if ever. As the amount of data flooding the market rises, it will continue to put pressure on trading systems and other platforms and infrastructure technologies.

“Market data update rates naturally double every two years,” said Moreton, who is global head of product management at CJC, which provides an assortment of services, including market data observability and monitoring.

He continued: “Even then, though, you could be having a nice, stable week, and suddenly you have an unforeseen market event that leads to a frenzy of trading. All of a sudden, those market data rates—which double every two years—well, now you’re two years into the future, today.”
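To put rough numbers on that rule of thumb: if update rates double every two years, a rate r grows to r × 2^(t/2) after t years, so a spike of k times the normal rate is the equivalent of jumping 2 × log2(k) years ahead. Below is a minimal Python sketch of that arithmetic. It assumes smooth exponential growth, which real markets will not follow exactly, and the function names and the baseline of 1,000,000 updates per second are hypothetical, chosen only for illustration.

```python
import math

DOUBLING_PERIOD_YEARS = 2.0  # the "double every two years" rule of thumb quoted above


def projected_rate(current_rate: float, years_ahead: float) -> float:
    """Rate after `years_ahead` years, assuming steady doubling."""
    return current_rate * 2 ** (years_ahead / DOUBLING_PERIOD_YEARS)


def years_pulled_forward(spike_multiple: float) -> float:
    """How far 'into the future' a spike of the given multiple lands you."""
    return DOUBLING_PERIOD_YEARS * math.log2(spike_multiple)


baseline = 1_000_000  # hypothetical updates per second
print(projected_rate(baseline, 2))  # 2000000.0
print(years_pulled_forward(2))      # 2.0
print(years_pulled_forward(4))      # 4.0
```

On these assumptions, a day that runs at double the usual rate is exactly the “two years into the future, today” scenario, and a fourfold spike is four years ahead.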

For any company, it’s a double-edged sword. You can spend tens (if not hundreds) of millions of dollars upgrading systems to handle extreme spikes of activity and data. But these spikes are, by definition, rare. So maybe you can take a gamble? You could invest in something that will be considered legacy and outdated five years from now. Some will point to cloud and its elasticity as the silver bullet, but there are hidden cost concerns, and even the cloud providers are not immune to outages, which regulators and others worry could lead to systemic risk.

Whether because of rising data volumes, a pandemic, a cyber-attack, or a bug accidentally released during a run-of-the-mill software update, outages are increasingly roiling the markets. These outages are, in part, the reason the European Union passed the Digital Operational Resilience Act—a.k.a. Dora—which goes into effect in January. The sweeping regulation includes protocols for resilience testing and specifications for EU authorities to oversee vendors deemed systemically critical to the financial industry.

Most notably for this discussion, Article 9 of the regulation addresses capacity and performance management: “As part of the ICT [information and communication technology] security policies, procedures, protocols, and tools referred to in Article 9(2) of Regulation (EU) 2022/2554, financial entities shall develop, document, and implement capacity and performance management procedures for the following: the identification of capacity requirements of their ICT systems; the application of resource optimization; the monitoring procedures for maintaining and improving: the availability of data and ICT systems; the efficiency of ICT systems; the prevention of ICT capacity shortages.”

Dora is a complex and much-maligned overhaul, but other regulators around the globe will likely draw upon it as they figure out their own philosophies on tech resiliency and oversight.

Unprecedented times (again)

In the US, a highly unusual presidential election will be held in November. Lingering recession concerns contributed to this month’s sell-off, and those won’t likely go away anytime soon. And, of course, there are always unforeseen events (Covid or presidential assassination attempts, for example) that lead to confusion and volatility in the markets.

When it comes to market data update rates, the recent sell-off may be “unprecedented” today, but that adjective will only fit until the next major market event unfolds. And as I wrote in my previous column, outages are an everyday concern for anyone working in IT. While there are certainly lessons to be learned from the CrowdStrike fiasco, some in the industry are skeptical that firms will heed those lessons. But there were teachable moments.

For Moreton, there are several things that technologists should keep in mind as market data rates keep soaring. First, even if you bought or built a piece of infrastructure technology only 18 months ago, you always have to question if it has become “legacy” because the pace of technological evolution is also rapidly increasing.

Second, observability is vital. While CJC has a horse in this race, I think it’s fair to say that firms cannot afford to set it and forget it. The amount of data and type of data flowing through your systems is constantly changing and growing, often under the radar. You need to know where you are today versus six months ago, and have a good idea of where you will be six months from now.

And finally, capacity issues can be built into contracts. If you rely on third-party providers (and, if you are reading this, I have to imagine you do), you have to monitor those agreements closely, or else you risk hitting a contractually stipulated ceiling that can cause a service disruption or a hefty, unexpected fee.
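The second and third points can be combined into a back-of-the-envelope check: take the rate you observed six months ago and the rate you see today, and estimate how long until you reach a capacity or contractual ceiling. The Python sketch below is one way to do that, not a description of any vendor’s tooling. It assumes the last six months’ growth simply continues, and the function name and the example figures are hypothetical.

```python
import math


def months_until_ceiling(rate_six_months_ago: float,
                         rate_today: float,
                         ceiling: float) -> float:
    """Months until `ceiling` is reached if the last six months' growth continues.

    Returns 0.0 if the rate is already at or above the ceiling, and
    math.inf if the rate has been flat or falling.
    """
    if rate_six_months_ago <= 0 or rate_today <= 0:
        raise ValueError("rates must be positive")
    if rate_today >= ceiling:
        return 0.0
    if rate_today <= rate_six_months_ago:
        return math.inf
    # Compound monthly growth factor implied by the two observations.
    monthly_growth = (rate_today / rate_six_months_ago) ** (1 / 6)
    return math.log(ceiling / rate_today) / math.log(monthly_growth)


# Hypothetical figures: 800,000 updates/sec six months ago, 1,000,000 today,
# and a ceiling of 1,500,000 updates/sec.
print(round(months_until_ceiling(800_000, 1_000_000, 1_500_000), 1))  # 10.9
```

A projection like this says nothing about sudden spikes, which is why it complements continuous monitoring rather than replacing it.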

I’m going to finish this column with a quote from the chief information officer at a large global bank—not because I’m lazy (at least not right now), but because I think it should be seared into the minds of every data professional.

“We have become complacent—this [CrowdStrike outage] is a wake-up call. As technologists in a digitally interlinked world, we must remember our responsibility to manage operational risk. As Spider-Man taught us, with great power comes great responsibility.”

Think I’m missing something? I’d love to hear from you: anthony.malakian@infopro-digital.com.

The image accompanying this column is “The North Cape by Moonlight” by Peder Balke, courtesy of The Met’s open-access program.
