Mad Scientists Creating Black Swans in a Lab


Black swan events are every capacity manager's nightmare. You can build out your system to accommodate maximum data usage, and for 299 straight days that system will work perfectly. Then comes that damned black swan, in the form of a sudden spike in usage, and the system gets overwhelmed.

At a panel on capacity at the North American Trading Architecture Summit in New York last week, part of the discussion was on the different approaches to handling the spike days in an industry where message volumes are rising across the board.

"It's really not sexy; you just put in this buffer," says Howard Halberstein, lead solutions architect on Deutsche Bank's Unix business. "It's all you can do because you're not an on-demand shop. You can't immediately spin up storage, you can't spin up capacity, as if you're Netflix. ... [Netflix is] a complete public cloud dream; we're constantly burning 64, 128 CPUs on systems that are waiting for a spike."

One way to better quantify the size of a spike, and how much storage it will consume, is to test for it in advance.

"Often simulating black swan events or spike days is quite hard to do because the nature of those kind of spikes is not a regular set of data over a period of time," says Patrick Myles, CTO at UK-based tech vendor Caplin. "It's in very short time slices. Even getting environments that simulate production time, data or infrastructure can be a challenge in labs. One of the big challenges in scaling these kind of web applications is not pushing all the data out to the end users.

"In our experience, that gives a natural buffer point, which is, you don't have to worry about external bandwidth, and the front end capability in these spikes," Myles continues. "We're really focusing on the ability of the servers and the integration to handle these back-end loads and then disseminate just the right amount, prioritize the data, prioritize the trading. A lot of the lab situations are around making sure that core capability is there and taking away the problem of, OK, there's a spike and suddenly your bandwidth is 10 times what you've got commissioned from your connectivity provider."

Halberstein supports lab testing not necessarily for black swan events, but as a "sanity check" to understand exactly what an increase in order rates, cancel rates, or fills will do to your system.
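As a rough illustration of that kind of sanity check, here is a minimal sketch in Python. It is not anything the panelists described: the function names, the fixed per-slice capacity, and the figures are all hypothetical. It multiplies a baseline message rate for a few short time slices and reports the peak backlog a system would build up.

```python
def with_spike(baseline, length, start, duration, multiplier):
    """Build per-slice message counts: flat baseline with one short spike.

    All parameters are hypothetical test inputs, not measured values.
    """
    return [
        baseline * multiplier if start <= i < start + duration else baseline
        for i in range(length)
    ]


def peak_backlog(arrivals, capacity):
    """Return the largest backlog (in messages) seen across all slices.

    Assumes the system drains at most `capacity` messages per slice and
    queues anything it cannot process immediately.
    """
    backlog = 0
    peak = 0
    for count in arrivals:
        backlog = max(0, backlog + count - capacity)
        peak = max(peak, backlog)
    return peak


if __name__ == "__main__":
    # Hypothetical numbers: 1,000 messages per slice normally,
    # a 4x spike lasting two slices, and capacity of 1,500 per slice.
    arrivals = with_spike(baseline=1000, length=10, start=3, duration=2, multiplier=4)
    print(peak_backlog(arrivals, capacity=1500))
```

Raising the capacity figure until the peak backlog reaches zero gives a crude estimate of how much idle buffer a given spike would require.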

One problem, argues Arsalan Shahid, program director of the Financial Information Forum, is that lab testing only goes so far without broad industry participation. Firms can get in a lab and pump up their own order volume, but without multiple participants a simulation is only worth so much.

"There's such a multitude of scenarios out there that make it difficult to be proactive," adds Peter Mager, CTO of Davidson Kempner Capital Management.

The Bottom Line
The Boy Scouts have it right: be prepared. It may seem like a waste to have all those servers sitting around doing nothing for 299 days, but on that 300th day, it's better to have them and not need them than need them and not have them.

Black swan events do happen when it comes to messaging spikes. All it takes is one bad day where capacity can't meet demand and the whole system looks like a failure. Customer trust is harder to earn back than the dollars wasted on buffer capacity.
