Whose data is it, anyway?

The issue of data ownership may be obscure, but has important consequences for firms considering alternative data models, or firms looking to commercialize their in-house pricing or other resources. So ask yourself some serious questions: Who owns ‘your’ data? And why does it matter?

It sounds obvious, but do you know whether you own your data? When your firm sends a quote or an order to a broker or exchange, whose property is that quote or order, and what rights does it give them? What can they legally do with it (or not) and charge for it? Do you know? Are you 100% sure? And even if you know whether you own it or not, actually owning it isn’t a given.

Confused? You’re not alone. It’s an area that has in the past been rife with uncertainty and assumptions, but one where the ramifications have not been costly or disruptive enough to warrant spending the time or money required to establish watertight controls. However, as financial firms seek to monetize more of their internal data by offering it to buy-side clients, and establish less costly alternatives to exchange data feeds, uncertainty around data ownership could lead to more serious disputes.

“It depends very much on the legal controls you put around the data. Firms may grant access to their data, or they may have to hand it over as a condition of participating on an exchange. But simply because you can access the data doesn’t mean you can collect it, market it, and sell it with impunity. You have to work backwards to determine who has rights to every piece of quote and order data,” says Frank Desmond, managing director at data advisory firm FXD Data, and the former head of TP Icap Information, the broker’s data arm.

But even then, Desmond says, confusion frequently remains around ownership, which breeds conflict between participants. “Organizations can be very defensive about this because no one has 100% certainty about some of these issues,” he says.

While much of the ownership issue is driven by commercial factors—indeed, in many cases, practitioners find it hard to address the ownership and cost issues separately—there are other drivers. These include data privacy and firms’ attempts to ensure that competitors cannot reverse engineer the identity of the firm behind specific quotes and trades, which could allow those rivals to trade against them.

“Everyone is always trying to reverse engineer other people’s algorithms to get ahead of them,” so firms are increasingly placing greater value on protecting their data and their rights, says Kelvin To, founder and president of big data advisory firm Data Boiler Technologies.

Derek Lacarrubba, special counsel at law firm Schulte Roth & Zabel, who advises broker-dealers and hedge funds on regulatory issues, says this is common practice. “Funds already use multiple executing brokers to camouflage their activity. It’s standard practice to not give a single entity access to your whole order history. But you need a lot of scale to do that, so the opportunity to have multi-prime relationships is limited for smaller funds,” he says.

“We have heard some concerns expressed by buy-side firms about how their data is used by brokers, but it’s not widespread. The main concern we’ve seen is that anonymized data is not as anonymous as you may think—that is, that with certain assumptions, you can determine who is behind a trade and guess their strategies, then use that to game them,” Lacarrubba says.

‘Proof’ of ownership

It’s this concern, rather than any desire to establish ownership of data for cost or revenue reasons, that recently prompted startup agency broker-dealer Proof Trading to address the data ownership issue by explicitly stating in its contracts that clients own their data.

“One of our pilot clients asked us what would happen if Proof ever got acquired by one of the large trading firms like Virtu, and that firm would then own all the trading data of our client,” says Daniel Aisen, CEO of Proof. “They were worried that if someone can understand your positions and when you put on and take off a position, then they can detect the patterns in your trading. And once they see that pattern starting, they can pre-position themselves ahead of that to take advantage of it.”

Nervous traders might be equally concerned that one of Proof’s management might “go rogue” and abscond with data, then start a hedge fund to trade against its clients. Aisen stresses that Proof has no plans to sell or start a hedge fund. Nevertheless, it responded to the concerns by creating a policy that requires clients to explicitly opt in to any usage or analysis performed by Proof, allows them to delete their data—aside from records that Proof is required to keep in “cold storage” for regulatory compliance purposes—and promises that the broker won’t use their data to create commercial data products.

Proof analyzes trading activity and creates execution reports for clients. Under the new policy, clients need to opt in to continue receiving those reports, or can opt out if they don’t want their data used in reports.

“If clients opt out, we won’t be able to generate reports for them. We do think that we add a lot of value, and our hope is that most people will want us to analyze their data,” says Proof president Allison Bishop. “If everyone opted out, we would lose that ability. But we don’t think that’s very likely. We’re self-imposing a burden that will be a trade-off. But we think it’s a better position. It puts clients in a position to drive the value we can provide for them. It lets them choose.”

Aisen says the response has been “pretty modest,” as data ownership is seen as a “nice-to-have” compared to other priorities, but that he expects enthusiasm to grow as the issue gains recognition.

“We think this is an important issue. It’s not a hot topic yet, but we think it should be, and we want to be out in front of that. And we think regulators and others should be looking more closely at it,” he says.

While Proof’s initiative may not make a huge impact immediately because of the broker’s early-stage status, it may spark a greater appreciation of the issue overall, which is not widely understood.

Don’t assume the ‘obvious’

In fact, there were significant differences of opinion among the data experts interviewed for this article. Some assumed they owned their data, others assumed it belonged to someone else, and still others asserted ownership claims but could not point to exactly where that ownership is set out in black and white.

Perhaps one reason for the confusion is that data ownership terms are often buried in the contracts that trading firms sign to participate on exchanges, a process that may not involve data professionals who are well versed in data governance issues.

Suzanne Lock is CEO of UK-based consultancy EOSE, which helps data sources commercialize their data, and helps potential consumers and distributors identify suitable datasets for their needs. She has seen contractual issues firsthand from both sides of the fence, having spent 13 years at inter-dealer broker Tradition. Part of the issue is that no one “owns”—that is, actively takes responsibility for—the data that they own, so others reap the benefits unchecked.

“Heads of desk who sign trading agreements may not know about ownership or commercial issues—and probably aren’t in a position to assert something—while market data teams are overwhelmed with dealing with inbound data and can’t think strategically about creating a profit center,” she says.

Mauro Viskovic, a partner and corporate and securities lawyer at law firm Weiss Zarett Brofman Sonneklar & Levy, reports similar concerns. “On the trader side, I think they don’t know about this,” he says. “They assume ‘the obvious’ but the contract may say otherwise or may even say nothing. A lot of firms may not even have these contracts reviewed by attorneys.”

Those data executives who are aware of the ownership question may view it as a cost issue rather than a governance concern, or may ignore the governance concerns altogether. For example, trading firms have long complained that their quotes and orders create the liquidity that makes exchanges successful, but that the exchanges then charge them to receive the data they created. Exchanges counter that they aren’t charging firms for their own data, but rather are charging for the service they provide of consolidating market-wide data.

That argument goes on. But for firms seeking to lower the growing burden of exchange data fees by leveraging peer-to-peer (P2P) networks, where market participants make datasets available for free or at a nominal cost, the question of whether they own the data they want to contribute, and what they can do with it, becomes a major concern and a potential barrier.

One such P2P network is Pyth Network, which is building a decentralized, on-chain system of data from exchanges and trading firms. But an early challenge for Pyth was where its data would come from.

“If you need financial market data on-chain, where does all that come from? Because off-chain market data comes from a relatively small number of sources. And our view was that it’s going to be a stretch to see exchanges like CME making all their data available on-chain. So we scratched our heads and said, ‘Where’s that data going to come from?’” says Michael Cahill, a director at the Pyth Data Association arm of Pyth Network. Cahill is part of the Special Projects team at Jump Crypto, the cryptocurrency arm of Jump Trading, a contributing member of Pyth.

For Pyth, that data comes from a coalition of firms that make up the bulk of liquidity in the US markets, including Jump, the Chicago Trading Company, Flow Traders, Jane Street, Susquehanna International Group, Two Sigma Securities, and Virtu Financial, among others, as well as exchanges IEX, Memx, and the Miax-owned Bermuda Stock Exchange. Between them, these firms’ trading provides an accurate representation of market activity, while the exchanges provide “a pretty representative” best bid and offer price, Cahill says.

But these firms can’t just contribute any old data: they may not be legally allowed to contribute data that they already send to other parties. An executive at one contributing Pyth member who requested anonymity described the challenge of identifying what they could and couldn’t submit. “We went through our contracts, and it became clear that we would be in violation of our agreements with exchanges if we shared the bids and offers that we submit to exchanges because they have exclusive rights to that data,” the executive says.

Though it seems counterintuitive that exchanges would claim ownership of data that exists before it’s even submitted to them—not to mention, contrary to what exchanges say about the topic—the executive is emphatic that’s what firms sign up to. “I assure you, we don’t own it,” he says.

And, according to this executive, the loophole that enables Pyth to exist is equally counterintuitive: “There’s one piece of data they don’t own—and that’s why Pyth exists: When a trading participant executes a trade, they can make that data available to anyone, and the exchange can also make it available. So, firms that trade thousands of times per second can create a very accurate approximation of US market data based on their trades,” he says.

Pyth hasn’t received any push-back from exchanges because, although it creates “a new competitive landscape” for basic market data, it doesn’t compete directly with exchanges’ main revenue-generating data products. “This is a new distribution channel, and one that is entirely different from traditional channels—it’s on-chain with smart contracts. And it’s published at 400-millisecond updates, which might as well be two weeks in co-location timeframes. We’re not competing with that space … so we’re not yet an existential threat to the exchanges’ current off-chain businesses,” he says, adding that other exchanges have expressed interest in participating in Pyth.

The exchanges’ definition

One possible reason for the lack of any push-back so far is that the exchanges disagree about ownership—though in a way that actually benefits the user firms.

“I don’t think we would challenge the statement that Pyth [and its members] own their trade data,” says a senior official at one US-based exchange, who calls it “an interesting real-life experiment to see if market data consumers will find value from it,” adding that he expects initiatives like Pyth to “complement the high transparency of exchange market data.”

However, while not contradicting the Pyth member’s statement, the senior exchange official’s description of ownership is in contrast with the trading firm’s assertion that exchanges own its quote and order data.

“In general, with respect to an order instruction, that is the property of the originator. But in signing agreements, they grant a perpetual, non-exclusive license to the exchange, allowing it to perform any number of tasks. So exchanges don’t own a firm’s order, but they can use it to, for example, create market data that the exchange does own,” the official says.

That ownership rests partly on the fact that the resulting aggregated data is not the same as the members’ original data, and partly on the variety of tasks and services the exchange performs that add value.

“When we receive order instructions from members, once that hits our systems, we can use that information to run the exchange and create market data,” the exchange official says. “When we receive an order instruction, we process it, determine the effect on liquidity, and perhaps we even reject it—for example, if the stock is short-sale restricted. There are a bunch of things that could happen. So what comes in the front door is not what goes out to members—we’re creating that.”

A market data executive at another US exchange concurs: “Our underlying premise is that subscribers own their own data. They retain ownership of that and the rights associated with their data.” But though that original data remains the property of the member firm, once it reaches the exchange, “We can do what we want with that data so long as we don’t ‘out’ participants—that is, that we don’t display their market participant ID (MPID) along with their quotes or trades. That aggregation process is important, because if we do something that drives trading away from the exchange, that hurts us—so we want to drive as much transparency as possible,” he adds.

And while the data exec says he hasn’t received any client demand to change that arrangement—though the exchange is “ready, willing, and able to engage with customers,” he adds—he encourages participants to understand the implications of data ownership issues. “You see many clients trying to make money out of data, and they certainly should be getting educated about data as a business, and what they own and how they can use it.”

In general, exchanges—perhaps because they are more open to scrutiny about what data they own and what they can do with it—are clearer about ownership, and typically set out their terms within their services agreements that govern firms’ participation on an exchange.

However, things become less clear when dealing with inter-dealer brokers, since they may strike customer-by-customer agreements, whereas an exchange would have the same agreement with all participants. “So [with IDBs] people with different opinions may both be right,” the senior exchange exec says. “It’s probably not that people are confused, per se, but just that diversity exists.”

When it comes to brokers, Viskovic believes execution data is fair game for them to claim, but says that in many cases ownership remains unspecified. He advises firms to demand that contracts recognize their ownership of their own quote and order data.

Data protection

“If I were representing a trading firm, I’d rather not leave it to chance, and I’d ensure that brokers are not exercising rights over order data—only execution data. But if that’s not in the contract, and if I were a broker wanting to monetize that data in some way, I’d be cautious about that. I think they would need to set express conditions,” he says. “I’ve never had a broker object to revising their standard contract, but their standard contracts either don’t address it, or—as in a couple that I’ve seen—might suggest ownership.”

Of course, “suggest” isn’t the strongest legal term to rely on in the event of a dispute. And Viskovic couldn’t recall any recent lawsuits establishing or disputing data ownership. However, a precedent does exist, at least for data after it’s been submitted to and consolidated by exchanges. It comes from a 1905 US Supreme Court decision in a case where the Chicago Board of Trade sued Christie Grain and Stock Company to prevent the latter from accessing quote and trade data from its wheat, corn, and provisions trading pits. Specifically, CBot asserted ownership over the data created via floor trading in its pits and distributed to authorized firms via telegraph, and sought to prevent that data from being freely available to “bucket shops” without a contract in place.

However, that decision—which, while still relevant, may not be the best benchmark for a marketplace that has evolved significantly over the intervening 116 years—still does not clarify ownership of the data prior to consolidation and redistribution. That may fall to individual contracts and contract law, says Christopher Mohr, senior vice president for intellectual property and general counsel at industry body the Software and Information Industry Association (SIIA), which includes data industry association FISD.

“The raw data itself—such as the security, and the amount traded—is not protected by copyright law … so it depends on what kind of business relationship a broker or exchange can come up with to get revenues from data, and how they can enforce it,” he says. “So the owner of the data may look to other ways to protect its rights, such as via terms of service agreements relating to data access, for example.”

In fact, Mohr warns that over-zealous attempts to assert ownership over grey areas may only stifle innovation and lead to more disputes.

“What we are seeing now among owners of data is a realization that the data they have is quite valuable. Investment is increasing across technology and services, and the data that feeds these engines is incredibly valuable … and there will be more fights over the data that creates that value,” he says.

Data Boiler’s To acknowledges that copyright is not currently used to protect data rights, but says it could serve as a model for data. “Other industries around the world embrace copyright licensing systems, and I think the time is right for the financial markets to look at this,” he says. “The beauty of a copyright licensing system is that it aligns rights with obligations, so if your data is used to create some kind of market manipulation, you should be held responsible for it.”

Like Mohr, To also warns of consequences if the issue isn’t addressed. Establishing proper ownership protections could end up growing not only the slice of the pie but the overall pie, To says. But he believes that an inability to protect and establish ownership of data in traditional markets will push traders into alternative markets, such as cryptocurrencies—another focus of Pyth Network—that potentially offer more protection for their data.

FXD Data’s Desmond also notes the recent investment in new technologies, such as Pyth and blockchain, to create new markets and “rewire” existing ones. But he warns that this has thus far ignored the elephant in the room of data ownership—even though some of these technologies have unique capabilities that could be applied to the challenge.

“A lot of people are investing significant amounts of money into finding better ways to rewire the marketplace. But on IP rights, they’ve been very conservative so far. Typically, they’re rewiring existing businesses with new technologies, but they’re not really changing the fundamentals,” Desmond says.

He adds, however, that he expects that to change as firms start to associate data ownership with bottom-line opportunities. “People are focusing on technology, but legal rights, controls and IP will become more relevant—especially if firms think it can add value.”

Viskovic says he’s already starting to see signs of change, reflecting an increased recognition of the importance of data ownership and governance. “It’s an issue that I don’t think is addressed seriously enough,” he says. “That said, I think it’s starting to be taken more seriously, driven by economic trends, especially around monetizing data. Now, people automatically think of data as an asset.”

It’s this approach that may ultimately prove sufficient incentive for firms to assert their rights and rewrite—or in many cases, write for the first time—contracts that govern how their data is used.

‘Lock’ down your rights

“I think change will come from bank initiatives to commercialize proprietary datasets,” EOSE’s Lock says. “Once you put a contract in front of people, they suddenly get very excited about their rights and responsibilities … even if they’ve been giving this away for decades with no controls in place.”

For example, in the case of regional banks that may dominate the market in specific local, and perhaps illiquid, currencies or securities, their data effectively constitutes the market in those assets. They quite literally “own” the market and its data, and hence their pricing has a high inherent value to those who make money from redistributing it.

Of course, as Lock says, it shouldn’t take the promise of potential revenues to make people take data ownership seriously. It should be considered good business practice, especially for avoiding unforeseen exposures.

“Whether a piece of data is fee-liable or not, it’s all about governance. As a provider of data, you should have governance and terms about ownership in place,” Lock says. Before drawing up any commercial terms, EOSE has clients complete a detailed questionnaire, compiled from templates used by the Alternative Investment Management Association and individual banks’ due diligence questionnaires. This covers everything from how a dataset is created and how its underlying data is sourced to whether the provider has compliance staff, and how it handles specific data types, such as personally identifiable data. If a company can’t answer all the questions, it’s not ready to provide its data, she says, regardless of whether it’s free or fee-liable.

“We tell clients that they need to have governance over how their data is used, or people may use it in unintended ways, such as to create a tradable instrument, or for settlement. That creates inherent responsibilities that you didn’t know you had—for example, to ensure the data is fit for purpose, such as whether it’s an observed and executable price or an individual’s evaluation,” she says. “Yes, the user should be responsible and tell you how it’s being used, but you also have some responsibility—and you don’t want your data falling into the hands of your competitors.”

In short, whether you make money from your data or not—or if you plan to in the future—you should protect it. If your data is used by other parties, you should protect it—and yourself—lest it be used in a way that creates liabilities for your firm. And if you use data originating somewhere else, you should check who owns it, and who’s allowed to use it. Markets are changing, and as data becomes more valuable, firmly establishing who owns that data and has rights to use it and charge for it will become increasingly important.
