UK pushes financial firms to adopt privacy-enhancing technologies

Lawmakers and vendors are set to usher in an era of mutualized data insights. Are banks ready?

Need to know

  • UK regulators are trying to encourage adoption of privacy-enhancing technologies in financial markets by issuing implementation guidance, running tech sprints, and hosting workshops.
  • Most recently, the information commissioner recommended that all companies sharing large amounts of data consider adopting the technologies over the next five years.
  • Vendors and users say the technologies are already mature enough to fulfil multiple use cases, from anti-money laundering functions to processing sensitive patient data in healthcare.
  • But few UK firms are publicly discussing their use of the technologies. A lack of trust between institutions and perceived barriers created by financial and privacy regulations are still holding back widespread implementation.

With hindsight, 2019 will be remembered as a turning point for privacy-enhancing technology (PET) in the UK: the beginning of a push to drive adoption of what had previously been considered an emerging technology.

It started when the Royal Society, the UK’s national academy of sciences, published a report on the uses and limitations of PETs in data analysis. In a foreword, University of Oxford professor Alison Noble wrote that the report “captures a moment in time where the technologies are maturing and opportunities to use these technologies are beginning to emerge.”

Sure enough, the report proved to be the first of many projects undertaken by regulators, legislators, national bodies, and companies to demonstrate that the enticing promise of PETs—to enable the large-scale analysis of sensitive data while maintaining principles of data protection—was becoming a reality.

The same year, the Financial Conduct Authority (FCA) held a tech sprint showcasing potential uses of PETs in fighting financial crime. And the Information Commissioner’s Office (ICO), the UK’s data protection regulator, began incorporating PETs into its anonymization guidance, before coming out with standalone guidance on the use of PETs.

This renewed focus on PETs has continued into 2023, with information commissioner John Edwards telling all companies sharing large amounts of data: “We recommend that over the next five years you start considering using PETs.”

And a new bill designed to introduce a more flexible data protection regime, facilitate innovation in data sharing, and enable the construction of data bridges with international partners was even mentioned by King Charles at the state opening of Parliament.

Banks that need to collaborate on tackling money laundering, identifying fraud, and generating synthetic data to test systems are already turning to PETs to preserve secrecy and ensure compliance with data protection regulations. Vendors and users of PETs say it is easier than ever before to implement these technologies in the UK, but warn that the hardest step will be getting financial institutions on board.

Ronen Cohen, vice president of strategy at PET vendor Duality Technologies, points to the example of the National Health Service’s federated data platform—including a tender specifically for a provider of privacy-enhancing technology—as a sign that the government and industry understand the benefits that these technologies can bring.

“Once you start seeing government health agencies going out to public tender on privacy technologies, that means that it’s no longer an experiment. And when you see regulators like the ICO making these strong statements … PETs are entering the mainstream, not just in discussion, but in usage,” Cohen says.

The potential payoff is significant. Money laundering alone is estimated to cost the UK over $125 billion every year. The value of other opportunities created by financial institutions unlocking mutualized data insights is incalculable.

But change happens slowly in financial services in the best of times. And when it comes to sharing the data that gives them a competitive edge, financial institutions are often reluctant to take the lead. “When they do need to collaborate, it’s done over a phone call and in a back room. Moving from that to starting to use technology at scale—that’s a big cultural shift,” Cohen says.

Firm foundations

Paul Comerford, principal technology advisor of anonymization and encryption at the ICO, has witnessed this growing interest in PETs firsthand. Originally a cybersecurity specialist at the ICO, he transitioned into a role focused on anonymization in 2019, just as efforts to refresh the anonymization guidance were bringing PETs into the spotlight.

Comerford explains that there are a few drivers for the increased attention to PETs. One is the introduction of the EU’s General Data Protection Regulation (GDPR) in 2018, which spurred organizations to take data protection more seriously. But he also says that there has been a surge in the progress of privacy-enhancing technologies over the last five years.

“Four or five years ago, some of them just weren’t ready. Computationally, they were too intensive, they would take too much time to run, there was too much overhead. And I think those problems are slowly going away with some PETs more than others. Also, standardization is starting to gain traction now with some of the PETs,” Comerford says.

The Centre for Data Ethics and Innovation (CDEI) launched a public repository of PET use cases in 2021, detailing how organizations are already harnessing PETs in industries from finance to transportation. The idea is that other companies hoping to implement PETs can find examples of specific technologies and methodologies that have solved similar problems.

The ICO’s guidance on PETs also points to a degree of standardization, listing the various data processing activities and the specific PETs or combinations of PETs that can help protect data privacy in each case. Sources say that such explicit guidance from regulators is rare and valuable.

But development and implementation can still be a challenge due to the relative lack of expertise and knowledge surrounding PETs. Nic Lane, a professor who leads the University of Cambridge Machine Learning Systems Lab, says beginning research into the technologies is still harder than it should be. When his group began looking into federated learning (FL) three years ago, they were quickly confronted with the limitations of existing frameworks.

“I realized that even to do some research in this space meant that you were coding a bunch of things up that should already exist,” Lane says.

The difficulties he encountered in his research led Lane to co-found Flower Labs, which provides an open-source framework to federate machine learning projects and other workloads. The original goal was just to enable his group to do the work that it wanted to do, but to his surprise, a community started to form around Flower Labs.
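
To illustrate the pattern that frameworks such as Flower generalize, the sketch below shows federated averaging in its simplest form: each party trains on its own data, and only model weights are pooled by a coordinator. This is a minimal, illustrative example in plain NumPy; the datasets, model, and hyperparameters are invented for demonstration, and it does not use Flower's actual API.

```python
# Minimal sketch of federated averaging (FedAvg). Illustrative only:
# synthetic data, a linear model, and invented hyperparameters.
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's training: a few gradient steps on a linear model,
    using only its own data. Raw data never leaves the client."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Each 'bank' holds its own private dataset (synthetic here).
true_w = np.array([2.0, -1.0, 0.5])
clients = []
for _ in range(3):
    X = rng.normal(size=(100, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=100)
    clients.append((X, y))

# Federated rounds: the coordinator only ever sees model weights, never data.
global_w = np.zeros(3)
for _ in range(20):
    local_weights = [local_update(global_w, X, y) for X, y in clients]
    global_w = np.mean(local_weights, axis=0)  # server-side aggregation

print("Global model after federation:", np.round(global_w, 3))
```

In practice, frameworks layer additional protections, such as secure aggregation or encryption, on top of this basic loop.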

Lane says he believes the key to the further development of Flower Labs is the availability of software and well-executed frameworks. He points out that other potential limiting factors—such as regulatory confusion or the difficulty of assessing privacy—also exist for other types of machine learning, which are nonetheless thriving.

“It’s probably 10 times easier right now for some Joe on the street to train a 7 billion–parameter LLM on readily available data than it is to build a federated system. And there’s no good reason for that. Fundamentally, the problems are equally hard,” he says.

Lane hopes that, as with the meteoric rise of chatbots, the example of leading organizations and the availability of the necessary technology will cause momentum to build. “As soon as we can make it easy for people to do this, they will start doing it. There is regulatory confusion, jumpiness at banks, and lots of other things. And you could try to fix those first. But we take the opinion that if there’s enough value, and if you make it easy to do, then people at thousands of companies or in research labs will just start doing it,” he says.

PET peeves

PETs can be sorted broadly into two categories: software-based or hardware-based. Both still have their limitations. The hardware-based trusted execution environment (TEE) ensures that sensitive data is processed in a secure region of the CPU while remaining invisible to the operating system. But because the sensitive data is decrypted and processed in the clear within this enclave, most regulators do not consider this to be a privacy tool.

Software-based PETs, meanwhile, can be more time-consuming and computationally intensive, and often depend heavily on good governance and implementation.

Another difficulty lies in assessing how effectively an implementation protects the privacy of sensitive data. Until a successful attack or re-identification of anonymized data is carried out, it is hard to say exactly how well any PET or combination of PETs works. It is possible to measure how long the process takes to run and how complex the system is, or to subject the system to attacks, but none of these produces a reliable metric for comparison with alternative tools.

To this end, standards bodies are working to facilitate adoption of PETs by introducing technical guidelines for specific PETs and codifying the steps for assessing anonymization.

Professor Kai Rannenberg is the convenor of Working Group 5 within ISO’s information security standing committee, which is tasked with overseeing standards for identity management and privacy technologies.

His group has designed some standards for anonymization technologies, which can be used by businesses that need to safeguard the privacy of personal client data they are processing.

“A challenge that we’re dealing with is that often people want to know from us, ‘there’s a certain data anonymization technique, but how secure is it? How much data do I need to put into it to ensure anonymization?’ And that kind of assessment actually cannot be done in the general case, because it’s extremely context-dependent,” Rannenberg says.

The ICO's Comerford adds that certification schemes could eventually emerge alongside mature standards, allowing developers to Kitemark a PET for a particular use case, but that is a long way off.

Like Rannenberg, Duality’s Cohen recognizes that context and governance can have an outsized impact on the way in which a PET is implemented, and even the choice of PET. In fact, it is becoming more common for organizations to employ multiple PETs. When sharing sensitive data, for example, they could employ federated learning to minimize the information transferred between parties, along with homomorphic encryption to prevent parties from accessing the input information.

“Not everything can be fixed by a hammer or by screwdriver. And it used to be that privacy technology companies said, ‘This is the privacy technology to go with; it’ll solve everything.’ And I think everyone has seen that’s really not the case,” Cohen says. “And so you’re seeing privacy technology companies, Duality included, combining privacy technologies, either to support any type of computation, support new types of data, or support certain types of scale needs. That’s definitely a trend that’s happening. And you’ll actually see much more uptake as a result. It becomes more commercially feasible for organizations to take on because it solves more challenges.”
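
As a concrete illustration of one of the building blocks Cohen describes, the sketch below uses additively homomorphic encryption to let an untrusted aggregator combine values it cannot read. It assumes the open-source python-paillier package (`phe`); the banks, scores, and key size are illustrative, and this is not a description of any vendor's product.

```python
# Toy sketch of additively homomorphic aggregation using python-paillier
# (`pip install phe`). All values and party names are illustrative.
from phe import paillier

# The party allowed to see the result generates the key pair and keeps
# the private key to itself.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Each bank encrypts its local risk score before sharing it.
local_scores = [0.82, 0.15, 0.64]
encrypted_scores = [public_key.encrypt(s) for s in local_scores]

# An untrusted aggregator can add ciphertexts without ever decrypting them.
encrypted_sum = encrypted_scores[0]
for enc in encrypted_scores[1:]:
    encrypted_sum = encrypted_sum + enc
encrypted_mean = encrypted_sum * (1 / len(local_scores))  # scalar multiply

# Only the private-key holder learns the aggregate; individual inputs stay hidden.
print("Average score:", round(private_key.decrypt(encrypted_mean), 4))
```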

One company employing a combination of PETs is Tune Insight, a Swiss startup that completed a $3.4 million funding round last month. Using federated learning, homomorphic encryption, and secure-multiparty computation, Tune Insight helps hospitals collaborate with each other on multi-site analysis or with insurers on value-based healthcare without disclosing sensitive patient data. Due to the combination of techniques employed, all data that Tune Insight’s systems transfer is both encrypted and aggregated, meaning that any mutualized insights released at the end of the process cannot be linked back to the original records.

Juan Ramón Troncoso-Pastoriza, CEO of Tune Insight, says the same principles used in healthcare and insurance could transform anti-money laundering and fraud detection in financial institutions: machine learning-based systems working on collective data from multiple banks could reduce the current rate of false positives by 75% to 80%.

He gives the example of blacklist validations, in which a bank wants to know whether an International Bank Account Number (Iban) has been flagged in other banks’ networks without disclosing which bank flagged it or what score each bank gives the number. A system based on federated learning could simply retrieve information such as whether the number is on any bank’s blacklist, or the average score given to the Iban across all the banks.
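
A minimal sketch of the secret-sharing idea behind such a lookup appears below: each bank splits its private flag into random shares, and only pooled sums are revealed, so the network learns how many banks flagged the Iban but not which ones. The banks, flags, and protocol details are invented for illustration; this is not Tune Insight's actual implementation.

```python
# Toy additive secret sharing for a blacklist lookup: the network learns
# only how many banks flagged an Iban, never which ones. Illustrative only.
import secrets

PRIME = 2**61 - 1  # arithmetic is done modulo a large prime

def share(value, n_parties):
    """Split `value` into n random shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

# Each bank's private flag for the queried Iban: 1 = on its blacklist.
bank_flags = {"bank_a": 1, "bank_b": 0, "bank_c": 1}
n = len(bank_flags)

# Every bank splits its flag and sends one share to each peer.
all_shares = {bank: share(flag, n) for bank, flag in bank_flags.items()}

# Each party sums the shares it received; only these partial sums are pooled.
partial_sums = [sum(all_shares[bank][i] for bank in bank_flags) % PRIME
                for i in range(n)]

total_flags = sum(partial_sums) % PRIME
print(f"Iban flagged by {total_flags} of {n} banks")  # reveals the count only
```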

The challenge is that such a system requires buy-in from multiple institutions in order to construct a holistic view of transaction or client data.

“Normally, we talk directly with a hub,” Troncoso-Pastoriza says. “You need to create a critical mass of banks joining together in order to start having real value. … So talking with regulators, talking with the companies that do fraud detection for multiple banks—this is how we can actually serve a big network without having to onboard bank by bank.”

For now, though, financial institutions still seem to be waiting for successful examples before taking the plunge with their own sensitive data. “I would say that at the moment, PET adoption, unfortunately, is quite low. Some of the factors include the maturity, the complexity, the lack of understanding of how PETs could help with compliance and share data safely,” says the ICO's Comerford.

Legal and regulatory data handling requirements are often cited by compliance and data protection officers as reasons for the tentative adoption of PETs. But while regulation is clearly moving toward stronger guarantees for data subjects, more stringent data rules should embolden firms by demonstrating more clearly how data can and cannot be processed.

“Data protection doesn’t prevent you from data sharing. It’s perceived as doing so, but it doesn’t. And there might be other perceived barriers to data sharing: customer preference, competitive advantage, trusting recipients, and financial regulations, for example,” Comerford says.

With the technological foundations in place and the regulatory landscape becoming clearer, the next task for proponents of PETs is to spread the word.

For its part, the ICO is working on publishing seven different use cases and organizing a workshop with customers and providers early in 2024 to demonstrate and discuss how PETs can be adopted safely.

But Comerford also offers a word of caution. For all their rapid progress in recent years, PETs are a long way from being able to do everything. “I don’t think they should be regarded as a silver bullet for all compliance needs. They will work very well for some types of data processing. But of course, there are some types of processing that may be unfair or unlawful, and the use of PETs is not going to get you over the line in those situations,” he says. “PETs are a very useful thing to have in the toolbox, but they’re not suitable for every use case.”

PETs and their uses

  • A trusted execution environment is a secure area inside a computing device’s CPU. It runs code and accesses information in a way that is isolated from the rest of the system. Applications running in the TEE can access information outside the TEE, but applications outside the TEE cannot access information in the TEE. Compared with the main operating system, a TEE provides a higher level of trust in the validity, isolation, and access control of the information and code stored in this space, making the applications running inside it more trustworthy.
  • Homomorphic encryption allows you to perform computations on encrypted information without first decrypting it. The computations themselves are also encrypted; once decrypted, the result is identical to what would have been produced by performing the computation on the original plaintext data. Homomorphic encryption uses a public key-generation algorithm to generate a pair of private and public keys, plus an evaluation key. The evaluation key is shared with the entity that will perform computations on the encrypted information; this entity does not need access to the private key to perform the analysis. The client, who retains the private key, can then decrypt the output to obtain the result they require. Any entity that holds only the public and evaluation keys cannot learn anything about the encrypted data in isolation.
  • Secure-multiparty computation is a protocol that allows at least two different parties to jointly process their combined information, without any party needing to share all of its information with each of the other parties. All parties (or a subset of the parties) may learn the result, depending on the nature of the processing and how the protocol is configured. It uses a cryptographic technique called ‘secret sharing’. Each participating party’s information is split into fragments to be shared with other parties. Each party’s information cannot be revealed to the others unless some proportion of fragments of it from each of the parties are combined. As this would involve compromising the information security of a number of different parties, in practice it is unlikely to occur. This limits the risks of exposure through accidental error or malicious compromise and helps to mitigate the risk of insider attacks.
  • Federated learning is a technique that allows multiple different parties to train AI models on their own information (‘local’ models). They then combine some of the patterns that those models have identified (known as ‘gradients’) into a single, more accurate ‘global’ model, without having to share any training information with each other.
  • Differential privacy is a property of a dataset or database that provides a formal mathematical guarantee of individuals’ indistinguishability. It is based on the randomized injection of noise, which allows for plausible deniability about whether a particular person’s personal information is in the dataset. This means it is not possible to determine with confidence that information relating to a specific person is present in the data. (A minimal numerical sketch of the noise mechanism follows after this box.)

*Source: Information Commissioner's Office guidance on privacy-enhancing technologies.
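
To make the differential privacy entry above concrete, here is a minimal sketch of the Laplace mechanism: noise calibrated to a query’s sensitivity is added so that any one person’s presence or absence barely changes the released result. The records, predicate, and epsilon value are purely illustrative.

```python
# Minimal sketch of the Laplace mechanism behind differential privacy:
# noise calibrated to the query's sensitivity hides any one person's presence.
import numpy as np

rng = np.random.default_rng()

def dp_count(records, predicate, epsilon=0.5):
    """Return a noisy count; adding or removing one record changes the true
    count by at most 1 (the sensitivity), so noise scale = 1 / epsilon."""
    true_count = sum(1 for r in records if predicate(r))
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Hypothetical transaction records; the flag marks suspicious activity.
transactions = [{"amount": a, "flagged": a > 9_000} for a in
                (1_200, 15_000, 9_500, 300, 22_000, 4_800)]

noisy = dp_count(transactions, lambda r: r["flagged"], epsilon=0.5)
print(f"Differentially private count of flagged transactions: {noisy:.1f}")
```

A smaller epsilon adds more noise and gives stronger privacy, at the cost of accuracy in the released statistic.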
