A.I.’s un-learning problem: Researchers say it’s virtually impossible to make an A.I. model ‘forget’ the things it learns from private user data

By Stephen Pastis
August 30, 2023, 12:43 PM ET
Jaap Arriens—NurPhoto/Getty Images

It all started with an email James Zou received.

The email made a request that seemed reasonable, but one that Zou realized would be nearly impossible to fulfill.

“Dear Researcher,” the email began. “As you are aware, participants are free to withdraw from the UK Biobank at any time and request that their data no longer be used. Since our last review, some participants involved with Application [REDACTED] have requested that their data should no longer be used.”

The email was from the U.K. Biobank, a large-scale database of health and genetic data drawn from 500,000 British residents that is widely available to the public and private sectors.

Zou, a professor at Stanford University and prominent biomedical data scientist, had already fed the Biobank’s data to an algorithm and used it to train an A.I. model. Now, the email was requesting the data’s removal. “Here’s where it gets hairy,” Zou explained in a 2019 seminar he gave on the matter. 

That’s because, as it turns out, it’s nearly impossible to remove a user’s data from a trained A.I. model without resetting the model and forfeiting the extensive money and effort put into training it. To use a human analogy, once an A.I. has “seen” something, there is no easy way to tell the model to “forget” what it saw. And deleting the model entirely is also surprisingly difficult.

This represents one of the thorniest unresolved challenges of our incipient artificial intelligence era, alongside issues like A.I. “hallucinations” and the difficulties of explaining certain A.I. outputs. According to many experts, the A.I. unlearning problem is on a collision course with inadequate regulations around privacy and misinformation. As A.I. models get larger and hoover up ever more data, the lack of any way to delete data from a model, and potentially to delete the model itself, will affect more than the participants in a health study. It will be a salient problem for everyone.

Why A.I. models are as difficult to kill as a zombie

In the years since Zou’s initial predicament, the excitement over generative A.I. tools like ChatGPT has caused a boom in the creation and proliferation of A.I. models. What’s more, those models are getting bigger, meaning they ingest more data during their training.

Many of these models are being put to work in industries like medical care and finance where it’s especially important to be careful about data privacy and data usage.

But as Zou discovered when he set out to find a way to remove data, there’s no simple way to do it. That’s because an A.I. model isn’t just lines of code. It’s a learned set of statistical relationships between points in a particular dataset, encompassing subtle patterns that are often far too complex for human understanding. Once the model learns these relationships, there’s no simple way to get it to ignore some portion of what it has learned.

“If a machine learning-based system has been trained on data, the only way to retroactively remove a portion of that data is by re-training the algorithms from scratch,” Anasse Bari, an A.I. expert and computer science professor at New York University, told Fortune.
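
To make the retraining baseline Bari describes concrete, here is a minimal sketch in Python. It is an illustration, not any company's actual pipeline: the toy line-fitting model, the `train` and `retrain_without` helpers, and the four records are all hypothetical. The point is that the trained model is just two numbers that blend every record together, so the simplest way to remove one person's influence is to drop their rows and fit again.

```python
from statistics import mean


def train(records):
    """Fit y = slope * x + intercept by ordinary least squares.

    `records` is a list of (user_id, x, y) tuples. The resulting model is
    only two numbers, and every record contributes to both of them.
    """
    xs = [x for _, x, _ in records]
    ys = [y for _, _, y in records]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def retrain_without(records, user_id):
    """Baseline unlearning: drop the user's rows, then train from scratch."""
    kept = [r for r in records if r[0] != user_id]
    return train(kept)


# Hypothetical toy data: (user_id, x, y)
records = [("a", 1.0, 2.0), ("b", 2.0, 4.0), ("c", 3.0, 6.0), ("d", 4.0, 20.0)]

original = train(records)                      # shaped by all four users
after_deletion = retrain_without(records, "d")  # user "d" has no influence
```

For a model this small, retraining is instant. The cost Bari points to comes from doing the same thing with a model that takes weeks and millions of dollars to train.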

The problem goes beyond private data. If an A.I. model is discovered to have gleaned biased or toxic data, say from racist social media posts, weeding out the bad data will be tricky.

Training or retraining an A.I. model is expensive. This is particularly true for the ultra-large “foundation models” that are currently powering the boom in generative A.I. Sam Altman, the CEO of OpenAI, has reportedly said that GPT-4, the large language model that powers its premium version of ChatGPT, cost in excess of $100 million to train.

That’s why companies developing A.I. models fear a powerful tool the U.S. Federal Trade Commission can use to punish companies it finds have violated U.S. trade laws. The tool is called “algorithmic disgorgement.” It’s a legal process that penalizes the law-breaking company by forcing it to delete an offending A.I. model in its entirety. The FTC has used that power only a handful of times, typically against companies that have misused data. One well-known case was against a company called Everalbum, which trained a facial recognition system using people’s biometric data without their permission.

But Bari says that algorithmic disgorgement assumes those creating A.I. systems can even identify which part of a dataset was illegally collected, which is sometimes not the case. Data easily traverses various internet locations, and is increasingly “scraped” from its original source without permission, making it challenging to determine its original ownership.

Another problem with algorithmic disgorgement is that, in practice, A.I. models can be as difficult to kill as zombies. 

“Trying to delete an AI model might seem exceedingly simple, namely just press a delete button and the matter is entirely concluded, but that’s not how things work in the real world,” Lance Elliot, an A.I. expert, told Fortune in an email. 

A.I. models can be easily reinstated after deletion because other digital copies of the model are likely to exist, Elliot writes.

Zou says that, the way things stand, either the technology needs to change substantially so that companies can comply with the law, or lawmakers need to rethink the regulations and how they can make companies comply.

Building smaller models is good for privacy

In his research, Zou and his collaborators did come up with ways to delete data from simple machine learning models based on a technique known as clustering, without compromising the entire model. But those methods won’t work for more complex models, such as most of the deep learning systems that underpin today’s generative A.I. boom. For these models, a different kind of training regime may have to be used from the start, one that makes it possible to delete certain statistical pathways without compromising the whole model’s performance or requiring the entire model to be retrained, Zou and his co-authors suggested in a 2019 research paper.
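
To give a sense of why deletion is tractable for clustering, here is a simplified Python sketch. It is not the method from Zou's paper. It is a hypothetical nearest-centroid model (the `CentroidModel` class and its data are invented for illustration) that keeps a running sum and count for each cluster, so one record can be subtracted out without touching the rest.

```python
class CentroidModel:
    """Nearest-centroid clustering with a running sum and count per cluster."""

    def __init__(self, initial_centroids):
        self.initial = [list(c) for c in initial_centroids]
        self.centroids = [list(c) for c in initial_centroids]
        self.sums = [[0.0] * len(c) for c in initial_centroids]
        self.counts = [0] * len(initial_centroids)
        # The bookkeeping needed for deletion: record_id -> (cluster, point)
        self.assignment = {}

    def _nearest(self, point):
        def sq_dist(centroid):
            return sum((p - q) ** 2 for p, q in zip(point, centroid))
        return min(range(len(self.centroids)),
                   key=lambda i: sq_dist(self.centroids[i]))

    def _refresh(self, k):
        if self.counts[k]:
            self.centroids[k] = [s / self.counts[k] for s in self.sums[k]]
        else:
            # An emptied cluster falls back to its data-free starting point.
            self.centroids[k] = list(self.initial[k])

    def add(self, record_id, point):
        k = self._nearest(point)
        self.sums[k] = [s + p for s, p in zip(self.sums[k], point)]
        self.counts[k] += 1
        self.assignment[record_id] = (k, tuple(point))
        self._refresh(k)

    def delete(self, record_id):
        """Subtract one record's contribution from its cluster."""
        k, point = self.assignment.pop(record_id)
        self.sums[k] = [s - p for s, p in zip(self.sums[k], point)]
        self.counts[k] -= 1
        self._refresh(k)


model = CentroidModel([(0.0, 0.0), (10.0, 10.0)])
model.add("u1", (1.0, 1.0))
model.add("u2", (3.0, 3.0))
model.add("u3", (9.0, 11.0))
model.delete("u2")  # the first centroid now depends on "u1" alone
```

One caveat: this update is cheap, but it does not guarantee the same result as retraining from scratch, because the deleted record may have influenced where other records were assigned. A deep neural network has no per-record sum to subtract in the first place, which is why the same trick does not carry over.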

For companies worried about the requirement that they be able to delete users’ data upon request, which is part of several European data privacy laws, other methods may be needed. In fact, at least one A.I. company has built its entire business around this idea.

Xayn is a German company that makes private, personalized A.I. search and recommendation technology. Xayn’s technology works by using a base model and then training a separate small model for each user. That makes it very easy to delete any of these individual users’ models upon request.
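
The design described above, a shared base model plus a separate small model per user, can be sketched in a few lines of Python. This is an assumption-laden illustration, not Xayn's actual system: the `PersonalizedRecommender` class, its method names, and the topic weights are all hypothetical.

```python
class PersonalizedRecommender:
    """A shared base model plus one small, separately stored model per user."""

    def __init__(self, base_weights):
        # Shared weights, assumed to be trained without personal user data.
        self.base_weights = dict(base_weights)
        # user_id -> {topic: personal adjustment}; one entry per user.
        self.user_models = {}

    def record_click(self, user_id, topic, step=0.1):
        """Learn from a user's behavior by updating only that user's model."""
        model = self.user_models.setdefault(user_id, {})
        model[topic] = model.get(topic, 0.0) + step

    def score(self, user_id, topic):
        personal = self.user_models.get(user_id, {})
        return self.base_weights.get(topic, 0.0) + personal.get(topic, 0.0)

    def forget_user(self, user_id):
        """Deleting a user means deleting that user's small model."""
        self.user_models.pop(user_id, None)


recommender = PersonalizedRecommender({"tech": 0.5, "finance": 0.2})
recommender.record_click("alice", "tech")
recommender.record_click("bob", "finance")
recommender.forget_user("alice")  # bob's model and the base are untouched
```

Because personal data never flows into the shared weights in this sketch, forgetting a user is a single dictionary removal rather than a retraining job.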

“This problem of your data floating into the big model never happens with us,” Leif-Nissen Lundbæk, the CEO and co-founder of Xayn, said. 

Lundbæk said he thinks Xayn’s small, individual A.I. models are a more viable way to build A.I. that complies with data privacy requirements than the massive large language models being built by companies such as OpenAI, Google, Anthropic, Inflection, and others. Those models suck up vast amounts of data from the internet, including personal information, so much that the companies themselves often have poor insight into exactly what data is contained in the training set. And these massive models are extremely expensive to train and maintain, Lundbæk said.

Privacy and artificial intelligence businesses are currently a sort of parallel development, he said. 

Another A.I. company trying to bridge the gap between privacy and A.I. is SpotLab, which builds models for clinical research. Its founder and CEO Miguel Luengo-Oroz previously worked at the United Nations as a researcher and chief data scientist. In 20 years of studying A.I., he says he has often thought about this missing piece: an A.I. system’s ability to unlearn.

He says that one reason little progress has been made on the issue is that, until recently, there was no data privacy regulation forcing companies and researchers to expend serious effort to address it. That has changed recently in Europe, but in the U.S., rules that would require companies to make it easy to delete people’s data are still absent.

Some people are hoping the courts will step in where lawmakers have so far failed. One recent lawsuit alleges OpenAI stole “millions of Americans'” data to train ChatGPT’s model.

And there are signs that some big tech companies may be starting to think harder about the problem. In June, Google announced a competition for researchers to come up with solutions to A.I.’s inability to forget.

But until more progress is made, user data will continue to float around in an expanding constellation of A.I. models, leaving it vulnerable to dubious, or even threatening, actions.

“I think it’s dangerous and if someone got access to this data, let’s say, some kind of intelligence agencies or even other countries, I mean, I think it can really be used in a bad way,” Lundbæk said.

© 2026 Fortune Media IP Limited. All Rights Reserved.