Ctrl + Alt + Unlearn: Teaching AI to Forget
A narrative driven introduction to Machine Unlearning
One of the tiny humans I manage (or, rather, who manages me) decided to take sick leave this week. That delayed our regularly scheduled programming by a couple of days. Now, on to the content for this week.
LLMs are getting larger and more capable with each passing day. Just this Monday, OpenAI showed us their "LLM app store," where builders can create new GPT agents using plain English. The time is ripe for us to look at something just as important, if not more so: teaching AI to forget. This area of research, called Machine Unlearning, addresses the question, "Can AI models forget specific data without catastrophically forgetting everything?" But there are several related questions, too. What are the implications of this research area? What's possible today, and what are the challenges of making a model selectively forget? We'll look at all these questions and more through a story-centric deep dive. This is something new I'm trying, so, as always, if you have feedback, questions, or comments, let me know through the form below or use the comment section. Let's go!
Ctrl + Alt + Unlearn: Teaching AI to Forget
Alex peered blearily at the code swimming across his double-monitor setup, waiting for the last of the unit tests to pass. The soft clicking of keyboards and the pattering rain rhythmically punctuated the silence at PatterNetics Inc. Midnight neared, and his tepid coffee was losing against the weight of his eyelids. Absently ruffling his disheveled hair, he glanced around to see weary colleagues, their features made more gaunt by the bluish glows of their screens. "We're like a bunch of owls, only we feast on bugs and bad takeout," he chuckled to himself.
His inbox pinged, bringing his gaze back to his screens. An email this late? "Oh… crap!" he exclaimed. Within seconds, pings sounded at adjacent desks, eliciting identical responses from his team.
The subject of the email read "URGENT: DATA PURGE REQUEST." That wasn't unusual.
The company handled hundreds of these requests each month, but this one was from AIFinn, one of their oldest and most significant customers.
They weren't just asking for a data purge. They wanted to end things, effective immediately.
PatterNetics, once a beacon of ethical innovation, had slowly spiraled into a data-hungry behemoth. Profit had eclipsed privacy. Expansion had silenced ethics. This was their unsung ethos.
Alex had seen the signs and raised alarms about data misuse and the ethical quicksand they were wading into. Enamored with their bottom line, the higher-ups dismissed and ignored his warnings.
The knot in his stomach tightened as he reread the email. AIFinn had discovered the unsanctioned use of their data in PatterNetics' flagship AI foundation model, the Oracle. Millions of customers used the Oracle daily. Promptly, AIFinn demanded that both their data and the model trained on it, the Oracle, be deleted. This domino would set off an avalanche of legal and financial ruin.
Alex's team had assembled around his desk, their expressions a mix of confusion and concern. Before he could address them, Frank, his supervisor and professional boardroom bobblehead, interrupted. "The execs want the trail wiped clean. Just delete the data, Alex."
Alex shook his head. "It's not that simple. Just erasing the data won't be enough. The Oracle has learned from it. It's part of its memory now."
Frank scoffed. "Zeros and ones, Alex. Delete it. Don't overstep."
But Alex stood his ground. "There's more at stake here. This is our chance to finally do the right thing. We need to update the Oracle properly."
Frank glared. "Your call, but remember, it's your neck on the line." With that, he stormed off, leaving Alex and his team to consider their future.
"Typical," whispered Jenna as Frank's footsteps receded into the hallway. "Alex, how did they find out that the Oracle was trained on their data?".
"My guess is as good as yours, Jenna," replied Alex. "We don't have time to worry about that now."
"Ok, team, what are our options?" continued Alex.
"How recently was AIFinn's data used for fine-tuning the Oracle? If so, we could find a checkpoint that doesn't use their data and deploy it", suggested James.
"No, they're one of our oldest customers. Their data was used since the beginning per my logs," sighed Alex as the soft glow from his screen cast deep shadows across his contemplative features.
"Really? I don't see it in any of my databases," said Jenna, puzzled.
"I'm not surprised. Upper management "cleanses" records periodically before the audits happen. I've been collecting receipts secretly," said Alex, resigned.
"Okay, so restoring a checkpoint is a non-starter. Can we take the pretrained model and repeat fine-tuning on it after removing their data," suggested Ben optimistically.
"That's our last resort," said Alex. "But I'm not sure we have the time to do it."
"Surely you're not suggesting…" started Jenna, but then, a security alert blared through the room. "System breach in Sector 7!"
Ben was the first to react, his eyes widening. "Damn, those servers run the financial algorithms! I have to—"
"Go," Alex said, "take Jenna and the others. I'll handle things here until you get back."
The team hesitated momentarily and then sped away, wondering what the high heavens had in store for them in Sector 7.
Now alone, Alex faced his screens, stretching his swimmer's shoulders. He took a deep breath, steeling himself when a soft voice broke the silence.
"Alex Ridgeway?" it asked with authority.
Startled, Alex swiveled in his chair to find a figure shrouded in darkness at the threshold.
"Who? What? How did you get in here?" Alex demanded, his voice a bit shaky.
"I bypassed the usual protocols," the woman said, stepping into the light. "The crisis in Sector 7 was a mere distraction," she smiled.
As she moved closer, he saw that she wore an impeccably tailored suit and a tie knotted with an accuracy that spoke of meticulous habits and unwavering routine. She was scanning the room carefully, but he couldn't read her expression. Dark sunglasses hid her eyes but reflected the chaos behind him back at him. Satisfied that no one else was there, she introduced herself, "I'm Agent M from the Memory Intervention Bureau. You'll want to sit down for this, Mr. Ridgeway."
"I don't have time for games," he started, but she raised a hand, stopping him mid-sentence, and brandished a gleaming badge that bore the insignia of her employer.
"This isn't a game," she replied. "It's about the request you're working on. The request from AIFinn."
His eyes narrowed. "What about it? Wait, how do you know?"
"We've been monitoring PatterNetics for a while now. They've crossed lines that most companies shudder to get close to. We had to step in before things got any worse," she responded curtly.
"As for AIFinn finding out about your sketchy training practices, we told them," she continued.
"We discovered the violation through an M.I.A.," Agent M said as if this were something obvious.
"A what?" asked Alex, confused.
"A Membership Inference Attack1. It's a way to figure out if a piece of data was used to train a model. Sometimes, it's very accurate. So, even if the original data used to train the model is deleted, someone can still figure out if that data was used to train a model."
"I've pleaded with upper management about this for ages. They…" began Alex.
"We know. Why do you think I came to meet you?" interrupted Agent M.
"The problem you're dealing with is complex. You're going to need a machine unlearning solution, and you're going to need my help," she said with a tone of finality.
Alex scoffed, his skepticism clear. "You expect me to believe this? Some secret agency with miracle tech just waltzes in to help?"
"Yes," Agent M confirmed, her gaze steady and unwavering. "Mr. Ridgeway, the implications of this task go beyond your company. This isn't just about PatterNetics. It's much bigger. You're not prepared to handle this alone. The MIB specializes in these cases. We ensure that when a model needs to forget data, it does so thoroughly."
"And let me guess, you have just the right tools for the job?" Alex's skepticism was palpable.
Agent M nodded, a flicker of a smile crossing her face. "You could say that. We have the Neuralyzer."
Alex snorted. "What, like from the movies? Are you going to flash a light at the model, and it'll forget everything?"
"Theatrics aside, the principle isn't so far off," she replied, unphased. "Our Neuralyzer isn't a gadget. It's a sophisticated system for machine unlearning. It can selectively excise memories from AI models without sacrificing performance."
"You have five minutes before I call security," he said, finally sinking into the chair. Agent M smiled as she took a neighboring chair.
She tapped a sleek device in her palm, and a holographic display flickered to life in the air between them.
"Machine unlearning," she began, "is the process of teaching a trained model to forget a subset of training data without affecting its performance."
Alex leaned forward, his curiosity piqued despite the surreality of the situation.
Agent M continued, "Here’s the problem. If you need to delete data from your servers, it's easy. Just delete it and any backups you have."
"But you can't do that to a trained model," Alex countered, eyes narrowing. "It assimilates the data. You can try pruning some connections, but the data's footprint can persist. Worse yet, you can cut the wrong ones and have the model collapse altogether."
Agent M's lips twitched into a knowing smile. "Precisely. There are three main challenges with machine unlearning. First, we don't know how much each data point impacts the model. This is because we randomize the data during training. Second, training a model happens incrementally."
"So how a model learns from a given data point depends on the data it's seen before and what it sees after," interrupted Alex.
"Exactly, and the third challenge is what you said earlier. If unlearning is done poorly, it can lead to catastrophic degradation," completed Agent M.
Alex's skepticism waned as the urgency of the situation set in. The Neuralyzer, through its complex algorithms, would purge AIFinn's data footprint from the Oracle.
"So, what are our options?"
"Are you still planning to call security?" said Agent M with a wink. Alex shook his head, smiling sheepishly.
"Good, let's get a move on. We don't have much time. In the best case, we completely remove the influence of the data on the model," she explained, gesturing to the hologram where a chart shimmered into existence.
"But in most cases, we can only approximate a perfect outcome."
"I'm guessing it's the latter in our case?" Alex asked, concerned.
"Most likely. Unless you want to retrain the model from scratch after removing AIFinn's data from your dataset. Don't worry. By the time we're done, the Oracle will be clean of AIFinn influence without too much of a drop in performance2," she replied reassuringly.
The next few minutes went by in silent anticipation as Agent M deployed the Neuralyzer on the Oracle. Alex watched with bated breath, worried and excited at the same time. Was it a dream? He'd barely slept over the past two days. Was this all a cruel joke by his supervisors to see how he'd react? If "Agent M" was acting, she did an incredible job.
But it was too late now. The servers sprang alive, streams of data pulsing through their electric veins.
"What's the Neuralyzer doing?" enquired Alex, worried.
"Relax, Mr. Ridgeway," said Agent M calmly. "It's analyzing the Oracle and identifying the best unlearning algorithm to use. I assume you have replicates of the Oracle's weights in other servers?"
"Yes, we have robust model and data versioning in place, thanks to my team," he replied.
"Good. We'll need the original Oracle's weights after the Neuralyzer is done. We need to compare the unlearned Oracle model with the original to measure performance," she said shrewdly.
The hologram changed again, this time showing a set of options, one brighter than the others.
"Guess the Neuralyzer will be using influence functions on the Oracle. Given the time we have and the complexity of the model, that seems to make sense," concluded Agent M.
"But the performance…" started Alex. "Will be taken care of by a fine-tuning process afterward," she finished.
Agent M interacted with the hologram, flicking past options at warp speed. Eventually, she chose one, and the Neuralyzer began to glow with a red light.
"Are you sure this will work?" asked Alex, fatigue and a firehose of information overwhelming him.
"Yes, it's our best bet. The Neuralyzer will measure the influence of each data point on the model's performance. It will then adjust the weights of the remaining data to compensate for removing specific data points. This will allow us to remove AIFinn's data without significant performance degradation.3" explained Agent M.
"That's a lot of jargon you've thrown at an overly caffeinated-sleep-deprived-emotionally-wrung data engineer," cried Alex in an exasperated tone.
"I forgot how this company treats you. My apologies, Mr. Ridgeway," said Agent M sympathetically.
"Imagine you're making a fruit salad. Each piece of fruit is a data point in your dataset. Your goal is to achieve a particular flavor balance. Suppose you added a very tart Kiwi to the salad. This piece has a strong influence and affects the taste significantly. Your salad is ruined, and you'll have to start over. But imagine you had a way to know the impact of this Kiwi without actually adding it to the salad. Then you'd not add it and not have to make a new salad from scratch, right? That's what influence functions do. They tell you how much the model's decisions would change if you add a data point into the mix. Make sense?" she paused.
"Now I'm hungry too, but yeah, I get it," chuckled Alex.
The Neuralyzer buzzed with activity. The room was filled with its soft hum. As it worked, Alex felt a surreal sense of calm. He watched as streams of data pulsed on the screen. Machine unlearning was in progress. Alex was so transfixed by its machinations that he lost track of time. The rain had stopped, and pre-dawn light filtered through the windows.
"I think it's done, Mr. Ridgeway," called Agent M, bringing Alex out of his reverie. Alex looked at the Neuralyzer. Its glow had changed into a peaceful green.
"How do we know it worked?" asked Alex, with a tinge of skepticism.
"The Neuralyzer ran a few verification algorithms, and the drop in performance is within reason," said Agent M confidently. She waved her hand, and the hologram changed again, this time showing a chart of verification methods.
"You can run your unit tests and check for yourself," she smiled, enjoying the look of disbelief on his face.
Alex promptly ran the test suite from his console and watched with bated breath. One by one, all the tests passed until the last one reported a slight drop in performance.
"Incredible!" shouted Alex, "I can live with that. How can I thank you?"
"You've done more today than most of the employees in this company have done in their lifetime. We could use someone like you. But…" paused Agent M.
"I can't speak a word of what happened here, can't I? Even if I did, who'd believe me?" smiled Alex.
"But we can't take that chance," replied Agent M, raising a cylindrical metallic object with a glimmering light.
Alex looked into her eyes, finding a hint of regret. "So, I just... forget? All of it?" he asked, his heart sinking with the realization of what was to come.
"It's not just about forgetting, Alex. It's about protecting. You, your team, all of us." She raised the device, its light casting long shadows. "Thank you," she said, the words filled with an earnest finality.
A flash of blinding light enveloped the room, and for a moment, Alex felt as if he was floating in a sea of forgotten memories and then darkness.
The bustle of PatterNetics snapped Alex back to reality. His eyes slowly adjusted to the brightness, and he found himself at his workstation. A sense of loss, like the echo of a dream, lingered. He knew something monumental had transpired, but the details slipped through his grasp like sand. Gathering himself gingerly, he turned around.
Frank loomed over him, his brow furrowed with a concern that didn't quite reach his eyes. "Alex, what did you do? The Oracle's performance stats have dipped slightly. What happened?"
Alex's gaze wandered back to his screen, where lines of code still danced. "Don't know," he said truthfully, the fog in his mind obscuring the events that had transpired. "Not a clue."
Just then, his team gathered around, their eyes full of joy and curiosity.
"How did you do it?" asked Ben, unable to contain his excitement any further. "Sector 7 was a breeze compared to this. The Oracle feels different. There's no hint of AIFinn's data in it. Not a trace. But it's working as well as ever! How on earth did you do it alone?"
Everyone was staring at Alex, expecting answers. He paused, massaging his temples as if that could piece together the fragments of his memory.
"I don't remember anything after you guys went to Sector 7," he stated, his voice firmer than before. "But maybe some things are meant to be forgotten," he continued, feeling the truth of the words without knowing their origin.
Not one to obsess about details or quality, Frank did his signature eye-roll and said, "Alright then. Let's forget about it. Management will want to give you a raise and promotion for handling this issue discreetly."
Alex forced a hollow smile, fingers absentmindedly reaching into his pocket. They brushed against the textured surface of a business card. As he drew it out, the embossed 'M' seemed to spark a glimmer of recognition in his mind—a puzzle piece falling into place without revealing the complete picture.
"I quit," he said suddenly.
"What? Are you mad? You're on the verge of a promotion," exclaimed Frank in complete shock.
"Wasn't it my call and my neck on the line, Frank?" smiled Alex.
His team looked at him and then Frank and then him again. They couldn't believe it either.
"Why?" asked Jenna, puzzled.
"I have some unlearning to do," said Alex calmly. With a sense of purpose, he rose, slung his backpack over one shoulder, and made his way to the door. A secret smile tugged at the corner of his mouth as he embraced the journey ahead.
Names, characters, businesses, places, events, locales, and incidents are either the products of my imagination or used in a fictitious manner. Any resemblance to actual persons, living or dead, or actual events is purely coincidental.
This work intends to entertain. Any social, political, or moral commentary is a byproduct of the story's narrative.
Machine Unlearning: Machine unlearning is the process of removing specific data from a machine learning model, effectively making the model "forget" that data. This is done without the need to retrain the model from scratch. The goal is to ensure that the model no longer retains or uses the information from the unlearned data for future predictions or decisions.
Types of Machine Unlearning: Unlearning is categorized into three broad types:
Exact Unlearning: The distribution of the unlearned model is indistinguishable from that of a model retrained from scratch without the forgotten data.
Strong Unlearning: The distributions of the two models are approximately indistinguishable.
Weak Unlearning: Only the distributions of the two models' final activations are indistinguishable.
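Exact unlearning sets the gold standard the other two types are measured against: retrain from scratch on everything except the forget set, so the resulting model has, by construction, never seen the forgotten data. A minimal sketch (an assumed scikit-learn setup for illustration):

```python
# Exact unlearning as the gold-standard baseline: retrain from scratch
# on the retained data only, then compare against the original model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
forget = np.zeros(len(X), dtype=bool)
forget[:30] = True                      # the data to be "forgotten"

original = LogisticRegression(max_iter=1000).fit(X, y)
unlearned = LogisticRegression(max_iter=1000).fit(X[~forget], y[~forget])

# Any approximate unlearning method is judged against this retrained model:
# the closer their output distributions, the stronger the unlearning.
agreement = (original.predict(X) == unlearned.predict(X)).mean()
print(f"Prediction agreement with retrained model: {agreement:.2f}")
```

The catch, as the story dramatizes, is cost: retraining a foundation model from scratch for every deletion request is rarely feasible, which is why approximate methods exist at all.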
Challenges of Machine Unlearning:
Difficult to evaluate results due to inconsistent evaluation metrics
A tradeoff between forgetting the specific data vs. maintaining model performance
Scalability and computational overhead
Unlearning Verification Methods:
Attack-based (e.g., Membership Inference Attacks)
Comparison-based (comparing the unlearned model against one retrained without the forgotten data)
Applications of Machine Unlearning:
Ensuring fairness and removing biases in data
Removing stale or inaccurate data in an online learning system
Ensuring data privacy (complying with GDPR, etc.)
Protection against adversarial attacks
There are many open challenges in this field, but as models become more capable, we need to have ways to make them forget data they shouldn't have learned from in the first place.
Resources To Consider:
Machine Unlearning Literature
These three survey papers do a fantastic job of covering all the research done to help models unlearn data. I highly suggest reading these leisurely over coffee when you have the time.
Machine Unlearning: A Survey (2023) - https://arxiv.org/abs/2306.03558
Machine Unlearning: Solutions and Challenges (2023) - https://arxiv.org/abs/2308.07061
A Survey of Machine Unlearning (2022) - https://arxiv.org/abs/2209.02299
AI Is Dangerous, but Not for the Reasons You Think
This TED talk by Dr. Sasha Luccioni highlights the imminent danger that AI poses. But it isn't what you think it is. I love the way her talk concludes - "We are building a road as we walk it, and we can collectively decide which path we want to go." Please take the time to watch this talk. It's brilliant!
Many thanks to my friends for their thoughtful feedback on my fiction-driven essay.
Note: such a small performance drop isn't typical. If we retrained the model with less data, we'd see a drop. Likewise, other machine unlearning techniques, like adding noise, also cost performance. The goal is to keep this drop as small as possible.