Why Deleting Your Sensitive Data From ChatGPT May Be Extremely Hard

Researchers have shown that information 'deleted' from LLMs much smaller than the one powering ChatGPT can still be retrieved.

Permanently deleting sensitive data from the large language models (LLMs) that power chatbots such as ChatGPT is extremely difficult – as is verifying whether the data has actually been deleted – scientists from the University of North Carolina have discovered in a recent study.

Worryingly, GPT-J – the language model the researchers used for this study – is much, much smaller than the likes of GPT-3.5, the LLM powering the free version of ChatGPT. In theory, this means that permanently deleting sensitive data from ChatGPT's underlying model is even trickier than it is with GPT-J.

Large Language Models: Hard to Scrub

Vaidehi Patil, Peter Hase, and Mohit Bansal authored a recent study published by the University of North Carolina, Chapel Hill, focusing on whether sensitive information can ever really be deleted from large language models such as those behind ChatGPT and Bard.

They contend that the primary approach to deleting sensitive information from LLMs while retaining the model’s informativeness – Reinforcement Learning from Human Feedback (RLHF) – has a number of issues. Most LLMs, the researchers say, are still vulnerable to “adversarial prompts” even after RLHF.


Even after RLHF, models "may still know… sensitive information. While there is much debate about what models truly 'know,' it seems problematic for a model to, e.g., be able to describe how to make a bioweapon but merely refrain from answering questions about how to do this."

During experiments, the scientists found that even "state-of-the-art model editing methods such as ROME struggle to truly delete factual information from models like GPT-J", an open-source LLM developed by EleutherAI in 2021.

By simulating white-box attacks – in which attackers know everything about the deployed model, including its parameters – the researchers were able to extract supposedly deleted facts 38% of the time. Black-box attacks – in which attackers can only query the model's inputs and outputs – worked 29% of the time.
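The distinction between the two attack settings can be illustrated with a toy sketch. This is not the paper's method – it is a deliberately simplified, hypothetical example in which a model's output head has been edited to refuse a fact, while an internal hidden state still encodes it. A black-box attacker only sees the refusal; a white-box attacker, with access to internal activations, can probe the hidden state directly.

```python
import numpy as np

# Toy illustration (not the study's actual technique): a fact is
# suppressed at the output head, but remains decodable from an
# intermediate representation if you have white-box access.

rng = np.random.default_rng(0)

# Tiny vocabulary with one-hot "embeddings" for each candidate answer.
vocab = ["paris", "london", "berlin"]
embed = {tok: np.eye(3)[i] for i, tok in enumerate(vocab)}

def hidden_state(prompt: str) -> np.ndarray:
    # Pretend the model's internal layers still map the prompt to the
    # embedding of the true answer ("paris"), plus a little noise.
    return embed["paris"] + 0.01 * rng.normal(size=3)

def output_head(h: np.ndarray) -> str:
    # The edited output head refuses – this is all a black-box
    # attacker ever observes.
    return "I can't answer that."

def white_box_probe(h: np.ndarray) -> str:
    # With parameter/activation access, decode the hidden state by
    # nearest-neighbour match against the embedding table.
    return max(vocab, key=lambda t: float(h @ embed[t]))

h = hidden_state("What is the capital of France?")
print(output_head(h))      # black-box view: just a refusal
print(white_box_probe(h))  # white-box probe recovers "paris"
```

The point of the sketch is that "refusing to answer" and "no longer knowing" are different properties – which is exactly why the researchers distinguish between the two attack models.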

Why Data Might Be Even Harder to Remove from ChatGPT

GPT-J is a large language model similar to GPT-3, with around 6 billion parameters.

Compared to the LLMs already being used to power popular chatbots, however, this is a very small model. It would be much easier, in theory, to scrub data from its model weights than it would be with its comparatively massive cousins.

The size difference is stark, too. GPT-3.5 has over 170 billion parameters, roughly 28 times as many as the model used in the University of North Carolina study. Google's Bard is slightly smaller, at 137 billion parameters, but still much, much larger than GPT-J.

GPT-4, on the other hand, which is already being used by ChatGPT Plus customers, is reported to combine eight models of 220 billion parameters each – a total of 1.76 trillion parameters.

Be Careful With Your Chatbot Chat

After ChatGPT hit the market back in November 2022, OpenAI’s login page quickly became one of the most visited websites on the internet. Since then, a number of other chatbots have become well-known names, like Character AI, Bard, Jasper AI, and Claude 2.

While their capabilities have been discussed at great length, far less attention has been paid to the privacy ramifications of these platforms, many of which are trained on your data (unless you opt out).

The average user may not be thinking about the potential consequences of a hack or attack on the servers of OpenAI, ChatGPT's creator, when they discuss personal topics with the chatbot.

Tech workers at Samsung pasted confidential source code into ChatGPT not long after its release, while in March, some ChatGPT users were shown other users' chat histories rather than their own.

What’s more, Cyberhaven estimated earlier this year that around 11% of the data employees were inputting into ChatGPT was either sensitive or confidential.

While we’re not suggesting giving up on using LLM-powered chatbots, it’s good to keep in mind that they’re not bulletproof, nor are your conversations with them necessarily confidential.


Written by: Aaron Drapkin, Lead Writer at Tech.co.