A researcher has scraped nearly 100,000 conversations from ChatGPT that users had set to share publicly and that Google then indexed, creating a snapshot of the sorts of things people are using OpenAI’s chatbot for, and of what they are inadvertently exposing. 404 Media’s testing has found the dataset includes everything from the sensitive to the benign: alleged texts of non-disclosure agreements, discussions of confidential contracts, people trying to use ChatGPT to understand their relationship issues, and lots of people asking ChatGPT to write LinkedIn posts.
The news follows a July 30 Fast Company article which reported “thousands” of shared ChatGPT chats were appearing in Google search results. People have since dug through some of the chats indexed by Google. The dataset of around 100,000 conversations provides a better sense of the scale of the problem, and highlights some of the potential privacy risks in using any sharing features of AI tools.
OpenAI did not dispute the figure of around 100,000 indexed chats when contacted for comment, and provided 404 Media with a statement from the company’s chief information security officer (CISO), Dane Stuckey. “We just removed a feature from [ChatGPT] that allowed users to make their conversations discoverable by search engines, such as Google. This was a short-lived experiment to help people discover useful conversations. This feature required users to opt-in, first by picking a chat to share, then by clicking a checkbox for it to be shared with search engines.”
The researcher provided 404 Media with access to the new dataset. 404 Media granted them anonymity because they were not permitted to talk about the project publicly.
Some of the material in the dataset includes:
- Someone uploading what they describe as a copy of OpenAI’s nondisclosure agreement for visitors to the company’s headquarters
- A user trying to figure out if they should send one last message to an ex-girlfriend who they describe as their “great love,” with ChatGPT then writing a draft of that message
- An owner of a specific named business asking ChatGPT to write a contract for them
- Chats that, although written from an anonymous account, include many identifying details, including people’s names, potentially linking those people to activity on ChatGPT
All of these chats and others are still publicly available on ChatGPT’s website.
The issue ultimately stems from a “share” feature in ChatGPT. Interactions with ChatGPT are private by default. Users can enable the share feature, which moves their interactions to a publicly accessible page, so the user can send a copy of the conversation to others if they wish. Because these pages are public, Google is able to index them.
The share feature creates a predictably formatted link, which allowed people to search Google for the indexed material. As Fast Company noted in its report, it is not clear if ChatGPT users understood they were making their chat publicly available. In response to some user chats being available on search engines, OpenAI removed the feature.
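For illustration, finding those indexed pages required little more than a search-engine query scoped to the shared-chat path. Below is a minimal sketch of that kind of query, assuming the shared conversations sit under a chatgpt.com/share/ URL pattern; the exact path and the example keyword are assumptions for illustration, not details confirmed by OpenAI.

```python
# Illustrative sketch only: builds the sort of search-engine query people used
# to surface indexed shared chats. Assumes shared conversations live under a
# predictable chatgpt.com/share/ path (an assumption for this example).
from urllib.parse import quote_plus

query = 'site:chatgpt.com/share "non-disclosure"'  # hypothetical example keyword
google_url = f"https://www.google.com/search?q={quote_plus(query)}"
print(google_url)
```

The predictability of the link format is the crux: once the path is known, anyone can narrow a search engine to exactly those pages.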
“Ultimately we think this feature introduced too many opportunities for folks to accidentally share things they didn’t intend to, so we’re removing the option. We’re also working to remove indexed content from the relevant search engines. This change is rolling out to all users through tomorrow morning. Security and privacy are paramount for us, and we’ll keep working to maximally reflect that in our products and features,” Stuckey’s statement added.
Google did not provide a statement in time for publication.
Although OpenAI may be working to remove indexed content from Google and other search engines, third parties, such as this researcher, have already grabbed the material en masse.