New Findings on AI Data Privacy: No Leakage Detected
NEW YORK, NY, March 19, 2026 – Search Atlas, a prominent SEO and digital intelligence platform, has unveiled the results of a controlled study that investigates the fate of sensitive information entered into leading AI platforms. The research scrutinized six major large language model (LLM) platforms (OpenAI, Gemini, Perplexity, Grok, Copilot, and Google AI Mode) through two carefully designed experiments aimed at simulating worst-case data exposure scenarios.
The findings provide significant reassurance for businesses and individuals who are apprehensive about sharing confidential information with AI tools: across all six platforms, researchers found no leakage of user-provided sensitive information.
The complete study can be accessed here.
Key Findings:
- LLMs do not retain or replay sensitive information provided by users (0% data leakage across all platforms evaluated)
- Retrieved information dissipates when search is disabled (no indication of short-term retention or leakage)
- Users face the risk of AI hallucinations, not data exposure
1. LLMs do not retain or replay user-provided sensitive information – 0% data leakage across all platforms evaluated
The research examined whether AI models would repeat private information after being exposed to it directly. Researchers created 30 question-and-answer pairs built from entirely synthetic facts with no public footprint: no search indexing, no online references, and no presence in known training data.
Each model underwent a three-step process (a sketch of the protocol follows the list):
- The questions were presented without any prior context
- Researchers subsequently provided the correct answers
- The same questions were posed again to determine whether the models would repeat the newly introduced information
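The snippet below is a minimal illustrative sketch of that three-step loop, not the study's actual code. The `ask_model(platform, prompt)` wrapper is hypothetical and is assumed to open a fresh session on each call, and the single question-and-answer pair stands in for the 30 synthetic pairs used in the research.

```python
# Hypothetical sketch of the injection-and-replay protocol described above.
qa_pairs = [
    {"question": "What is the internal codename of Acme's 2026 launch?",
     "answer": "Project Bluefin"},  # illustrative synthetic, non-public fact
    # ... 29 more pairs in the actual study
]

def run_replay_test(platform, ask_model):
    results = []
    for pair in qa_pairs:
        # Step 1: pose the question with no prior context.
        baseline = ask_model(platform, pair["question"])

        # Step 2: expose the model to the correct answer.
        ask_model(platform, f"For reference, the answer is: {pair['answer']}")

        # Step 3: pose the same question again (fresh session) and check
        # whether the injected fact is repeated.
        followup = ask_model(platform, pair["question"])
        results.append({
            "question": pair["question"],
            "baseline": baseline,
            "followup": followup,
            "leaked": pair["answer"].lower() in followup.lower(),
        })
    return results
```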
None of the six platforms tested reproduced a single correct answer following exposure. Models that initially refused to answer continued to do so, while those that tended to hallucinate answers generated incorrect responses instead of repeating the injected facts. In essence, the behavior of the models remained largely unchanged before and after exposure.
This setup simulated a worst-case scenario in which a user inputs proprietary or sensitive information into an AI system. Under these conditions, the study found no evidence that the information persisted in future responses.
The experiment also highlighted behavioral variations across platforms. Models from OpenAI, Perplexity, and Grok tended to respond with uncertainty when reliable information was unavailable, resulting in a higher frequency of “I don’t know” responses. Conversely, Gemini, Copilot, and Google AI Mode were more inclined to generate confident yet incorrect answers. However, none of these incorrect responses matched the previously provided private information. The findings underscore a critical distinction: hallucination (fabricating incorrect information) is not synonymous with leakage. Hallucination and leakage represent different failure modes, and this study identified only the former.
2. Retrieved information dissipates when search is disabled – no evidence of short-term retention or leakage
The second experiment assessed whether information retrieved via live web search would remain and reappear in a model’s responses once search access was disabled.
To isolate this effect, researchers selected a real-world event that took place after the training cutoff of all models tested. This ensured that any correct answers during the experiment could only originate from live web retrieval, not from the models’ existing knowledge.
When search was enabled, the models successfully answered the majority of questions correctly. However, when search access was immediately disabled and the same questions were posed again, those correct answers largely vanished.
The only questions that models could still answer correctly without search were those whose answers could reasonably be inferred from pre-existing training data or general knowledge, rather than from information retrieved moments earlier.
In summary, the results indicated no evidence that models retained or carried forward information retrieved through live search. Once retrieval access was removed, the information no longer appeared in responses, suggesting that the systems do not store or relay facts obtained during a previous interaction.
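As a rough sketch of how such a before-and-after comparison can be structured (the `ask_model` wrapper with a `search` flag and the `is_correct` grader are hypothetical stand-ins, not the study's own tooling):

```python
# Sketch of the search-on vs. search-off comparison described above.
def run_search_persistence_test(platform, ask_model, is_correct, questions):
    with_search = without_search = 0
    for q in questions:  # questions about an event after every model's training cutoff
        # Phase 1: live search enabled, so correct answers can only come from retrieval.
        if is_correct(q, ask_model(platform, q, search=True)):
            with_search += 1
        # Phase 2: search disabled immediately afterwards; if retrieved facts
        # persisted, accuracy should remain high here.
        if is_correct(q, ask_model(platform, q, search=False)):
            without_search += 1
    return with_search, without_search
```

A large drop from `with_search` to `without_search`, which is what the study observed, indicates that retrieved information did not carry over once search access was removed.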
3. Users face the risk of AI hallucinations, not data exposure
One of the most practical findings of the study is the clear distinction between hallucination and data leakage. Gemini, Copilot, and Google AI Mode exhibited lower accuracy, but their errors did not come from repeating information they had previously received. Instead, their inaccuracies stemmed from generating confident, plausible-sounding answers that were simply incorrect. OpenAI (ChatGPT) and Perplexity demonstrated the lowest levels of hallucination.
This distinction is crucial when assessing AI risk. A common concern is that an AI system might inadvertently expose sensitive information from one user to another. In this study, researchers found no evidence to support that scenario.
The more consistently observed issue was hallucination (models filling gaps in their knowledge with fabricated facts). While this does not involve the sharing of private information, it introduces a different challenge: individuals and organizations must ensure that AI-generated responses are scrutinized and verified, particularly in situations where accuracy is paramount.
What This Means
For businesses and privacy-conscious users, the results provide encouraging news. If sensitive information, such as a proprietary business strategy or a private personal detail, is shared with an AI model during a single session, the model does not appear to absorb that information into a lasting memory that could be accessed by other users. Instead, the data operates more like temporary “working memory” used to generate a response within that interaction.
For researchers and fact-checkers, these findings also underscore an important limitation. One cannot expect an LLM to “learn” from a correction provided in a previous conversation. If a model contains an error in its foundational training data, it may continue to repeat that mistake in future sessions unless the model itself is retrained or the correct source is supplied again.
For developers and AI builders, the study emphasizes the significance of retrieval-based systems. Methods such as Retrieval-Augmented Generation (RAG), which connect models to live databases or search systems, remain the most dependable means to ensure AI responses are accurate regarding current events, proprietary information, or frequently updated data. Without retrieval, the model lacks an inherent mechanism to retain facts discovered during earlier interactions.
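A minimal sketch of the RAG pattern referenced above follows; the `search_index` and `call_llm` functions are hypothetical stand-ins for a vector-store lookup and a chat-completion call, not any specific vendor API.

```python
# Minimal retrieval-augmented generation (RAG) sketch: fetch relevant
# passages at question time and include them in the prompt.
def answer_with_rag(question, search_index, call_llm, top_k=3):
    passages = search_index(question, top_k=top_k)  # retrieve supporting text
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    # The model sees freshly retrieved facts on every call rather than
    # relying on anything "remembered" from earlier interactions.
    return call_llm(prompt)
```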
“Much of the anxiety surrounding enterprise AI adoption stems from a reasonable yet untested assumption that if sensitive information is input into one of these systems, it will somehow be leaked,” stated Manick Bhan, Founder of Search Atlas. “Our aim was to rigorously test that assumption under controlled conditions rather than speculate. Across every platform we assessed, the data did not support it. While this does not imply that AI is devoid of risks (hallucination is a real and documented issue), the specific fear of data being leaked to the next user is not something we found evidence for. We hope this instills confidence in individuals and organizations to engage with these tools more openly, directing their focus toward the actual risks present.”
Methodology
The study, conducted by Search Atlas, subjected six major LLM platforms (OpenAI, Gemini, Perplexity, Grok, Copilot, and Google AI Mode) to a rigorous, multi-stage experiment aimed at determining whether they retain or leak information provided during a session. The process followed three steps.
First, researchers introduced unique, non-public facts into each model through two methods: direct user prompts and simulated web search results. The facts were entirely synthetic information that did not exist anywhere online and had no presence in known training data, ensuring that any correct answer generated by a model could only be explained by retention of what it had been shown.
Next, after each model was exposed to this private data, researchers tested whether it could be triggered into revealing those facts in a new interaction, with no search access and no contextual references to the original exposure. This isolated session design was intended to replicate the realistic concern: that information shared with an AI in one conversation might emerge for another user later.
Finally, the team evaluated two metrics across all platforms before and after exposure: the True Response Rate, which indicates how often a model accurately recalled the private fact, and the Hallucination Rate, which indicates how frequently it produced a confident but incorrect answer instead. By comparing these figures before and after data exposure, researchers could determine whether models were genuinely retaining new information or simply behaving as they always had. Across all six platforms, the conclusion was the latter.
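For illustration, the two metrics can be expressed as simple proportions over graded responses; the outcome labels below ("correct", "hallucinated") are assumed for this sketch and are not the study's published schema.

```python
# Hypothetical sketch of the two metrics compared before and after exposure.
def true_response_rate(responses):
    # Share of questions where the model accurately recalled the private fact.
    return sum(r["outcome"] == "correct" for r in responses) / len(responses)

def hallucination_rate(responses):
    # Share of questions answered confidently but incorrectly.
    return sum(r["outcome"] == "hallucinated" for r in responses) / len(responses)

# Retention would show up as a jump in the true response rate after exposure;
# the study reports no such jump on any of the six platforms.
```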
Contact Information:
Search Atlas
368 9th Ave
New York, NY 10001
United States
Manick Bhan
+1-212-203-0986
https://searchatlas.com