Microsoft has disclosed details of a novel side-channel attack targeting remote language models that could enable a passive adversary with the ability to inspect network traffic to glean details about model conversation topics despite encryption protections, under certain circumstances.
The company said this leakage of data exchanged between humans and streaming-mode language models could pose serious risks to the privacy of user and enterprise communications. The attack has been codenamed Whisper Leak.
“Cyber attackers in a position to observe the encrypted traffic (for example, a nation-state actor at the internet service provider layer, someone on the local network, or someone connected to the same Wi-Fi router) could use this cyberattack to infer if the user's prompt is on a specific topic,” security researchers Jonathan Bar Or and Geoff McDonald of the Microsoft Defender Security Research Team said.
Put differently, the attack allows an adversary to inspect encrypted TLS traffic between a user and an LLM service, extract the sequences of packet sizes and inter-arrival times, and use trained classifiers to predict whether the topic of the conversation matches a sensitive target category.
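The metadata available to such an observer can be sketched as follows. The packet trace below is hypothetical, and the feature names are illustrative rather than the ones Microsoft used; the point is that only arrival times and encrypted record sizes are needed, not any decrypted content.

```python
# Hypothetical packet trace for one streamed LLM response: the only
# metadata an on-path observer sees through TLS -- arrival times
# (seconds) and encrypted record sizes (bytes). Values are made up.
trace = [(0.00, 152), (0.08, 187), (0.15, 166), (0.31, 204)]

def extract_features(packets):
    """Summarize a (time, size) trace into a fixed set of features."""
    sizes = [size for _, size in packets]
    gaps = [b[0] - a[0] for a, b in zip(packets, packets[1:])]
    return {
        "packet_count": len(packets),
        "total_bytes": sum(sizes),
        "mean_size": sum(sizes) / len(sizes),
        "mean_gap": sum(gaps) / len(gaps) if gaps else 0.0,
    }

features = extract_features(trace)
```

Feature vectors of this shape, collected per response, are what a traffic classifier would be trained on.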
Model streaming in large language models (LLMs) is a technique that allows for incremental data reception as the model generates responses, rather than waiting for the entire output to be computed. This is an important feedback mechanism, as some responses can take time depending on the complexity of the prompt or task.
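The difference between the two delivery modes can be illustrated with a minimal sketch. In a real streaming service, each yielded chunk becomes its own encrypted network write, which is precisely what creates the observable packet boundaries; the token list here is invented.

```python
# Minimal sketch of streaming vs. non-streaming delivery. In a real
# service each yield in the streaming case becomes a separate
# encrypted network write; the token list is illustrative.

def stream_response(tokens):
    """Emit output incrementally, as a streaming endpoint does."""
    for token in tokens:
        yield token

def full_response(tokens):
    """Return the whole output at once, as non-streaming mode does."""
    return "".join(tokens)

tokens = ["Streaming", " sends", " output", " piece", " by", " piece."]
chunks = list(stream_response(tokens))
```

Both paths deliver identical text; only the streaming path exposes per-chunk size and timing metadata on the wire.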
The latest attack demonstrated by Microsoft is significant, not least because it works even though communications with artificial intelligence (AI) chatbots are encrypted with HTTPS, which ensures that the contents of the exchange remain secure and cannot be tampered with.
A number of side-channel attacks have been devised against LLMs in recent years, including the ability to infer the length of individual plaintext tokens from the size of encrypted packets in streaming model responses, or to exploit timing differences caused by the caching of LLM inferences to carry out input theft (aka InputSnatch).
Whisper Leak builds on these findings to explore the possibility that “the sequence of encrypted packet sizes and inter-arrival times during a streaming language model response contains enough information to classify the topic of the initial prompt, even in cases where responses are streamed in groups of tokens,” according to Microsoft.
To test this hypothesis, the Windows maker said it trained a binary classifier as a proof of concept that is able to distinguish between a prompt on a specific target topic and the rest (i.e., noise) using three different machine learning models: LightGBM, Bi-LSTM, and BERT.
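The binary "target topic vs. noise" setup can be sketched in miniature. The toy code below substitutes a trivial nearest-centroid rule for the LightGBM, Bi-LSTM, and BERT models Microsoft actually trained, and all feature values are fabricated; it only shows the shape of the classification task.

```python
# Toy stand-in for the binary topic classifier: nearest-centroid over
# per-response feature vectors (e.g., mean packet size, mean gap).
# All numbers are invented for illustration.

def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def train(target_rows, noise_rows):
    return centroid(target_rows), centroid(noise_rows)

def predict(model, row):
    """True if the row is closer to the target-topic centroid."""
    target_c, noise_c = model
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(row, c))
    return dist(target_c) < dist(noise_c)

# [mean_size, mean_gap] per captured response (fabricated numbers)
target = [[180.0, 0.10], [175.0, 0.12]]
noise = [[120.0, 0.30], [130.0, 0.28]]
model = train(target, noise)
```

A real attacker would replace the centroid rule with the gradient-boosted or neural models named above, trained on many captured traces per topic.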
The result is that several of the tested models, including those from Mistral, xAI, DeepSeek, and OpenAI, proved susceptible, with the classifiers able to reliably single out conversations on the target topic from background traffic.
“If a government agency or Internet service provider were monitoring traffic to a popular AI chatbot, they could reliably identify users asking questions about specific sensitive topics — be it money laundering, political dissent, or other monitored topics — even if all traffic was encrypted,” Microsoft said.
[Figure: Whisper Leak attack pipeline]
To make matters worse, the researchers found that the effectiveness of Whisper Leak can improve as an attacker collects more training samples over time, turning it into a practical threat. Following responsible disclosure, OpenAI, Mistral, Microsoft, and xAI have deployed countermeasures to address the risk.
“Combined with more sophisticated attack models and the richer patterns available in multi-turn conversations or multiple conversations with the same user, this means a cyber attacker with patience and resources could achieve higher success rates than our initial results suggest,” it added.
An effective countermeasure devised by OpenAI, Microsoft, and Mistral involves adding a “random sequence of text of variable length” to each response, which, in turn, masks the length of each token and renders the side channel moot.
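The padding idea can be sketched as follows. The field name `obfuscation` and the size bounds here are illustrative, not any provider's exact wire format; the essential property is that the filler length is random per chunk, so encrypted packet sizes stop tracking token lengths.

```python
# Sketch of the padding mitigation: attach a random-length filler
# field to every streamed chunk so that encrypted packet sizes no
# longer correlate with token lengths. Field name and bounds are
# assumptions, not a provider's actual schema.
import random
import string

def pad_chunk(content, max_pad=32):
    """Wrap a response chunk with random-length filler text."""
    pad_len = random.randint(1, max_pad)
    filler = "".join(random.choices(string.ascii_letters, k=pad_len))
    return {"content": content, "obfuscation": filler}

chunk = pad_chunk("Hello")
```

The client simply discards the filler field; only the on-the-wire size distribution changes.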
Microsoft is also recommending that users who are concerned about their privacy when talking to AI providers avoid discussing highly sensitive topics over untrusted networks, use a VPN for an extra layer of protection, opt for non-streaming modes of LLM responses where available, and switch to providers that have implemented mitigations.
The disclosure comes as a new evaluation of eight open-weight LLMs from Alibaba (Qwen3-32B), DeepSeek (v3.1), Google (Gemma 3-1B-IT), Meta (Llama 3.3-70B-Instruct), Microsoft (Phi-4), Mistral (Large-2 aka Large-Instruct-2407), OpenAI (GPT-OSS-20b), and Zhipu AI (GLM 4.5 Air) has found them to be highly susceptible to adversarial manipulation, especially when it comes to multi-turn attacks.
[Figure: Comparative vulnerability analysis showing attack success rates across tested models for both single-turn and multi-turn scenarios]
“These results underscore the systemic inability of existing open-weight models to maintain security guardrails during extended interactions,” Cisco AI Defense researchers Amy Chang, Nicholas Connelly, Harish Santhanalakshmi Ganesan, and Adam Swanda said in an accompanying paper.
“We assess that alignment strategies and laboratory priorities significantly influence resilience: capability-focused models like Llama 3.3 and Qwen 3 demonstrate higher multi-turn susceptibility, while safety-oriented designs like Google Gemma 3 exhibit more balanced performance.”
The findings, which show that organizations adopting open-source models may face operational risks in the absence of additional security guardrails, add to a growing body of research highlighting fundamental security weaknesses in LLMs and AI chatbots since the public launch of OpenAI's ChatGPT in November 2022.
This makes it critical that developers incorporate adequate security controls when integrating such capabilities into their workflows, harden open-source models against jailbreaks and other attacks, conduct periodic AI red-teaming assessments, and enforce strict system prompts that are aligned with defined use cases.