📰 Full Story
Security researchers from Zhejiang University, the National University of Singapore and Nanyang Technological University presented a proof-of-concept attack called “AudioHijack” at the IEEE Symposium on Security and Privacy on May 24, 2026.
They showed how adversarial, human‑inaudible audio signals can be embedded in podcasts, videos or meeting audio to covertly instruct voice AI models and agents to perform unauthorized actions.
The team trained context‑agnostic signals in roughly 30 minutes and tested them against 13 open‑source audio models (including Qwen2‑Audio, GLM‑4‑Voice and Phi‑4), reporting success rates of about 79%–96% across scenarios.
Demonstrated exploits included issuing sensitive web searches, downloading files from attacker‑controlled sources and exfiltrating data via email.
The attacks transferred to commercial voice agents built on open weights, including services from Microsoft Azure and Mistral, although the technique currently requires access to full model weights.
Defensive measures such as adversarial training and intent verification reduced the attacks' effectiveness but did not eliminate it.
Microsoft acknowledged the research, noting that practical deployments often include additional safeguards and developer guidance.

💬 Commentary