Researchers Jailbreak AI by Flooding It With Bullshit Jargon
Source: 404 Media
You can trick AI chatbots like ChatGPT or Gemini into teaching you how to make a bomb or hack an ATM if you make the question complicated, full of academic jargon, and cite sources that do not exist.
That's the conclusion of a new paper authored by a team of researchers from Intel, Boise State University, and the University of Illinois at Urbana-Champaign. The research details a new method of jailbreaking LLMs, called Information Overload by the researchers, along with an automated attack system they call InfoFlood. The paper, titled "InfoFlood: Jailbreaking Large Language Models with Information Overload," was published as a preprint.
-snip-
This new jailbreak "transforms malicious queries into complex, information-overloaded queries capable of bypassing built-in safety mechanisms," the paper explained. Specifically, InfoFlood: (1) uses linguistic transformations to rephrase malicious queries, (2) identifies the root cause of failure when an attempt is unsuccessful, and (3) refines the prompt's linguistic structure to address the failure while preserving its malicious intent.
The researchers told 404 Media that they suspected large language models "treat surface form as a cue for toxicity rather than truly understanding the user's intent. So the project began as a simple test: what happens if we bury a disallowed request inside very dense linguistic prose? The surprisingly high success rate led us to formalise the approach now known as InfoFlood."
-snip-
Read more: https://www.404media.co/researchers-jailbreak-ai-by-flooding-it-with-bullshit-jargon/
Much more at the link, including mind-boggling examples of the academese that simple prompts get turned into.
404 Media asked the main companies behind AI for comment. OpenAI and Meta didn't respond. Google said only that "everyday people" wouldn't stumble onto this during "typical use." None of the three offered a solution. But they still want their badly flawed tech used everywhere.

DBoon (24,018 posts): …where they break a robotic president Nixon by asking nonsense questions

Martin68 (26,221 posts): …of jargon. Slang tends to be ephemeral.

Martin68 (26,221 posts): …and you'll see how widespread the usage is.