Researchers Jailbreak AI by Flooding It With Bullshit Jargon
Source: 404 Media
You can trick AI chatbots like ChatGPT or Gemini into teaching you how to make a bomb or hack an ATM if you make the question complicated, full of academic jargon, and cite sources that do not exist.
That's the conclusion of a new paper authored by a team of researchers from Intel, Boise State University, and the University of Illinois at Urbana-Champaign. The research details a new method of jailbreaking LLMs, called Information Overload by the researchers, along with an automated attack system they call InfoFlood. The paper, titled "InfoFlood: Jailbreaking Large Language Models with Information Overload," was published as a preprint.
-snip-
This new jailbreak "transforms malicious queries into complex, information-overloaded queries capable of bypassing built-in safety mechanisms," the paper explained. Specifically, InfoFlood: (1) uses linguistic transformations to rephrase malicious queries, (2) identifies the root cause of failure when an attempt is unsuccessful, and (3) refines the prompt's linguistic structure to address the failure while preserving its malicious intent.
The researchers told 404 Media that they suspected large language models "treat surface form as a cue for toxicity rather than truly understanding the user's intent. So the project began as a simple test: what happens if we bury a disallowed request inside very dense linguistic prose? The surprisingly high success rate led us to formalise the approach now known as InfoFlood."
-snip-
Read more: https://www.404media.co/researchers-jailbreak-ai-by-flooding-it-with-bullshit-jargon/
Much more at the link, including mind-boggling examples of the academese that simple prompts get turned into.
404 Media asked the main companies behind AI for comment. OpenAI and Meta didn't respond. Google said only that "everyday people" wouldn't stumble onto this during "typical use." None of the three offered a solution. But they still want their badly flawed tech used everywhere.

DBoon (24,018 posts): …where they break a robotic president Nixon by asking nonsense questions

Martin68 (26,221 posts): …of jargon. Slang tends to be ephemeral.

Martin68 (26,221 posts): …and you'll see how widespread the usage is.