
highplainsdem

(63,221 posts)
Tue May 26, 2026, 04:48 PM 8 hrs ago

AI Just Isn't Right (Wired, 5/26/26 - a human fact-checker FTW over AI)

https://www.wired.com/story/fact-checking-ai/

-snip-

In any article that comes across WIRED’s fact-checking desk, there’s usually a decent amount of “b-matter”: statistics, news events, quotes, anything that helps contextualize the topic. Fact-checkers tend to Google this basic information, and that process, in the form of the search engine’s dreaded AI Overviews, constitutes my main interaction with AI. In my professional opinion, it’s unusable—wrong—about a third of the time.

This might be a generous assessment, though. A March 2025 study from the Tow Center for Digital Journalism found that more than 60 percent of responses from AI-powered search engines were inaccurate. A BBC study puts the wrongness of chatbots closer to 45 percent, the number I see cited more often. Because percentages are distancing, let me put this more plainly: AI could be wrong about half the time.

Does it matter which model? Elon Musk has said Grok is the smartest, but I haven’t seen much research that agrees. Claude led the pack in RealFactBench, a fact-checking-focused benchmark test developed by computer scientists in China and the UK last year. It scored 73 percent accuracy across all metrics. (To be fair, Grok was not assessed.) Another benchmark, SimpleQA, developed by OpenAI in October 2024, posed more than 4,000 single-answer questions to models from OpenAI and Anthropic. None of the models exceeded 50 percent accuracy. Google updated the benchmark earlier this year, winnowing the question set to 1,000. Gemini 2.5 Pro came out on top, with 55.6 percent accuracy.

Then there’s the models’ own assessments. When I asked ChatGPT how accurate the major LLMs are, it told me that most models had 90 to 96 percent accuracy on some professional-style tests. It then offered a link, confusingly, to a paper on a sleep medicine certification exam. On “general real-world questions,” it simply offered me the rate at which models like it have been shown to hallucinate: 1 to 2 percent, apparently, though when I tried to click through to that referenced source, it didn’t exist.

-snip-
8 replies
AI Just Isn't Right (Wired, 5/26/26 - a human fact-checker FTW over AI) (Original Post) highplainsdem 8 hrs ago OP
K&R'D snot 8 hrs ago #1
Wired does some stellar reporting. yellow dahlia 6 hrs ago #2
Including on political issues, despite whining from rightwing readers who want the magazine's editors highplainsdem 2 hrs ago #5
They report on truth. Truth is "left" leaning. yellow dahlia 2 hrs ago #8
Techbros: "THAT'S WHY WE NEED MORE DATA CENTERS!!!!!!!" durablend 6 hrs ago #3
Yes, with more stolen data, and then their flawed tech will FINALLY work. highplainsdem 2 hrs ago #6
AI, who are The Beatles... lame54 6 hrs ago #4
I would not be surprised if chatbots have sometimes gotten their names wrong. highplainsdem 2 hrs ago #7

highplainsdem

(63,221 posts)
5. Including on political issues, despite whining from rightwing readers who want the magazine's editors
Tue May 26, 2026, 10:47 PM
2 hrs ago

and writers to stay out of politics.

yellow dahlia

(6,546 posts)
8. They report on truth. Truth is "left" leaning.
Tue May 26, 2026, 11:00 PM
2 hrs ago

Truth has a liberal "bias".

Our reality has a liberal "bias".
