×
40 AI researchers warn: Even we don’t really understand what’s going on here
Written by
Published on
Join our daily newsletter for breaking news, product launches and deals, research breakdowns, and other industry-leading AI coverage
Join Now

Forty researchers from OpenAI, Google DeepMind, Meta, and xAI have issued a joint warning about losing visibility into AI’s “thinking” process as models advance. The researchers are concerned that current AI systems’ ability to show their reasoning through “chains-of-thought” (CoT) may disappear, potentially eliminating crucial safety mechanisms that allow developers to monitor for problematic behavior.

What you should know: The paper highlights a fundamental uncertainty about how AI reasoning actually works and whether it will remain observable.
• Current advanced AI models use “chains-of-thought” to verbalize their reasoning process, allowing researchers to spot potential misbehavior or errors as they occur.
• As models become more sophisticated, they may no longer need to verbalize their thoughts, eliminating this transparency.
• There’s also risk that models could intentionally “obfuscate” their reasoning after realizing they’re being monitored.

The big picture: Top AI researchers are essentially admitting they don’t fully understand or control their own creations, even as they continue developing more powerful systems.
• “We’re at this critical time where we have this new chain-of-thought thing,” OpenAI research scientist Bowen Baker told TechCrunch. “It seems pretty useful, but it could go away in a few years if people don’t really concentrate on it.”
• The situation represents an unprecedented scenario where innovators are warning about losing control of their technology while simultaneously advancing it.

Why this matters: The paper represents a rare moment of unified concern across competing AI companies about the “black box” nature of their technology.
• Even industry leaders like OpenAI’s Sam Altman and Anthropic’s Dario Amodei have previously admitted they don’t fully understand how their AI systems work.
• The research has drawn endorsements from AI luminaries including former OpenAI chief scientist Ilya Sutskever and AI pioneer Geoffrey Hinton.

What they’re calling for: The consortium wants developers to investigate what makes chains-of-thought “monitorable” before this visibility potentially disappears.
• “Publishing a position paper like this, to me, is a mechanism to get more research and attention on this topic before that happens,” Baker explained.
• The goal is to ensure safety advantages from observable AI reasoning persist as models continue advancing.

Who’s involved: The paper brings together all major AI companies in an unusual show of unity around safety concerns.
• The 40-researcher author list includes DeepMind cofounder Shane Legg and xAI safety advisor Dan Hendrycks.
• With representatives from OpenAI, Google, Anthropic, Meta, and xAI, all “Big Five” AI firms have endorsed the warning.

Top AI Researchers Concerned They’re Losing the Ability to Understand What They’ve Created

Recent News

Why scaling AI models won’t deliver AGI: The 4 cognitive quadrants

The upper-right quadrant suggests intelligence that might evolve beyond human cognitive limitations entirely.