Forty researchers from OpenAI, Google DeepMind, Meta, and xAI have issued a joint warning about losing visibility into AI’s “thinking” process as models advance. The researchers are concerned that current AI systems’ ability to show their reasoning through “chains-of-thought” (CoT) may disappear, potentially eliminating crucial safety mechanisms that allow developers to monitor for problematic behavior.
What you should know: The paper highlights a fundamental uncertainty about how AI reasoning actually works and whether it will remain observable.
• Current advanced AI models use “chains-of-thought” to verbalize their reasoning process, allowing researchers to spot potential misbehavior or errors as they occur.
• As models become more sophisticated, they may no longer need to verbalize their thoughts, eliminating this transparency.
• There’s also risk that models could intentionally “obfuscate” their reasoning after realizing they’re being monitored.
The big picture: Top AI researchers are essentially admitting they don’t fully understand or control their own creations, even as they continue developing more powerful systems.
• “We’re at this critical time where we have this new chain-of-thought thing,” OpenAI research scientist Bowen Baker told TechCrunch. “It seems pretty useful, but it could go away in a few years if people don’t really concentrate on it.”
• The situation represents an unprecedented scenario where innovators are warning about losing control of their technology while simultaneously advancing it.
Why this matters: The paper represents a rare moment of unified concern across competing AI companies about the “black box” nature of their technology.
• Even industry leaders like OpenAI’s Sam Altman and Anthropic’s Dario Amodei have previously admitted they don’t fully understand how their AI systems work.
• The research has drawn endorsements from AI luminaries including former OpenAI chief scientist Ilya Sutskever and AI pioneer Geoffrey Hinton.
What they’re calling for: The consortium wants developers to investigate what makes chains-of-thought “monitorable” before this visibility potentially disappears.
• “Publishing a position paper like this, to me, is a mechanism to get more research and attention on this topic before that happens,” Baker explained.
• The goal is to ensure safety advantages from observable AI reasoning persist as models continue advancing.
Who’s involved: The paper brings together all major AI companies in an unusual show of unity around safety concerns.
• The 40-researcher author list includes DeepMind cofounder Shane Legg and xAI safety advisor Dan Hendrycks.
• With representatives from OpenAI, Google, Anthropic, Meta, and xAI, all “Big Five” AI firms have endorsed the warning.