Former Google researcher predicts AI hallucinations fix within a year

Former Google AI researcher Raza Habib predicts that AI hallucinations—when chatbots generate false or fabricated information—will be solved within a year, though he questions whether complete elimination is desirable. Speaking at Fortune’s Brainstorm AI conference in London, Habib argued that some degree of hallucination may be necessary for AI systems to generate truly novel ideas and creative solutions.

The technical solution: Habib explains that AI models are naturally well calibrated until human preference training disrupts their ability to judge how likely their own answers are to be correct.

  • “If you look at the models before they are fine-tuned on human preferences, they’re surprisingly well calibrated,” Habib said, noting that a model’s confidence correlates well with truthfulness before human feedback training.
  • The challenge lies in preserving this natural calibration while making models more responsive to human preferences through reinforcement learning from human feedback.
  • Habib’s London-based startup Humanloop, which has raised $2.6 million, focuses on making large language model training more efficient.

In plain English: AI models go through three training stages—pre-training, fine-tuning, and reinforcement learning from human feedback. During the first stage, models naturally develop good judgment about when they’re right or wrong. However, the final stage of training, which makes AI more helpful and conversational, accidentally breaks this self-awareness. The solution involves preserving the model’s original confidence calibration while still making it user-friendly.
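To make "well calibrated" concrete: a model is calibrated when its stated confidence matches how often it is actually right, so answers given at 90% confidence should be correct about 90% of the time. One common way to measure the gap is expected calibration error (ECE). The Python sketch below is an illustration only, not Habib's or Humanloop's method; the function name, the bin count, and the sample numbers are all assumptions made up for this example.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between stated confidence and observed accuracy.

    Hypothetical helper for illustration; 0.0 means perfectly calibrated.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if len(confidences) == 0:
        return 0.0

    # Group predictions into equal-width confidence bins over [0, 1].
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_confidence = sum(conf for conf, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # Each bin contributes in proportion to how many predictions it holds.
        ece += (len(bucket) / total) * abs(avg_confidence - accuracy)
    return ece


# Made-up data: both "models" claim 90% confidence on ten answers.
calibrated = expected_calibration_error([0.9] * 10, [True] * 9 + [False])
overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
print(f"calibrated: {calibrated:.2f}, overconfident: {overconfident:.2f}")
# prints "calibrated: 0.00, overconfident: 0.40"
```

In this toy case the first model is right nine times out of ten, matching its 90% confidence, while the second is right only half the time and so scores a much larger error. Habib's claim, in these terms, is that models tend to look more like the first case before human preference training than after it.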

Why perfect accuracy might not be ideal: Habib argues that eliminating all hallucinations could limit AI’s creative potential.

  • “If we want to have models that will one day be able to create new knowledge for us, then we need them to be able to act as conjecture machines; we want them to propose things that are weird and novel,” he explained.
  • For creative tasks, having models “fabricate things that are going off the data domain is not necessarily a terrible thing,” according to Habib.
  • Current user experiences already accommodate imperfect technology, similar to how Google provides ranked search results rather than definitive answers.

Real-world consequences highlighted: The panel discussed Air Canada’s costly chatbot mistake as an example of preventable AI failures.

  • Customer Jake Moffatt was incorrectly told by Air Canada’s chatbot in 2022 that he could retroactively receive bereavement fare discounts after purchasing full-price tickets totaling over $1,200.
  • When Air Canada refused the refund, arguing that the chatbot's answer had been wrong, British Columbia's Civil Resolution Tribunal ordered the airline to compensate Moffatt.
  • “They gave the chatbot a much wider range than what it should have been able to say,” Habib said, calling the incident “completely avoidable” with proper testing and guardrails.

What the experts are saying: Industry leaders emphasized the importance of careful AI deployment in customer-facing applications.

  • “Just because something seems to work in a proof of concept, you probably don’t just want to put it straight into production, with real customers who have expectations and terms and conditions,” said Jeremy Barnes, ServiceNow’s VP of AI product.
  • Air Canada disputed the characterization, with a spokesperson telling Fortune that “the chatbot involved in the incident did not use AI” and “predated Generative AI capabilities.”