A 237-page report that Deloitte Australia delivered in mid-2025 to the Australian government's Department of Employment and Workplace Relations (DEWR) was supposed to review the Targeted Compliance Framework, the department's welfare compliance framework, and its IT system. Commissioned for about AU$440,000, the report promised expert insight into automated welfare compliance enforcement. Instead, it became one of Australia's biggest AI controversies because it contained numerous fabricated academic references, fake quotes, and factual inaccuracies generated by AI.
Shortly after the report was released in July 2025, Dr Christopher Rudge, a health and welfare law researcher at the University of Sydney, found serious mistakes. The report cited several academic papers that did not exist and included a fabricated quote attributed to a Federal Court judgment. These were not minor typos; they bore the hallmarks of AI hallucinations, which occur when generative AI tools produce information that seems credible but is actually false.
Deloitte later admitted that it had used Azure OpenAI GPT-4o during the early drafting of the report. The consulting giant said that human experts carefully reviewed and improved the content before final delivery. Despite this, more than a dozen made-up references remained in the published version. After these findings, the Australian government required a corrected version of the report, in which the false citations were removed and typographical errors were fixed. Deloitte agreed to partially refund the government as a result.
AI hallucinations pose serious risks when organizations rely heavily on generative AI for critical decision-making and reporting. These errors can damage credibility, lead to faulty policy decisions, and undermine public trust. The Deloitte scandal has made it clear that while AI tools can speed up research and drafting, human review is still necessary to confirm accuracy and ensure that AI is used ethically.
The Australian government tightened AI usage policies in consultancy contracts to ensure more transparency and accountability. The incident was a PR setback for Deloitte, but it also reminded consulting firms of the work still required to use AI responsibly. "And that's what that report shows," Dr Rudge said, "the need for clear guardrails around the use of AI - preventing hallucinations, improving the governance of methodologies assisted by AI."
It is a wake-up call to governments and corporations globally. The Deloitte case is by no means an isolated incident; it is one of the first high-profile examples of the limitations of generative AI in professional contexts. It illustrates that unchecked reliance on AI can lead to costly mistakes and reputational damage.