Tuesday, October 07, 2025

Deloitte issues refund for error-ridden Australian government report that used AI

 The future has arrived …


Imagine you and your staff being so incompetent that management believes it is better to get external advice than rely on you guys....


A fictional extract from the robo-debt case Amato v Commonwealth. Even "Justice Davis" – apparently meant to be Justice Jennifer Davies – made a cameo in the original version, along with a fabricated quote from nonexistent paragraphs 25 and 26 of her judgment


Deloitte issues refund for error-ridden Australian government report that used AI

Big Four firm will repay final instalment after incorrect references and citations found in document

Deloitte will partially refund payment for an Australian government report that contained multiple errors after admitting it was partly produced by artificial intelligence.
The Big Four accountancy and consultancy firm will repay the final instalment of its government contract after conceding that some footnotes and references it contained were incorrect, Australia’s Department of Employment and Workplace Relations said on Monday.
The department had commissioned an A$439,000 “independent assurance review” from Deloitte in December last year to help assess problems with a welfare system for automatically penalising jobseekers.
The Deloitte review was first published earlier this year, but a corrected version was uploaded on Friday to the departmental website. 
In late August the Australian Financial Review reported that the document contained multiple errors, including references and citations to non-existent reports by academics at the universities of Sydney and Lund in Sweden.
The substance of the review and its recommendations had not changed, the Australian government added. The contract will be made public once the transaction is completed, it said.
The embarrassing episode underscores the dangers that AI technology poses to consultancies, particularly the danger of “hallucinations”.
The Big Four consulting firms, as well as strategy houses such as McKinsey, have poured billions of pounds into AI research and development in a bid to keep nimble smaller competitors at bay. They hope to use the technology to accelerate the speed at which they can provide advice and audits to clients.
The UK accountancy regulator warned in June that the Big Four firms were failing to keep track of how automated tools and AI affected the quality of their audits, even as firms escalate their use of the technology to perform risk assessments and obtain evidence.
In the updated version of the report, Deloitte added references to the use of generative AI in its appendix. It states that a part of the report “included the use of a generative artificial intelligence (AI) large language model (Azure OpenAI GPT-4o) based tool chain” licensed by the government department.
While Deloitte did not state that AI caused the mistakes in its original report, it admitted that the updated version corrected errors with citations, references, and one summary of legal proceedings.
“The updates made in no way impact or affect the substantive content, findings and recommendations in the report,” Deloitte stated in the amended version.
Deloitte Australia said: “The matter has been resolved directly with the client.”

Selected Comments 

The whole point is that the government - after the Robodebt scandal - paid for the assurance. It is a ritual of legitimacy that can only be provided by the Big 4. So the value is in the ‘indulgence’, not in the report.
We typically outsource some of our work to three external legal firms. A year ago we had a big internal debate about some of the terms in a particular contract. We were taking too long so the Board instructed us to get “independent” legal opinion from all three.

When we received their legal opinion, literally all three used the same legal logic and came to the same conclusion. More worrying was that there was very similar phrasing, usage of certain words and even the same sentence structures - like it was all written by one person but rehashed to look slightly different.

We got the three partners in a room together and explained the situation. They were shocked and embarrassed and said they would investigate. It turned out their junior lawyers had basically all used ChatGPT to craft a response.

Honestly, I think AI is great for a large number of things, but it's rotting creative thinking, which I think makes it a long-term net negative for society.
If any other software was as unreliable as AI, such as a spellchecker that left in misspellings or a maths-based utility that returned incorrect numbers, it would have been junked. Yet such is the faux mystique and gaslighting around AI that the great and the good continue to embed it further and further into systems that we rely upon. Apparently, and unsurprisingly, the emerging impacts of social media have taught us nothing.
For shame, at least use the AI properly. Tell a couple of humans to undertake a thorough fact-checking exercise and these hallucinations might have been spotted in time. Rookie error.
Please, can somebody create an AI program to reveal what has been written by an AI program?
There are companies that will provide transparency of the base material when an LLM produces something!

Can’t recall the name but a Manchester (UK) based company …
Great demonstration of how utterly useless management consultants are
Wirecard.
Please tell me the firms building the AI models are doing this on purpose to make sure their models are not used for unwanted use cases like this.
Bet they were charged the standard Big 4 8% technology fee for the privilege.
Real Person: Hey AI, I think you seem to have got it wrong again.
AI: Yes, I apologise Peal Rerson, it would seem that I did not deliver on expectations on this occasion. I'll try to learn from my mistakes... if you teach me.
Deloitte Australia: We didn't realise that human editors are still invaluable to deliver value to our clients. We still get paid though right? No dramas!
It never ceases to amaze that companies think these guys know more about their business than they do themselves. They are the product of poor, lazy management. RIP
Is AI making us more productive or more lazy?
RIP the word "many". It had a long and active life and was much loved by millions of users until the end, appreciated for its simplicity, brevity and clarity. It will be sadly missed.
The level of smugness and inflated ego it takes to not even bother checking the report. Consider me not shocked.
Yep.
Some mighty and very expensive chickens are yet to come home to roost. Watch this space. 
They’ve got a brass neck charging anything for a report that had made-up references. And what does it say about the quality of the conclusions that you can withdraw these references and still blithely make the same findings?
LOL
The consulting industry is in trouble once clients realize they’re paying for glorified AI slop.
At the risk of sounding like Bill Shankly and the offside-rule ("if he's not interfering with play then he should be") - if the report findings and recommendations remain the same despite the erroneous references - why were they included in the first place?
A-I = hurrying and doing things quickly.
Better is: “more haste, less speed” - meaning that when we try to do things too quickly, it will take us longer in the long run (Cambridge English Dictionary).
Festina lente
Terribly bad form that the client chose to actually read the report. Outlandish behaviour. 

One would expect much better behaviour from a Government Department.
When it comes to professional services I feel that AI isn’t ready for prime time and should be used with extreme caution.

My approach is to use it like Wikipedia, i.e. use AI tools as a way to find links to what appear to be more credible / reputable sources.

Convincing / credible-looking hallucinations are an extreme source of danger (especially given that those using AI might not be experts in the field for which they are using the tools - an expert may have enough knowledge to spot a hallucination before it causes issues with the end work product).
The most surprising thing is that someone actually read a report produced by a Big 4 firm.
Isn’t there a world in a few months’/years’ time where no one writes reports and no one reads them? That’s where we are going.
Stanford has a blockchain where machines just talk to machines and sort it out between them
But if neither has the right information, or a means of assessing whether it is right, the answer can hardly be correct. 
Wasn't it just last week that Accenture threatened to fire its staff if they weren't using AI? No wonder, if you can produce a report in 30 seconds and sell it for $400k. Shame about the hallucinations but who knew the client would bother to check the references and the quotes?
lol 
The problem is not that Deloitte used GPT to create the report. There is no law or regulation that forbids using advanced tools...or even making mistakes.

The real lesson learned from this story is that you can get a report with a quality similar* to what Deloitte and others can produce (and have produced in the last 50 years) for $20 per month instead of $500k per report.

* Similar, in my sentence above, means not easily distinguishable, as Alan Turing would have said in his “imitation game”. Indeed, in this case, neither the Deloitte partner nor the government customer realised that the report was partially produced by a machine.

This is the path to big savings for government...not sure for Deloitte.
There is no law or regulation that forbids using advanced tools...or even making mistakes.
It might be picked up by the law of negligence if a duty of care can be established, the auditor breaches the duty of care and it was reasonably foreseeable that the person to whom the duty was owed would suffer loss if the duty were breached.

However, the standard terms of auditors usually limit their liability (including for the types of negligence for which they may be liable).

I think that new primary laws / regulations may be needed and, in any event, a new branch of AI related case law will continue to develop.
You're not picking up on the distinction between audit and assurance. Assurance is the word auditors generally use when the work is not subject to standards or statute.

Don't disagree with paragraphs 1 and 3.

Per para 2, have never seen any engagement letter leave any firm I've been at without liability cap language. In fact, that's often the only thing I've seen partners review personally in an engagement.
Imagine paying almost half a mil for a ChatGPT prompt
Sam Altman is dreaming with you
A fool with a tool becomes a bigger fool, faster and cheaper. That's LLMs for you.
The future, ladies and gentlemen. 
Human 1 : 0 Machine

We’re still in there with a fighting chance 🤡
LOL
Could be a role for an AI checker to automatically audit every AI report - but using a different supplier and LLM 
They could even use a different account on the same LLM platform, as it will have no memory of what it produced for other users. The problem is that the check may end up as error-riddled as the original report!

At some point, with the technology currently offered to consumers, a person actually has to look over the final product.
AI has been oversold to too many lazy, incompetent people. 

It doesn't matter what tool you use to get your work done, you're still responsible for your output.

Pathetic drones mindlessly copying and pasting stuff from ChatGPT without so much as a read through is now my top pet peeve at work. It's fraudulent behaviour.
Half of the issue is that the sheer bulk of output a colleague can generate with so little effort means that reviewing their work has become an even more onerous task. These issues are still down to failures in adoption, but at the same time our staff are paying less attention to the output, even as the output volume has increased.

The junk buildup can be overwhelming and sometimes it's easier to just say 'good enough'. I think we have some hard lessons to learn. In the meantime it's very easy to get frustrated with colleagues who now make cases over Teams/Slack via obvious GPT output.
I was curious as to what ChatGPT would make of this story. I entered only the first three paragraphs and, after two seconds, this was the summary and takeaway from its detailed breakdown of the issues:

"This story reveals a few key problems stemming from the use of AI-generated content in a government-commissioned report by Deloitte. Let’s break down the AI-related errors and issues involved:

"⚖️ Summary of AI-Related Errors

Hallucinated or incorrect citations: AI included footnotes or references that were false or misleading.
Lack of human oversight: AI-generated content was not properly fact-checked or reviewed.
Transparency failure: Possible nondisclosure of AI use in a critical report.
Trust and accountability issue: Report's reliability was compromised, breaching professional and ethical expectations.

🧠 Takeaway

"AI is a powerful tool, but in high-stakes settings — like government reports — factual accuracy and human oversight are non-negotiable. This case is a cautionary tale about using AI without robust quality assurance."

Which shows more self-awareness than many corporate executives in my view.
Don’t anthropomorphise AI; it’s not self-aware, it’s a statistical approximation of the most likely response to your prompt. It will have millions (billions?) of words in its training data that go over the dangers of relying on AI, and this is just the best statistical fit to your prompt.
If you printed T-shirts with that comment written on them, I'd buy one.
Er..it was a joke. A human would have realised that.
It’s hard to tell these days unfortunately. Lots of people have no idea how these models work and have bought into the hype without any critical thought. 
Text has no tone, so the joke is you expecting text to have a tone, when by definition it doesn’t and can’t.

That’s why the /s moniker exists for sarcasm.

This post of mine is not sarcastic. 
I hear you, bud. I can see the point you're making - even the AI vomit outputs words that say what Deloitte did is dumb.

But please in future comment yourself. I'm infinitely more interested in what you have to say; and there's already enough soulless bullcrap on the internet.
Uh huh. And that’s different from a human, how? 

You’ve used math labels (statistical approximation) to describe context and experience, which is exactly what humans use. 
It’s profoundly different. Humans don’t generate a probability distribution for each token and randomly select one, then repeat the process hundreds of times to form a response. You should probably understand the basics of how these models work before pontificating on their intelligence (or lack thereof). 
Hopefully the AI model saw the words "government" and "Deloitte" and decided to publish a report full of rubbish, for the good of humanity. We have so far not elected AI politicians and we have not given any consulting firm access to our data. So they should just stay away from this.
Well there's a surprise 
Didn't Accenture recently issue a veiled threat to staff who can't retrain or adapt to working with AI?
Hire a human to do the work, and they will be slow and make occasional small errors that are slightly embarrassing. 

Use an AI to do the work, and it will make gigantic errors that will get you fired.
I just don’t see how people still think they can engage the Big 4 to do anything. In their core “competencies” (more like incompetencies) of audit, accounting and tax, they are rife with scandal and failure. Why they veer into other areas is a question only trumped by why people engage them. 
In my industry it’s because when third party reports need to be done, if you use one of these it pacifies the regulators
Agree, but internally it's an insurance policy. My former employer spent $1M with a Big 4 firm to tell us how to do something... that something was already being done by me and a few others. We got a nice report that told us what we already knew, and we formalised that something. The point is that if it hadn't worked out we could blame the 'smart guys' and keep our jobs when we had to fix it - and we'd get to find a few emails where we told them so, etc. We spent $1M of company money to ensure the company would like what we did.
You are really doing nothing to justify the existence of these companies with this comment. You spent $1 million to make it look like you are incapable of making decisions for yourself.
When I commission a report like that, I want it written by a human, not a machine. I want facts, nuance and judgment - two of which the machine cannot do. 
And unless Deloitte have built their own LLM, which I doubt they have, they have no right to upload customer data into an AI system without any reasonable chance of having a clue how that data will be used in the future by the model.
This is an example of the benefits of AI that justifies the ‘AI bubble’ in the stock market?
Aren't juniors supposed to do character-building stuff when they join Deloitte, like going through documents like this with a fine-tooth comb? Also, what was the partner in charge of the business doing? More questions than answers.
Partners are only interested in the invoicing. 
I don't think the big story here is Deloitte. I think it is AI. How much work is it going to actually save, when you have to check everything? Some perhaps, but how much?
In my old civil service team my managers commissioned a Deloitte-led consultancy team on a six-month project, at a cost of around £5m I think. Their output was a bunch of extremely long PowerPoints with some vague suggestions, full of corporate bloat.

Several years on and none of their suggestions were ever implemented, because they were too superficial or obvious. That was pre-AI actually so I'd imagine their output is even more generic now.

Management consultancy is the ultimate Western con-job: well-dressed, eloquent young graduates confidently giving advice and suggestions to businesses without having any real domain expertise.