"Should AI write medical reports?"
That was the provocative question posed by Zack Akil in a Shoreditch loft space last Thursday.
After three hours of back-to-back criticism of AI hallucinations and factual inaccuracies, it was unsurprising that the audience responded with...
"No way!"
This was the setup for the second most interesting talk at Monki Gras 2024: Is your GenAI app a "dud" or "straight to prod"?
With the hype and hysteria surrounding AI, it is an essential question to ask.
Get it wrong, and you will be on the receiving end of headlines like...
"Litigant unwittingly put fake cases generated by AI before tribunal" and "AI is creating fake legal cases and making its way into real courtrooms, with disastrous results".
I struggle with the idea that people use tools like ChatGPT and Google Gemini as a source of truth.
I find it absurd.
As I've written, generative AI and Large Language Models - such as ChatGPT and Google Gemini - use a simple trick.
They guess what comes next.
There is no comprehension of facts within their core data.
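To make "guessing what comes next" concrete, here is a minimal sketch of my own (not from the talk), using the open-source Hugging Face transformers library and the small GPT-2 model as a stand-in for a production LLM. All it does is rank candidate next tokens by likelihood - there is no lookup of facts anywhere.

```python
# A minimal sketch of "guessing what comes next", using the small, open-source
# GPT-2 model as a stand-in for a production LLM.
# pip install transformers torch
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # a score for every possible next token

# The model does not "know" the answer; it simply ranks tokens by likelihood.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id.item())!r}: {prob.item():.2%}")
```

A plausible completion usually comes out on top, but only because it is the statistically likely continuation - which is exactly why a confident-sounding wrong answer is always on the cards.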
So how could AI possibly be suitable to speak to your customers - on your behalf - without supervision?
Zack proposed a simple question:
Can the user immediately validate the correctness of the output?
This is a neat trick: it moves responsibility for the correctness of AI-generated content away from the AI and onto the person using the AI as a tool.
Zack suggested there were three possible answers to this question:
1) Make the data pre-verified. This approach is ideal when working with internal document libraries and requires two steps. (A rough sketch of this pattern follows the list below.)
2) Make the data post-verified. This approach is ideal for internet-facing AI applications and is the way Google Gemini operates.
3) Change the audience. Suppose the audience using the AI-generated content cannot immediately verify the correctness of the information. In that case, they are the wrong audience - so change them.
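As promised above, here is a rough, illustrative sketch of the first option - my own interpretation, not something Zack showed. The assistant may only answer from a small library of pre-verified internal documents; the documents, the naive keyword-overlap retrieval, and the `answer` helper are all assumptions made up for this example.

```python
# A rough sketch of option 1: only answer from pre-verified internal documents.
# The documents and the naive keyword-overlap retrieval are illustrative only.

VERIFIED_DOCS = {
    "expenses-policy": "Expense claims must be submitted within 30 days, with a receipt.",
    "holiday-policy": "Employees receive 25 days of annual leave plus bank holidays.",
}


def retrieve(question: str) -> str | None:
    """Return the verified passage that best overlaps the question, if any."""
    question_words = set(question.lower().split())
    best = max(
        VERIFIED_DOCS.values(),
        key=lambda text: len(question_words & set(text.lower().split())),
    )
    return best if question_words & set(best.lower().split()) else None


def answer(question: str) -> str:
    passage = retrieve(question)
    if passage is None:
        # Refuse rather than let a model improvise an unverified answer.
        return "I can't answer that from the verified document library."
    # In a real system the passage would be handed to an LLM as grounding
    # context; here we return it directly so every claim is pre-verified.
    return passage


print(answer("How many days of annual leave do employees get?"))
print(answer("What is the share price today?"))
```

The point of the sketch is the refusal branch: anything the curated library cannot support never reaches the user.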
That third answer - changing the audience - felt like a mealy-mouthed bait-and-switch. However, Zack had a brilliant example.
Throughout the presentation, Zack was hobbling around the stage on crutches. He explained that he had injured his leg the previous weekend, and the doctor's diagnosis was:
After reviewing the patient's diagnostic images, there is a complete tear of the ACL with contusions in the femoral and tibial areas. There is a partial rupture of the MCL and meniscal oedema. There is no evidence of damage to the PCL or LCL.
After two long days of waiting, a physiotherapist explained what this meant. Zack speculated that there could be a better way of explaining such a diagnosis to worried patients...
An AI system could translate the above jargon-filled diagnosis into:
After looking at the pictures we took of the inside of your knee, I can see the ACL, which is a really important rope-like part that helps hold your knee together, is completely torn. Also, there are some bruises inside the bone at the top and bottom of your knee. You've got a small tear in your MCL too, which is another important band on the inside of your knee that helps keep it stable. Plus, there's some swelling in the meniscus, which is like a cushion inside your knee. Luckily, the other ropes in your knee, the PCL and LCL, which help hold your knee together from the back and the outside, look just fine and aren't hurt.
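For illustration only - this is not the system Zack described - a plain-language translation step like that could be a single prompt to an LLM, with the crucial constraint that a clinician signs off on the draft before the patient sees it. The model name, prompt wording, and sign-off comment below are all assumptions.

```python
# A sketch of the "translate, then have a doctor verify" flow.
# The model name, prompt wording, and sign-off step are illustrative assumptions.
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

REPORT = (
    "Complete tear of the ACL with contusions in the femoral and tibial areas. "
    "Partial rupture of the MCL and meniscal oedema. No evidence of damage to "
    "the PCL or LCL."
)


def plain_language_draft(report: str) -> str:
    """Ask the model for a patient-friendly draft of a clinical report."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": "Rewrite clinical findings in plain, reassuring language "
                           "for a patient. Do not add, remove, or soften any finding.",
            },
            {"role": "user", "content": report},
        ],
    )
    return response.choices[0].message.content


draft = plain_language_draft(REPORT)
print(draft)
# Crucially, the draft goes to the doctor, not the patient: the clinician can
# scan it in seconds and confirm the essence is correct before it is released.
```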
As patients, we are unable to validate the correctness of this translation. However, a doctor can quickly scan the paragraph and confirm the essence is correct.
So, by changing the audience from the patient to the doctor, the IVO framework's question - can the user immediately validate the correctness of the output? - could be satisfied.
Thus - the audience concluded AI should write medical reports… whenever possible.