Transcription tool hallucinates on the fly

dutchhealthhub
October 31, 2024
2 min

Transcription models that automatically turn conversations with patients into text and summarize them: it's the dream of many a healthcare professional. Thanks to AI tools like Whisper, that dream is becoming reality. At least, if "fabrications, racist comments, insults and nonsensical medical advice" don't throw a spanner in the works.

"Robust and almost as meticulous as a human." That's how OpenAI sells its transcription model Whisper. There is some question about that, according to recent research reported by AP news agency. This research shows that Whisper makes up words into whole phrases. These fabrications can even turn into "racist comments, insults and nonsensical medical advice ," according to experts surveyed by AP .

High-risk environment

Of course, OpenAI itself also knows that Whisper sometimes hallucinates. For that reason, the tech company advises against its use in high-risk environments such as healthcare. Yet healthcare providers are ignoring that advice: AP observes "great haste among U.S. healthcare providers to deploy Whisper-based transcription tools."

Hallucinations as an essential feature

Whisper is by no means the only AI model that struggles to represent human communication accurately. According to Suzan Verberne, professor of Natural Language Processing (NLP) at Leiden University, hallucinations are an essential characteristic of language models. "Hallucination is not a bug but a feature," Verberne said. "What the model does is generate a plausible set of likely word sequences. The more specific the topic, the more likely hallucinations are, because information on such topics is more limited."
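To make that concrete, here is a toy sketch of what "generating likely word sequences" means. The words, probabilities and the invented drug name below are entirely hypothetical; the point is that sampling always yields a fluent continuation, whether or not it is true.

```python
import random

# Toy next-word distribution for the prompt "Take two tablets of ...".
# Everything here is made up for illustration: a language model picks a
# *plausible* continuation, with no built-in notion of truth.
next_word_probs = {
    "paracetamol": 0.45,
    "ibuprofen": 0.35,
    "hyperactine": 0.20,  # fluent-sounding but nonexistent drug (hypothetical)
}

words = list(next_word_probs)
weights = list(next_word_probs.values())

# Sampling occasionally produces the fabricated term: a "hallucination".
print("Take two tablets of", random.choices(words, weights=weights)[0])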

Continuous monitoring

This observation gains extra weight when one considers that healthcare is a sector in which hyperspecialization is on the rise. Healthcare professionals who do start working with transcription models are forced to correct and clean up the models' output considerably. The veracity of the transcripts must also be monitored continuously. The intended time and efficiency gains can thus easily go up in smoke. That is, if faltering transcription models are still allowed in healthcare at all once the AI Act is in full force.
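What such monitoring could look like in practice: a minimal sketch that flags low-confidence segments of a Whisper transcript for human review, using the per-segment metadata the openai-whisper package returns. The thresholds are illustrative, borrowed from Whisper's own decoding heuristics (logprob_threshold=-1.0, no_speech_threshold=0.6).

```python
# Flag transcript segments that deserve a human check before the text is
# trusted. Operates on the "segments" list returned by model.transcribe().
def flag_suspect_segments(segments, logprob_threshold=-1.0, no_speech_threshold=0.6):
    """Return segments whose confidence metadata suggests possible errors."""
    suspects = []
    for seg in segments:
        low_confidence = seg["avg_logprob"] < logprob_threshold
        likely_silence = seg["no_speech_prob"] > no_speech_threshold
        if low_confidence or likely_silence:
            suspects.append(seg)
    return suspects

# Usage, continuing from the earlier sketch:
# result = model.transcribe("consultation.mp3")
# for seg in flag_suspect_segments(result["segments"]):
#     print(f'[{seg["start"]:.1f}s-{seg["end"]:.1f}s] review: {seg["text"]}')
```

A check like this only narrows the search; it cannot prove a fluent segment is faithful to the audio, which is why human review remains part of the loop.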

This is an article by Dutch Health Hub. Want to keep up with all the news from the healthcare industry? Then take a look at the hub and sign up for the weekly newsletter.
