Doctor vs. ChatGPT: the promise and blind spots of generative AI

Generative AI tools, built on large language models, are already helping doctors transcribe visits and summarize patient records. The technology behind ChatGPT, trained on vast amounts of data from the internet, made headlines when it correctly answered more than 80% of board exam questions. In July, a team at Beth Israel saw promising results when using GPT-4 during a diagnosis workshop for medical residents.

But the tool is by no means ready for prime time. When Stanford researchers posed questions about real-life medical scenarios, GPT frequently disagreed with human clinicians or offered irrelevant information. These AI models are prone to “hallucinating” — generating plausible-sounding fabrications — a tendency that could cause immeasurable harm if let loose on patients. AI leaders and policymakers alike have called for more regulation.

