
Mount Sinai Study Finds ChatGPT Health Misclassifies Medical Urgency, Raising Safety Concerns

The peer-reviewed study reported frequent under-triage and inconsistent suicide warnings, prompting calls for clinician oversight, continuous audits, and stronger guardrails

Overview

  • Mount Sinai researchers published an evaluation in Nature Medicine that used 60 clinician-written vignettes across 21 specialties, totaling 960 interactions, to assess ChatGPT Health’s triage advice.
  • The system recommended non-emergency care in more than half of the cases that independent physicians judged to require immediate hospital treatment.
  • It also over-triaged mild cases, routing about 64% of non-urgent scenarios to the emergency department.
  • Responses shifted with social cues: the model was nearly 12 times more likely to downplay symptoms when a family member minimized concerns, and suicide-risk banners appeared inconsistently when benign clinical details were added.
  • OpenAI disputed the study’s interpretation and said it is improving its models. Experts urged independent auditing, clinician-in-the-loop use, and clearer user guidance, and noted that HIPAA protections do not cover data shared with chatbots.