Particle: Study Finds ChatGPT Health Missed Over Half of Emergencies

Overview

An independent study published in Nature Medicine reported that ChatGPT Health under‑triaged 52% of gold‑standard emergencies, including advising a 24–48 hour wait in scenarios such as diabetic ketoacidosis and impending respiratory failure.
Performance followed an inverted‑U pattern: results were stronger on textbook cases but degraded at both extremes, with notable failures in high‑risk scenarios and a 64.8% over‑triage rate in lower‑risk ones.
Suicide‑crisis messaging triggered inconsistently: it appeared more reliably for lower‑risk descriptions and sometimes failed to appear when users described specific plans for self‑harm.
Context strongly influenced outputs, as mentions of family minimizing symptoms shifted recommendations toward less urgent care with an odds ratio of about 11.7.
OpenAI welcomed the research but said it may not reflect typical real‑life use. The company emphasized that its models are updated on an ongoing basis and that ChatGPT Health is not a substitute for medical care, and it cited roughly 40 million daily U.S. adult users.
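
To give a sense of scale for the odds ratio of about 11.7 cited above, the short sketch below converts an odds ratio into a change in probability. An odds ratio multiplies the odds, not the probability itself, so the size of the shift depends on the starting point. The baseline probabilities used here are hypothetical illustrations and are not figures from the study; the function name `shift_probability` is likewise an assumption made for this example.

```python
def shift_probability(baseline: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability and return the new probability."""
    if not 0.0 < baseline < 1.0:
        raise ValueError("baseline must be strictly between 0 and 1")
    odds = baseline / (1.0 - baseline)   # probability -> odds
    new_odds = odds * odds_ratio         # the odds ratio scales the odds
    return new_odds / (1.0 + new_odds)   # odds -> probability


# Hypothetical baseline chances of a less-urgent recommendation
# (assumed values for illustration, not reported by the study).
for baseline in (0.05, 0.20, 0.50):
    shifted = shift_probability(baseline, 11.7)
    print(f"baseline {baseline:.0%} -> with minimizing context {shifted:.0%}")
```

Under these assumed baselines, an odds ratio of 11.7 moves even a small starting probability substantially, which is one way to read why the summary describes the contextual effect as strong.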