TY - JOUR
T1 - Large language model as a clinical decision support tool in the initial management of critically ill children
T2 - a pilot evaluation
AU - Tausky, Osnat
AU - Kaplan, Eytan
AU - Kadmon, Gili
AU - Gendler, Yulia
AU - Nahum, Elhanan
AU - Yitzhaki, Shai
AU - Weissbach, Avichai
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
PY - 2025/12
Y1 - 2025/12
N2 - Large language models (LLMs) like ChatGPT are being explored as clinical decision support tools, but their reliability in pediatric acute care remains uncertain. This pilot study assessed ChatGPT-4.0’s performance in the early management of critically ill children using real-world clinical data. We retrospectively analyzed 20 children emergently admitted from the emergency department (ED) to a tertiary pediatric intensive care unit (PICU). ChatGPT-4.0 was prompted at four time points: ED arrival (diagnostic and therapeutic plans), ED transfer (differential diagnosis and hospitalization decision), PICU admission (diagnostic and therapeutic plans), and 24 h into PICU stay (differential diagnosis). Outputs were compared to actual care and evaluated for accuracy, safety, and omissions. At ED and PICU admission, 94% (95% CI, 91–97%) and 98% (95% CI, 95–99%) of diagnostic recommendations were rated as appropriate. Only 82% (95% CI, 76–87%) of therapeutic recommendations were considered appropriate at both points (p < 0.001). Potentially harmful therapeutic suggestions were more common than diagnostic ones: 7% vs. 2% in the ED (p = 0.016) and 10% vs. 0% in the PICU (p < 0.00001). In the PICU, critically missing therapeutic recommendations occurred at 0.95 per case, compared to 0.15 for diagnostic ones (p = 0.0073). The correct diagnosis appeared in 100% of ED discharge and 95% (95% CI, 85–100%) of PICU 24-h differentials. Triage decisions were accurate in all PICU cases. Conclusion: ChatGPT-4.0 showed good diagnostic and triage performance but requires caution, especially for therapeutic decisions and broader pediatric use.
AB - Large language models (LLMs) like ChatGPT are being explored as clinical decision support tools, but their reliability in pediatric acute care remains uncertain. This pilot study assessed ChatGPT-4.0’s performance in the early management of critically ill children using real-world clinical data. We retrospectively analyzed 20 children emergently admitted from the emergency department (ED) to a tertiary pediatric intensive care unit (PICU). ChatGPT-4.0 was prompted at four time points: ED arrival (diagnostic and therapeutic plans), ED transfer (differential diagnosis and hospitalization decision), PICU admission (diagnostic and therapeutic plans), and 24 h into PICU stay (differential diagnosis). Outputs were compared to actual care and evaluated for accuracy, safety, and omissions. At ED and PICU admission, 94% (95% CI, 91–97%) and 98% (95% CI, 95–99%) of diagnostic recommendations were rated as appropriate. Only 82% (95% CI, 76–87%) of therapeutic recommendations were considered appropriate at both points (p < 0.001). Potentially harmful therapeutic suggestions were more common than diagnostic ones: 7% vs. 2% in the ED (p = 0.016) and 10% vs. 0% in the PICU (p < 0.00001). In the PICU, critically missing therapeutic recommendations occurred at 0.95 per case, compared to 0.15 for diagnostic ones (p = 0.0073). The correct diagnosis appeared in 100% of ED discharge and 95% (95% CI, 85–100%) of PICU 24-h differentials. Triage decisions were accurate in all PICU cases. Conclusion: ChatGPT-4.0 showed good diagnostic and triage performance but requires caution, especially for therapeutic decisions and broader pediatric use.
KW - Artificial intelligence
KW - ChatGPT
KW - Clinical decision support
KW - Emergency medicine
KW - Large language models
KW - Pediatric intensive care
UR - https://www.scopus.com/pages/publications/105021821939
U2 - 10.1007/s00431-025-06630-7
DO - 10.1007/s00431-025-06630-7
M3 - Article
C2 - 41238850
AN - SCOPUS:105021821939
SN - 0340-6199
VL - 184
JO - European Journal of Pediatrics
JF - European Journal of Pediatrics
IS - 12
M1 - 757
ER -