Reliability of ChatGPT-4o in analysing medical data: a test case study on patients at risk for limb amputation

  • Liat Toderis
  • Iris Reychav
  • Roger McHaney
  • Bernice Oberman
  • Chen Speter
  • Ronen Loebstein

Research output: Contribution to journal, Article, peer-review

Abstract

Purpose: This research investigates the reliability of ChatGPT-4o in analysing medical data for diabetic patients at risk of limb loss. It evaluates whether a generative AI tool can serve as a viable alternative to traditional statistical methods for predictive medical analysis. The research question is: How does ChatGPT-4o perform in answering predictive questions about patient outcomes compared with a professional statistician using conventional tools?

Methods: Data were drawn from Sheba Medical Center’s diabetic foot clinic, focusing on mortality and amputation risk. ChatGPT-4o’s predictive responses were compared with those produced by a professional statistician. The study emphasized the importance of prompt design and required substantial human involvement in data cleaning to ensure accuracy.

Results: ChatGPT-4o produced accuracy comparable to traditional statistical methods when prompts were well designed. The findings highlight the central role of prompt engineering in obtaining reliable outputs. Human intervention in preparing the dataset remained necessary, underscoring current limitations in fully automating the process.

Conclusion: The study demonstrates the potential of generative AI, specifically ChatGPT-4o, as a tool enabling clinicians to analyse medical data without advanced technical training. With proper instruction and careful prompt engineering, generative AI can help democratize access to predictive medical analysis as a user-friendly alternative to conventional methods.

Original language: English
Journal: Health Systems
State: Accepted/In press - 2025

Keywords

  • ChatGPT-4
  • Diabetic limb risk
  • Medical data analysis
  • Predictive healthcare
  • Prompt engineering
