Abstract
Purpose: This research investigates the reliability of ChatGPT-4o in analyzing medical data for diabetic patients at risk of limb loss. It evaluates whether a generative AI tool can serve as a viable alternative to traditional statistical methods for predictive medical analysis. The research question is: How does ChatGPT-4o perform in answering predictive questions about patient outcomes compared with a professional statistician using conventional tools?

Methods: Data were drawn from Sheba Medical Center’s diabetic foot clinic, focusing on mortality and amputation risk. ChatGPT-4o’s predictive responses were compared with those produced by a professional statistician. The study emphasized the importance of prompt design and required substantial human involvement in data cleaning to ensure accuracy.

Results: ChatGPT-4o achieved accuracy comparable to that of traditional statistical methods when prompts were well designed. The findings highlight the central role of prompt engineering in obtaining reliable outputs. Human intervention in preparing the dataset remained necessary, underscoring current limitations in fully automating the process.

Conclusion: The study demonstrates the potential of generative AI, specifically ChatGPT-4o, as a tool that enables clinicians to analyze medical data without advanced technical training. With proper instruction and careful prompt engineering, generative AI can help democratize access to predictive medical analysis as a user-friendly alternative to conventional methods.

| Original language | English |
|---|---|
| Journal | Health Systems |
| State | Accepted/In press - 2025 |
Keywords
- ChatGPT-4
- Diabetic limb risk
- Medical data analysis
- Predictive healthcare
- Prompt engineering