4 Evaluating AI-Driven Responses in Breast Reconstruction: A Comparative Study of Response Formats
Artificial intelligence (AI) is increasingly applied in medicine, with significant potential to streamline information retrieval, data processing, and the delivery of comprehensible insights. In breast cancer care, AI’s ability to enhance patient care and improve surgical outcomes is of growing interest. Recently, attention has turned to the accuracy of AI models such as ChatGPT in addressing questions about breast reconstruction. While previous studies have examined ChatGPT’s reliability in responding to patient queries, its role as a decision-support tool for surgeons remains largely unexplored.
This study evaluated the reliability of recent OpenAI models, specifically ChatGPT versions 4.0 and o1, in answering standardized in-service plastic surgery exam questions on breast reconstruction. For questions available in both text and image-based formats, ChatGPT 4.0’s accuracy was evaluated with and without images, as it was the only one of the two models capable of processing visual inputs. A total of 5 questions were asked, including 1 question available in both text and image-based formats. Responses were assessed against a standardized answer key. Four physicians reviewed the models’ responses across multiple chat sessions, comparing performance on open-ended vs multiple-choice formats.
ChatGPT 4.0 achieved an average accuracy of 75% on open-ended questions, which increased to 85% on multiple-choice questions. ChatGPT o1, which incorporates reasoning abilities, likewise scored 75% on open-ended questions and 85% on multiple-choice questions. ChatGPT 4.0 demonstrated 100% accuracy on the open-ended question presented in both text and image-based formats. However, its performance did not change significantly when images were included in the multiple-choice format.
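As a rough illustration of how per-format average accuracy of this kind could be tallied from graded responses, the following Python sketch groups correct/incorrect gradings by question format. The function name `accuracy_by_format`, the record layout, and the example gradings are all hypothetical and are not taken from the study's data or scoring procedure.

```python
from collections import defaultdict


def accuracy_by_format(records):
    """Return the fraction of correct responses for each question format.

    `records` is an iterable of (question_format, is_correct) pairs, one per
    graded response (hypothetical layout, assumed for illustration only).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for question_format, is_correct in records:
        totals[question_format] += 1
        correct[question_format] += int(is_correct)
    return {fmt: correct[fmt] / totals[fmt] for fmt in totals}


# Hypothetical gradings against an answer key; NOT the study's data.
example_gradings = [
    ("open-ended", True),
    ("open-ended", False),
    ("open-ended", True),
    ("open-ended", False),
    ("multiple-choice", True),
    ("multiple-choice", True),
    ("multiple-choice", True),
    ("multiple-choice", False),
]

for fmt, acc in accuracy_by_format(example_gradings).items():
    print(f"{fmt}: {acc:.0%}")
```

Pooling every graded response per format, as done here, is only one possible design; averaging per reviewer or per chat session first could give different figures when the groups differ in size.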
The 2 AI models showed distinct strengths depending on the question format: accuracy was higher with structured multiple-choice prompts in some settings and with image-based open-ended questions in others. AI shows potential as a supplementary tool in breast reconstruction education and decision-making, but its effectiveness is context-dependent. Aligning model strengths with clinical needs is key to maximizing its value alongside expert knowledge.