ChatGPT: An Evaluation of AI-Generated Responses to Commonly Asked Pregnancy Questions
Author affiliation: Department of Obstetrics and Gynecology, Jersey Shore University Medical Center, Neptune, NJ, USA
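Publication: Open Journal of Obstetrics and Gynecology
Year/Volume/Issue: 2023, Vol. 13, No. 9
Pages: 1528-1546
Subject classification: 1002 [Medicine - Clinical Medicine]; 100211 [Medicine - Obstetrics and Gynecology]; 10 [Medicine]
Subjects: AI (Artificial Intelligence); ChatGPT; Pregnancy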
Abstract: Background: A recent assessment of ChatGPT on a variety of obstetric and gynecologic topics was very encouraging. However, its ability to respond to commonly asked pregnancy questions is unknown. Reference verification needs to be examined as well. Purpose: To evaluate ChatGPT as a source of information for commonly asked pregnancy questions and to verify the references it provides. Methods: Qualitative analysis of ChatGPT was performed. We queried ChatGPT Version 3.5 on 12 commonly asked pregnancy questions and asked for its references. Query responses were graded as “acceptable” or “not acceptable” based on correctness and completeness in comparison to American College of Obstetricians and Gynecologists (ACOG) publications, PubMed-indexed evidence, and clinical experience. References were classified as “verified,” “broken,” “irrelevant,” “non-existent,” or “no references.” Review and grading of responses and references were performed by the co-authors individually and then as a group to formulate a consensus. Results: In our assessment, a grade of acceptable was given to 50% of responses (6 out of 12 questions). A grade of not acceptable was assigned to the remaining 50% of responses (5 were incomplete and 1 was incorrect). In regard to references, 58% (7 out of 12) had deficiencies (5 had no references, 1 had a broken reference, and 1 non-existent reference was provided). Conclusion: Our evaluation of ChatGPT confirms prior concerns regarding both content and references. While AI has enormous potential, it must be carefully evaluated before being accepted as accurate and reliable for this purpose.