Table 1. Scores by Large Language Models.

From: Beyond the Pass Mark: Accuracy of ChatGPT and Bing in the National Medical Licensure Examination in Japan

| Characteristic | Compulsory questions, N = 81 | Others, N = 200 | Total, N = 281 |
|---|---|---|---|
| Bing score | 70 (86%) | 149 (74%) | 219 (78%) |
| ChatGPT score | 34 (42%) | 72 (36%) | 106 (38%) |

Table 2. Reasons for Wrong Answers by Bing.

| Characteristic | Normal questions, N = 31 | Clinical questions, N = 31 |
|---|---|---|
| Misinterpretation of the meaning of the question statement | 10 (32%) | 7 (22.6%) |
| Wrong diagnosis | 0 (0%) | 3 (9.7%) |
| Wrong information with a correct diagnosis | 0 (0%) | 4 (13%) |
| Wrong information | 21 (68%) | 17 (55%) |
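
The percentages in both tables appear to follow directly from the counts and the group sizes: 81 compulsory, 200 other, and 281 total questions in Table 1, and 31 wrong answers per question type in Table 2. The following is a minimal Python sketch that recomputes them as a consistency check. The helper `pct` and the dictionary layout are illustrative assumptions, not part of the paper.

```python
def pct(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100 * count / total


# Table 1: (correct answers, number of questions) per model and question group.
table1 = {
    "Bing": {"Compulsory": (70, 81), "Others": (149, 200), "Total": (219, 281)},
    "ChatGPT": {"Compulsory": (34, 81), "Others": (72, 200), "Total": (106, 281)},
}

# Table 2: (wrong answers with this reason, all wrong answers) per question type.
table2 = {
    "Misinterpretation of the question statement": {"Normal": (10, 31), "Clinical": (7, 31)},
    "Wrong diagnosis": {"Normal": (0, 31), "Clinical": (3, 31)},
    "Wrong information with a correct diagnosis": {"Normal": (0, 31), "Clinical": (4, 31)},
    "Wrong information": {"Normal": (21, 31), "Clinical": (17, 31)},
}

# Print every cell as "count/total = percentage" for comparison with the tables.
for table in (table1, table2):
    for row, cells in table.items():
        for column, (count, total) in cells.items():
            print(f"{row} | {column}: {count}/{total} = {pct(count, total):.1f}%")
```

Each printed value can be compared with the corresponding cell above. In Table 2 the counts in each column add up to 31, so every wrong answer is assigned exactly one reason.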