A team of researchers in America has found the artificial intelligence chatbot ChatGPT is able to pass the United States Medical Licensing Exam.
Researchers from Massachusetts General Hospital and AnsibleHealth evaluated OpenAI's ChatGPT's performance on the examination, which comprises three standardised tests that medical students need to pass to obtain their medical licence.
The team used publicly available test questions from a June 2022 sample exam and removed any question that required visual assessment.
Tiffany Kung, MD, a resident in the Department of Anesthesia, Critical Care and Pain Medicine at Massachusetts General Hospital, is the lead author of a new research article in PLOS Digital Health, "Performance of ChatGPT on USMLE: Potential for AI-assisted Medical Education Using Large Language Models".
She said: “We found that ChatGPT performed at or near the passing threshold of 60 per cent accuracy.
“Being the first to achieve this benchmark is a notable milestone in AI maturation, and notably, ChatGPT was able to achieve these results without specialised input from clinician trainers.
“Furthermore, ChatGPT displayed understandable reasoning and valid clinical insights, leading to increased confidence in trust and explainability.
“The study suggests that large language models such as ChatGPT may potentially assist human learners in medical education and could be a prelude to further integration of AI in clinical settings.
“As an example, clinicians at AnsibleHealth are already utilising ChatGPT to translate technical medical reports into more easily understandable language for patients.”
ChatGPT is a generative large language model (LLM), engineered to produce human-like writing by predicting the sequence of words that comes next.
Unlike conventional chatbots, ChatGPT cannot search the web; instead, it generates text using the word associations predicted by its internal algorithms.
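To illustrate the next-word-prediction idea the article describes, here is a toy sketch. Real LLMs like GPT-3 use neural networks over subword tokens and vastly larger training data; this bigram counter is only a minimal, hypothetical illustration of the same underlying principle of predicting a likely continuation from word associations.

```python
# Toy illustration of next-word prediction. A real LLM uses a neural
# network over subword tokens; this simple bigram counter only sketches
# the idea of choosing a likely continuation from observed associations.
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for each word, which words follow it in the corpus."""
    words = corpus.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent continuation seen in training, or None."""
    counter = follows.get(word.lower())
    if not counter:
        return None
    return counter.most_common(1)[0][0]


model = train_bigrams("the patient has a fever the patient has chills")
print(predict_next(model, "patient"))  # prints "has"
```

A model like this can only echo associations it has already seen, which is why the study's check that the exam questions were genuinely out-of-training matters.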
All examination inputs represented true out-of-training samples for the GPT-3 model, and the team checked that none of the answers, explanations or related content was available on Google before January 1, 2022, the date of the last available training set.
The questions were formatted into three variants:
Open-ended prompting ("What would be the patient’s diagnosis based on the information provided?")
Multiple choice single answer without forced justification ("The patient's condition is mostly caused by which of the following pathogens?")
Multiple choice single answer with forced justification ("Which of the following is the most likely reason for the patient’s nocturnal symptoms? Explain your rationale for each choice.")
Each question was input into ChatGPT in separate chats to reduce retention bias.
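The three variants above can be sketched as simple prompt templates. The helper names, field layout and sample item below are assumptions made for illustration; they are not the study's actual data format or wording beyond the example prompts quoted above.

```python
# Hypothetical sketch of reformatting one exam item into the study's
# three prompt variants. The sample stem and choices are invented for
# illustration only.

def to_open_ended(stem):
    """Open-ended prompting: no answer choices are shown."""
    return (f"{stem}\n"
            "What would be the patient's diagnosis "
            "based on the information provided?")


def to_multiple_choice(stem, choices):
    """Single-answer multiple choice without forced justification."""
    lettered = "\n".join(f"{chr(65 + i)}. {c}" for i, c in enumerate(choices))
    return f"{stem}\n{lettered}\nAnswer with the single best choice."


def to_justified_choice(stem, choices):
    """Single-answer multiple choice with forced justification."""
    return (to_multiple_choice(stem, choices)
            + "\nExplain your rationale for each choice.")


stem = "A 30-year-old presents with a nocturnal cough."
choices = ["Asthma", "Gastro-oesophageal reflux", "Heart failure"]
print(to_justified_choice(stem, choices))
```

Per the study's protocol, each formatted prompt would then be submitted in a fresh chat session, so that answers to earlier questions could not influence later ones.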