This study examines the use of an AI-powered virtual assistant for the rapid identification and management of neurological emergencies, particularly in settings with limited medical resources. The research aimed to determine whether this AI tool is safe and accurate enough to proceed to more advanced testing stages. In a first-of-its-kind trial, the virtual assistant was tested with patients presenting urgent neurological conditions. Neurologists first reviewed the AI's recommendations using clinical records and then assessed its performance directly with patients. The findings were as follows: neurologists agreed with the AI's decisions nearly all the time, and the AI outperformed earlier versions of ChatGPT in every tested aspect. Patients and doctors found the AI highly effective, rating it as excellent or very good in most cases. These results suggest the AI could significantly enhance how quickly and accurately neurological emergencies are addressed, although further trials are needed before it can be widely used.
Background and Objectives: Neurological emergencies pose significant challenges in medical care, especially in resource-limited countries. Artificial Intelligence (AI), particularly health chatbots, offers a promising solution. However, rigorous validation is required to ensure safety and accuracy. The objective of our work is to evaluate the diagnostic accuracy and resolution effectiveness of an AI-powered virtual assistant designed for the triage of emergency neurological pathologies, in order to ensure the minimum standard of safety required to progress to successive validation tests. Methods: This Phase 1 trial evaluates the performance of an AI-powered virtual assistant for emergency neurological triage. Ten patients over 18 years old with urgent neurological pathologies were selected. In the first stage, nine neurologists assessed the safety of the virtual assistant using the patients' clinical records. In the second stage, the assistant's accuracy when used by patients was evaluated. Finally, its performance was compared with that of ChatGPT 3.5 and 4.
Stage 1 focused on safety, supplying the virtual assistant only with medical information from clinical records. In Stage 2, which evaluated accuracy, participants interacted with the virtual assistant after medical stabilization. Participants also provided initial symptom details for ChatGPT input. Nine neurologists specializing in emergency neurology participated in the study. In Stage 1, they assessed the virtual assistant's performance using clinical history information. In Stage 2, they analyzed the results from participant interactions with the assistant and performed a comparative evaluation of ChatGPT. The virtual assistant functioned as a chatbot on WhatsApp and Telegram, operating in Spanish and incorporating advanced algorithms, decision trees, and large language models for interaction. For comparison, we utilized ChatGPT versions 3.5 and 4, employing two prompt types in natural Spanish: one incorporating clinical record data and the other based on participant narratives.
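The internal design of the assistant's decision trees is not detailed in this report. As a purely illustrative sketch of how a symptom-based triage decision tree might classify urgency (all question names and urgency labels below are hypothetical, not the study's actual algorithm), consider:

```python
# Hypothetical sketch of a decision-tree triage step. The study only
# states that the assistant combines decision trees and large language
# models; these questions and labels are illustrative assumptions.

TRIAGE_TREE = {
    "question": "sudden_focal_deficit",   # e.g. facial droop, limb weakness
    "yes": {"label": "emergency"},        # possible stroke: immediate care
    "no": {
        "question": "worst_headache_ever",
        "yes": {"label": "emergency"},    # rule out subarachnoid hemorrhage
        "no": {
            "question": "new_seizure",
            "yes": {"label": "urgent"},
            "no": {"label": "routine"},
        },
    },
}

def triage(answers: dict) -> str:
    """Walk the tree using boolean symptom answers; missing answers count as 'no'."""
    node = TRIAGE_TREE
    while "label" not in node:
        branch = "yes" if answers.get(node["question"], False) else "no"
        node = node[branch]
    return node["label"]
```

In a deployed chatbot, each `question` node would correspond to a message sent to the patient over WhatsApp or Telegram, with a language model helping to interpret free-text replies into the boolean answers the tree consumes.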
Buenos Aires, Argentina