Beyond QA: A Multifaceted Evaluation of John Snow Labs’ Medical Chatbot
With more than one million new biomedical research papers published per year, providing everyone with fast & reliable answers based on current scientific knowledge is paramount. This session by Veysel Kocaman, the Head of Data Science at John Snow Labs, presents a comprehensive evaluation of how well the company’s state-of-the-art Medical Chatbot meets that goal. The evaluated tasks include question answering on:
- Prebuilt knowledge bases such as PubMed, medRxiv, and Clinical Trials.
- User-specific documents, such as internal & confidential files.
- Structured data in relational databases, without users resorting to SQL or BI tools.
The evaluation covers these dimensions:
- Truthfulness: Ensuring answer fidelity, prioritizing trustworthy sources, and avoiding hallucinations.
- Accuracy: Delivering higher accuracy than general-purpose LLMs and vector databases by using bespoke document splitting, reranking, post-filtering, and similarity search.
- Explainability: Citing sources while generating answers.
- Security: Supporting air-gapped deployment, functioning securely on-premises without requiring internet connectivity or external API calls.
- Expert Preference: A team of medical doctors applied a specialized methodology to evaluate the generated answers on relevance, style, consistency, and appropriateness.
- Recall: Medical doctors evaluated not only the generated answers, but also whether additional research exists that could provide a more recent or complete answer.
- Latency: The speed of building & updating the body of knowledge, calculating embeddings, and running inference to answer user questions.
Gain insights into the capabilities, benchmarks, and applications of the Medical Chatbot, shaping the future of healthcare data retrieval and decision-making support.
Head of Data Science at John Snow Labs
Veysel is a Lead Data Scientist and ML Engineer at John Snow Labs, improving the Spark NLP for Healthcare library and delivering hands-on projects in Healthcare and Life Science.
He is a seasoned data scientist with over ten years of experience and a strong background in every aspect of data science, including machine learning, artificial intelligence, and big data. He is also pursuing his Ph.D. in ML at Leiden University, Netherlands, and delivers graduate-level lectures in ML and Distributed Data Processing.
Veysel has broad experience consulting on Statistics, Data Science, Software Architecture, DevOps, Machine Learning, and AI for several start-ups, boot camps, and companies around the globe. He also speaks at Data Science & AI events, conferences, and workshops, and has delivered more than a hundred talks at international and national conferences and meetups.