Current State-of-the-Art Accuracy for Key Medical Natural Language Processing Benchmarks

As the most widely used NLP library in the healthcare industry, Spark NLP for Healthcare comes with 700+ pretrained clinical models, all developed and trained with the latest SOTA algorithms to solve real-world problems in the healthcare domain at scale.

To keep these models and tools accurate and reliable, to cover the edge cases found in real-world scenarios, and to improve the generalisation power of the DL models, the datasets and models are monitored, augmented, and updated on a regular basis so that they can be used out of the box with no further effort.

In this talk, Veysel will share the latest accuracy benchmarks for the healthcare-specific models of the Spark NLP library (De-Identification, Named Entity Recognition, and Entity Resolution models), compared against the academic benchmarks published by researchers and the commercial solutions offered by major cloud providers (Amazon Comprehend Medical, GCP Healthcare API, and Azure Text Analytics for Health).

About the speaker

Veysel Kocaman

Principal Data Scientist at John Snow Labs

Veysel is a Lead Data Scientist and ML Engineer at John Snow Labs, where he improves the Spark NLP for Healthcare library and delivers hands-on projects in Healthcare and Life Science.

He is a seasoned data scientist with over ten years of experience and a strong background in machine learning, artificial intelligence, and big data. He’s also pursuing his Ph.D. in ML at Leiden University, Netherlands, and delivers graduate-level lectures in ML and Distributed Data Processing.

Veysel has broad experience consulting for start-ups, boot camps, and companies around the globe in Statistics, Data Science, Software Architecture, DevOps, Machine Learning, and AI. He also speaks at Data Science and AI events, conferences, and workshops, and has delivered more than a hundred talks at international and national conferences and meetups.



Sessions: April 5th – 6th 2022
Trainings: April 12th – 15th 2022

