Deep Learning for Relation Extraction from Clinical Documents

Unstructured free-text medical notes are the only source for many critical facts in healthcare. As a result, accurate natural language processing (NLP) is an essential component of many healthcare AI applications, such as clinical decision support, clinical pathway recommendation, cohort selection, and patient risk or abnormality detection. Recent advances in deep learning for NLP have enabled a new level of accuracy and scalability for clinical language understanding, making a broad set of applications possible for the first time.

In the first part of the talk, Dr. Sharma will cover the deep learning techniques, explainability features, and how and where the NLP pipeline architecture has been applied. She will provide a short overview of the key underlying technologies: Spark NLP for Healthcare, BERT embeddings, and healthcare-specific embeddings. Then, she will describe how these were applied to tackle the challenges of a healthcare setting: understanding clinical terminology, extracting specialty-specific facts of interest, and using transfer learning to minimize the required amount of task-specific annotation. She will also cover the use of MLflow and its integration with Spark NLP to track experiments and reproduce results.

In the second part of the talk, Dr. Sharma will cover automated deep learning: the system’s ability to train, tune, and measure models once clinical annotators add or correct labeled data. She will cover the annotation process and guidelines; why automation was required to handle the variety in clinical language across providers, document types, and geographies; and how this works in practice. She will also share how Roche applies Spark NLP for Healthcare to extract clinical knowledge from pathology, radiology, and genomics reports.

About the speaker

Vishakha Sharma

Principal Data Scientist at Roche

Vishakha Sharma is a Principal Data Scientist at Roche Diagnostics Information Solutions, where she leads advanced analytics initiatives in natural language processing (NLP) and machine learning (ML) to discover key insights that improve the NAVIFY product portfolio, leading to better and more efficient patient care. Vishakha has authored 40+ peer-reviewed publications and proceedings and has given 15+ invited talks. She serves on top international scientific and technical program committees and panels at AI/ML/NLP conferences (including NeurIPS, ICLR, and AMIA). Her research has been funded by the NIH Big Data to Knowledge (BD2K) initiative to build NLP software for precision medicine. Vishakha is a senior member of the Association for Computing Machinery (ACM) and the Institute of Electrical and Electronics Engineers (IEEE). She holds a Ph.D. in computer science.



Sessions: October 4 – 6
Trainings: October 11 – 14

