Automating Annotation Process Using Rule-Based Algorithm

In this talk, Priya will explain how to automate an annotation process using a rule-based algorithm, thus eliminating the manual labor of annotating texts. The process will be illustrated with the rule-based approach used to annotate patients' medical notes. The rules are extracted from a model trained on a limited training and validation set.

The talk will focus on annotating medical notes for patients experiencing a particular symptom. The rule-based algorithm extracts knowledge from the classification model in the form of rules, which are easy to comprehend and very expressive.

This algorithm annotates each note as having or not having a particular symptom, making it easier for annotators without a medical background to annotate medical records.
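
As a rough illustration of what annotating a note with rules could look like, here is a minimal sketch. It is not the speaker's implementation: the symptom ("nausea"), the keyword patterns, and the negation cues are all hypothetical and hand-written here, whereas the approach described above derives its rules from a trained classification model.

```python
import re

# Hypothetical rules for one illustrative symptom ("nausea"). In the approach
# described in the talk, rules are extracted from a trained classifier; these
# hand-written patterns only stand in for such rules.
SYMPTOM = re.compile(r"\b(nausea|nauseous|nauseated)\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(no|not|denies|denied|without)\b", re.IGNORECASE)


def annotate(note: str) -> bool:
    """Return True if the note is annotated as having the symptom.

    A note is positive when, in at least one sentence, the first symptom
    mention is not preceded by a negation cue in that same sentence.
    """
    for sentence in re.split(r"[.;\n]+", note):
        match = SYMPTOM.search(sentence)
        if match is None:
            continue
        # Only the text before the symptom mention is checked for negation.
        if NEGATION.search(sentence[:match.start()]) is None:
            return True
    return False
```

Calling `annotate("Patient denies nausea.")` would label the note as not having the symptom, while a note that mentions nausea without a preceding negation cue would be labeled as having it. Because each rule is a readable pattern, an annotator without medical training can see why a note received its label.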

This approach not only helps annotators without a medical background but is also a no-cost method for annotating notes.

About the speaker

Priya Shaji

Data Scientist at Memorial Sloan Kettering Cancer Center

Priya Shaji works as a Data Scientist in the Strategy and Innovation Department at MSKCC. She works on NLP and ML projects, which involve analyzing medical datasets, creating reports, engineering NLP pipelines, and contributing to the Applied Data Science team’s goals.

She completed her Master’s in Data Science at the CUNY School of Professional Studies, New York, where she worked on projects that involved developing data models and visualizations. Prior to this, she worked at the NYC Department of Information Technology and Telecommunications, where she contributed to data science and engineering projects across various domains.



Sessions: October 4 – 6
Trainings: October 11 – 14

