Large Language Models to Facilitate Building of Cancer Data Registries

This talk is aimed at healthcare practitioners and healthcare data teams who are developing cancer registries. I will share my team's experience evaluating the performance of large language models relative to previous NLP methods for classifying cancer disease response and sites of metastases (manuscript under review at JCO CCI). The talk will cover the libraries and tools we used, such as John Snow Labs’ Healthcare NLP & LLM libraries, the effects of data augmentation techniques and prompt-based fine-tuning, and the challenges we faced. It should be useful to lean, low-resource healthcare data registry teams looking for low-cost, sustainable solutions.

The talk will also cover some of our ongoing work with generative AI models for clinical trial matching and discuss healthcare use cases that we think are of interest to the clinical and digital healthcare community. I will also briefly discuss the general potential of, and challenges around, foundation models for generalist medical AI.

About the speaker

Ryan Tan

Consultant at National Cancer Centre Singapore

Dr. Ryan Tan is a Consultant in the Division of Medical Oncology at National Cancer Centre Singapore and a fellow at the Memorial Sloan Kettering Cancer Center. Dr. Tan’s sub-specialty interests are breast cancer and gynaecological cancer. His research interests include informatics and artificial intelligence and their application in healthcare. His research team collaborates with various academic groups and industry partners to develop useful analytic platforms and tools for use by clinicians and researchers.



Sessions: April 2nd – 3rd 2024
Trainings: April 15th – 19th 2024


