A Pretrained Language Model for Public Health Surveillance on Social Media

User-generated text on social media enables health workers to keep track of information, identify possible outbreaks, forecast disease trends, monitor emergency cases, and ascertain disease awareness and response to official health correspondence. This exchange of health information on social media has been regarded as an opportunity to enhance public health surveillance (PHS).

Advancements in pretrained language models (PLMs) have facilitated the development of several domain-specific PLMs and a variety of downstream applications. However, there are no PLMs for social media tasks involving PHS.

In this talk, I will introduce PHS-BERT, a transformer-based PLM for public health surveillance tasks on social media. Whereas existing PLMs are mainly evaluated on a limited set of tasks, PHS-BERT achieved state-of-the-art performance on all 25 tested datasets, showing that it is robust and generalizable across common PHS tasks.

About the speaker

Usman Naseem

PhD Candidate at the University of Sydney, Australia

Usman Naseem is a PhD candidate at the School of Computer Science, The University of Sydney, Australia. Usman obtained his Master's in Analytics (Research) from the School of Computer Science, University of Technology Sydney, Australia, in 2020. Before joining academia, he worked for 9+ years in various roles at leading ICT companies such as Alcatel-Lucent and Nokia.

His primary research is at the intersection of machine learning and natural language processing for social media analytics and biomedical/health informatics.



Sessions: October 4 – 6
Trainings: October 11 – 14


