Identifying Housing Insecurity and Other Social Determinants of Health from Free-Text Notes

Cityblock Health is a community-centered healthcare company that focuses on delivering personalized medical and social care to low-income neighborhoods. In addition to providing care in a clinical setting, we improve outcomes by addressing social determinants of health (SDoH). SDoH are non-clinical, environmental factors that influence access to healthcare resources, the likelihood of developing chronic diseases, and overall clinical outcomes for individuals and communities. The SDoH that affect patient care are often under-represented in structured data, in part because they are primarily captured in unstructured free-text sources such as clinical notes.

A major driver of health outcomes is access to stable housing, so we built a housing classification model to identify members in need of housing and to enable our Community Health Partners to work with those members to address that need. We used Spark NLP for Healthcare to process raw text into sentence-level embeddings, active learning to guide manual annotation of our data, and deep learning with TensorFlow to build the classification model.
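
As a rough illustration of the active learning step, the sketch below uses uncertainty sampling, one common strategy for choosing which sentences to send to annotators. The abstract does not say which strategy was actually used, so this is an assumption, and the function name `select_for_annotation` and the example scores are hypothetical.

```python
import numpy as np

def select_for_annotation(probs, k):
    """Pick the k sentences the current model is least sure about.

    probs: predicted probability of the positive class (e.g. "housing need")
           for each unlabeled sentence, as produced by any binary classifier.
    Returns the indices of the k scores closest to 0.5.
    """
    probs = np.asarray(probs, dtype=float)
    distance_from_boundary = np.abs(probs - 0.5)
    # Ascending sort: the smallest distance is the most uncertain sentence.
    # A stable sort keeps tied sentences in their original order.
    order = np.argsort(distance_from_boundary, kind="stable")
    return order[:k].tolist()

# Hypothetical model scores for five unlabeled sentences
scores = [0.97, 0.52, 0.10, 0.45, 0.80]
picked = select_for_annotation(scores, k=2)
print(picked)
```

In a loop of this kind, the selected sentences would be labeled by hand, added to the training set, and the classifier retrained before the next round of selection.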

In this talk, we will address some of the nuances of implementing our model, present our findings, and discuss how we are using our model results to help our members.

About the speakers

Eric Hilton

Senior Data Scientist at Cityblock Health

Eric Hilton is an astronomer turned senior data scientist at Cityblock Health, where he works on the Data Science Foundations team.

He has experience in a range of techniques and domains, including Monte Carlo simulations, NLP, machine learning, and MLOps.

He is excited to be at Cityblock, where he is able to work on interesting technical problems that have a positive impact on people’s health. He lives and plays in beautiful Whitefish, MT.

Caitlin Dugan

Data Scientist at Cityblock Health

Caitlin Dugan is a data scientist at Cityblock Health with an academic background in epidemiology and biostatistics.

She has broad experience across the healthcare industry and has used data to drive strategy at state health departments, health plans, and large hospital systems before venturing into the health tech startup space. She is passionate about transforming patient care and improving health outcomes with data science.



Sessions: April 5th – 6th 2022
Trainings: April 12th – 15th 2022

