Cleanlab: Making AI Work with Messy, Real-World Healthcare and NLP Data

I’ll start with an overview of cleanlab 2.0, an open-source package that lets you find and fix label errors and other data quality issues in *any* labeled dataset, in just a few lines of code.

Next, I’ll give a (very brief) overview of confident learning, the body of theory and algorithms that makes cleanlab work under the hood.
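To make the idea concrete, here is a minimal sketch of the core confident-learning heuristic in plain Python. It is a simplification for illustration, not cleanlab's actual implementation: the function names `class_thresholds` and `find_likely_label_issues` and the toy data are assumptions made up for this example, and in practice the predicted probabilities should come from a model evaluated out-of-sample (for instance via cross-validation).

```python
from collections import defaultdict


def class_thresholds(labels, pred_probs):
    """Per-class threshold: the mean predicted probability of class k,
    averaged over the examples whose given label is k."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for given, probs in zip(labels, pred_probs):
        sums[given] += probs[given]
        counts[given] += 1
    return {k: sums[k] / counts[k] for k in counts}


def find_likely_label_issues(labels, pred_probs):
    """Return indices of examples that confidently look like a class
    other than the one they are labeled with."""
    thresholds = class_thresholds(labels, pred_probs)
    issues = []
    for i, (given, probs) in enumerate(zip(labels, pred_probs)):
        # Classes whose predicted probability clears that class's threshold.
        confident = [k for k in thresholds if probs[k] >= thresholds[k]]
        if not confident:
            continue  # the model is not confident about any class
        likely = max(confident, key=lambda k: probs[k])
        if likely != given:
            issues.append(i)
    return issues


# Toy data (hypothetical): example 2 is labeled 0 but looks like class 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]

print(find_likely_label_issues(labels, pred_probs))  # → [2]
```

Using a separate threshold per class, rather than one global cutoff, lets the check adapt when the model is systematically more or less confident about some classes than others.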

Finally, I’ll walk through four concrete examples of how cleanlab makes AI solutions work for the NLP and healthcare communities: (1) how companies like Amazon, Google, Tesla, and Wells Fargo have used Cleanlab technology, (2) automatically correcting healthcare insurance codes, (3) real-time assistance during diagnosis, flagging when a medical report might need review by another doctor, and (4) how a startup used open-source cleanlab to understand human emotion in support of AI solutions for patient mental health.

About the speaker
Yanshan Wang

Curtis Northcutt

CEO & Co-Founder at Cleanlab Inc

Curtis Northcutt is an American computer scientist and entrepreneur focusing on machine learning and AI to empower people.

He is the CEO and Co-founder of Cleanlab, an open-source data quality company that makes AI work with real-world, messy data. Curtis completed his PhD at MIT, where he invented Cleanlab’s algorithms for automatically finding and fixing label issues in any dataset. He is the recipient of the MIT Morris Levin Thesis Award, the NSF Fellowship, and the Goldwater Scholarship, and has worked at several leading AI research groups, including Google, Oculus, Amazon, Facebook, Microsoft, and NASA.



Sessions: April 5th – 6th 2022
Trainings: April 12th – 15th 2022

