Data Augmentation for Sequence Labeling. A Case Study in Food Parsing

Supervised deep learning models are widely sought after in the tech industry, where data and classification use cases abound.

While established tech companies have enough resources and data to employ state-of-the-art, transformer-based deep neural models, start-ups that struggle with data acquisition are forced to rely on classical machine learning models for their data processing needs.

In this talk, I will discuss the data augmentation techniques I used in my NLP project for Lark, an acclaimed AI healthcare start-up, to enable their virtual nurse app to unlock the power of deep learning in a use case that is often overlooked in the industry: sequence labeling.

About the speaker

Octavia Sulea 

Machine Learning Engineer at FlexJobs

With 4 years of experience as a Data Scientist and ML Engineer in the US tech industry and 10 years in multiple NLP research labs, Octavia Sulea is an NLP expert with a varied background.

They hold a Ph.D. in CS, with a thesis on linguistic alternation in NLP, an MSc in AI, an MA in Linguistics, and a BSc in CS.

They have written 21 publications, the majority of which came out before they started their Ph.D., and garnered over 200 citations, including one in a Google patent. They created a “classical model” in legal NLP that became illegal in France, and, last year, they contributed to the creation of the first astrology-based dating app, Struck, which was featured in TechCrunch and the LA Times. They are also queer.



Sessions: October 5 – 7
Trainings: October 4, 12 – 15

