“Say it to Play it” – Spoken Language Understanding Challenges in Entertainment Domain
As voice-enabled systems gain popularity, their success depends not only on correctly recognizing what the user said but also on understanding what the user meant.
Spoken Language Understanding (SLU) focuses on interpreting users’ intentions from their spoken utterances. Compared with traditional text-based Natural Language Understanding (NLU), SLU faces its own challenges because spoken language is noisier than written language. Applying SLU to short, command-style utterances that carry little context adds further complexity.
In addition, the Entertainment domain brings challenges of its own: entities such as movies, music artists, and music albums can have unique, creative names that often involve wordplay. The constant and rapid release of new content in this domain, together with names that overlap across entity types such as movies and music albums, poses additional challenges.
The session highlights some of these key challenges and some of the state-of-the-art approaches generally used to address them.
Nidhi Rajshree leads the Natural Language Understanding (NLU) research and development effort for Roku’s Voice Assistant, which enables users to issue voice commands to their Roku devices.
The assistant is built using novel and state-of-the-art techniques in Spoken NLU and Semantic Technologies. Before this, at IBM Research and Watson, Nidhi led several innovations and research efforts in NLU and related fields, applied to Expert-Assist systems.
These systems assisted experts in domains such as Life Sciences, Intelligence, Education, and Public Services in achieving complex cognitive tasks such as decision-making and knowledge discovery. Her work has been recognized internationally in many forms, including publications and showcases at premier conferences and patent grants.