Getting your TV to understand you better

Wednesday, September 5, 2018

New research out of the University of Waterloo has found a way to improve how home entertainment platforms understand spoken queries.

The research, in collaboration with the University of Maryland and Comcast Applied AI Research Lab, uses artificial intelligence (AI) technology to achieve the most natural speech-based interactions with TVs to date.

"Today, we have become accustomed to talking to intelligent agents that do our bidding, from Siri on a mobile phone to Alexa at home. Why shouldn't we be able to do the same with TVs?" asked Jimmy Lin, a professor at the University of Waterloo and David R. Cheriton Chair in the David R. Cheriton School of Computer Science.

"Comcast's Xfinity X1 aims to do exactly that: the platform comes with a 'voice remote' that accepts spoken queries. Your wish is its command. Tell your TV to change channels, ask it about free kids' movies, and even about the weather forecast."

In tackling the complex problem of understanding voice queries, the researchers had the idea to take advantage of the latest AI technology, a technique known as hierarchical recurrent neural networks, to better model context and improve the system's accuracy.
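The core idea of a hierarchical recurrent network is two levels of recurrence: a lower-level RNN encodes each query into a summary state, and an upper-level RNN runs over those per-query states so the final state reflects the whole session. A minimal pure-Python sketch, with toy scalar states and illustrative names (`rnn_step`, `hierarchical_encode` are not from the paper):

```python
import math

def rnn_step(h, x, w_h=0.5, w_x=0.5):
    # One Elman-style recurrent step on a scalar state and input,
    # with a tanh nonlinearity.
    return math.tanh(w_h * h + w_x * x)

def encode(seq):
    # Fold an RNN over a sequence and return the final hidden state.
    h = 0.0
    for x in seq:
        h = rnn_step(h, x)
    return h

def hierarchical_encode(session):
    # Lower level: encode each query (a list of word features) to a state.
    query_states = [encode(q) for q in session]
    # Upper level: run a second RNN over the per-query states, so the
    # final state summarizes the whole session -- the "context".
    return encode(query_states)

# Three successive queries in one viewing session (toy word features).
session = [[0.2, 0.7], [0.1], [0.9, 0.4, 0.3]]
ctx = hierarchical_encode(session)
```

In a real system each state would be a vector with learned weight matrices; the point of the sketch is the structure, where earlier queries in a session influence the representation used to interpret the current one.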

In January 2018, the researchers' new neural network model was deployed in production to answer queries from real users. Unlike the previous system, which was confused by approximately eight per cent of queries, the new model handles most highly complex queries appropriately, greatly enhancing the user experience.

"If a viewer asks for 'Chicago Fire,' which refers to both a drama series and a soccer team, the system is able to decipher what you really want," said Lin. "What's special about this approach is that we take advantage of context, such as previously watched shows and favourite channels, to personalize results, thereby increasing accuracy."
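The disambiguation step can be illustrated with a toy re-ranker: each candidate interpretation gets a base match score, plus a bonus when it agrees with the viewer's history. This is a hypothetical sketch, not the paper's actual scoring function; the `disambiguate` helper and the genre-bonus weight are illustrative assumptions:

```python
def disambiguate(query, candidates, history):
    """Pick the candidate whose title matches the query, preferring
    candidates whose genre appears in the viewer's watch history."""
    def score(c):
        base = 1.0 if query.lower() in c["title"].lower() else 0.0
        bonus = 0.5 if c["genre"] in history else 0.0  # context signal
        return base + bonus
    return max(candidates, key=score)

candidates = [
    {"title": "Chicago Fire", "genre": "drama"},
    {"title": "Chicago Fire", "genre": "soccer"},
]

# A viewer who mostly watches dramas gets the drama series.
best = disambiguate("chicago fire", candidates, history={"drama"})
```

In the deployed system this kind of context signal is learned by the neural network rather than hand-weighted, but the effect is the same: identical query text, different results for different viewers.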

The researchers have started work on developing an even richer model. The intuition is that by analyzing queries from multiple perspectives, the system can better understand what the viewer is saying.

The paper, Multi-Task Learning with Neural Networks for Voice Query Understanding on Entertainment Platforms, was presented at the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, held recently in the United Kingdom. The research was undertaken by Jinfeng Rao, a PhD graduate from the University of Maryland, his advisor Lin, and mentor Ferhan Ture, a researcher at Comcast Applied AI Research Lab.

MEDIA CONTACT | Ryon Jones
519-888-4567 ext. 30031

Attention broadcasters: Waterloo has facilities to provide broadcast-quality audio and video feeds with a double-ender studio. Please contact us for more information.