Bringing context into our conversations with ‘Intelligent’ Digital Assistants


An Intelligent Digital (or Virtual) Assistant is an AI-based application that understands natural-language voice or text commands and completes tasks for the user. You may be aware of Apple's Siri, Google Assistant, Amazon Alexa and Microsoft Cortana, to name a few; a large number of these software entities now exist, with differing levels of sophistication and capability. Yet even the most ‘intelligent’ leave much to be desired: they lack contextual awareness and, as a result, cannot adapt to the user's behaviour or environment.

[Image: conceptual illustration of a human interacting with a robot, with a screen in between]

Now Meta (formerly Facebook), the US multinational technology conglomerate, are attempting to rectify this situation. In February 2022, Meta showcased Project CAIRaoke, a new neural model for digital assistants, which they claim will be capable of much better contextual conversations.

Project CAIRaoke combines four of the AI speech models currently used by digital assistants, so that Meta's assistant technology can better understand context and recognise different phrases used to say the same thing. Meta claims this approach enables a more natural, flowing conversation. They see CAIRaoke as a major component of the ‘Metaverse’, a next-generation internet made more interactive and immersive through Virtual and Augmented Reality technologies.

Meta’s vision for digital assistants is by no means new. It has long been envisaged that intelligent digital assistants would one day become more capable of maintaining the context of natural language conversation for long periods – enabling their rationales and explanations to be more understandable to humans.

Estimated time to maturity: 0 to 2 years

Source: Lifewire