Apple unveils a new AI system that could revolutionize voice assistants

Apple researchers have introduced ReALM (Reference Resolution as Language Modeling), an AI system designed to significantly improve how voice assistants understand and respond to commands.
In a research paper, Apple describes how the system uses large language models to tackle reference resolution: deciphering ambiguous references to on-screen objects while also understanding conversational and background context.
Reference resolution, a key aspect of natural language understanding, has historically posed challenges for digital assistants. ReALM tackles this by recasting the problem as a language modeling task, so that a model can work out which on-screen element a user means and fold that knowledge into the conversation.
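To make the idea concrete, here is a minimal sketch of what "reference resolution as language modeling" could look like: the candidate entities and the user's request are written out as plain text, and a language model is asked to name the entities the request refers to. The `Entity` class, the `build_prompt` function, and the prompt wording are all hypothetical illustrations, not the format used in Apple's paper.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    index: int  # number the model can answer with
    kind: str   # hypothetical type label, e.g. "phone_number"
    text: str   # the text of the entity as shown to the user


def build_prompt(entities, request):
    """Turn candidate entities and a user request into one text prompt.

    Illustrative assumption: a language model would complete the final
    line with the numbers of the entities the request refers to.
    """
    lines = ["Candidate entities:"]
    for entity in entities:
        lines.append(f"{entity.index}. [{entity.kind}] {entity.text}")
    lines.append(f"User request: {request}")
    lines.append("Relevant entity numbers:")
    return "\n".join(lines)


if __name__ == "__main__":
    candidates = [
        Entity(1, "phone_number", "555-0100"),
        Entity(2, "address", "1 Main St"),
    ]
    print(build_prompt(candidates, "call the number on the screen"))
```

Framed this way, choosing the right entity becomes a text completion problem, which is the kind of task large language models are built for.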
By reconstructing the visual layout of a screen as a textual representation, ReALM surpasses traditional methods and, according to Apple's paper, even outperforms OpenAI's GPT-4 on this task. The approach could lead to more intuitive interactions with voice assistants, particularly in hands-free scenarios such as driving, or when assisting users with disabilities.
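One simple way to picture the screen-to-text step is sketched below: on-screen elements are ordered top to bottom and left to right, elements at roughly the same height are placed on the same line, and the result is a block of text a language model can read. The `ScreenElement` class, the `screen_to_text` function, the tab separator, and the `line_margin` value are assumptions made for illustration and may differ from what ReALM actually does.

```python
from dataclasses import dataclass


@dataclass
class ScreenElement:
    text: str
    x: float  # horizontal center of the element, in pixels
    y: float  # vertical center of the element, in pixels


def screen_to_text(elements, line_margin=10.0):
    """Render on-screen elements as text that roughly preserves layout.

    Elements whose vertical centers lie within `line_margin` of the first
    element in a row are treated as sharing that row (assumed heuristic).
    """
    ordered = sorted(elements, key=lambda e: (e.y, e.x))
    rows = []
    for element in ordered:
        if rows and abs(element.y - rows[-1][0].y) <= line_margin:
            rows[-1].append(element)
        else:
            rows.append([element])
    # Within each row, read left to right and separate items with tabs.
    return "\n".join(
        "\t".join(e.text for e in sorted(row, key=lambda e: e.x))
        for row in rows
    )


if __name__ == "__main__":
    screen = [
        ScreenElement("Open until 9pm", x=50, y=140),
        ScreenElement("555-0100", x=200, y=103),
        ScreenElement("Pharmacy", x=50, y=100),
    ]
    print(screen_to_text(screen))
```

In this example the business name and phone number end up on one line and the opening hours on the next, so a request such as "call that number" can be resolved against text alone, without the model ever seeing the screen as an image.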