Data is a fundamental building block of AI systems, and it is unlikely that will ever change. Without a corpus that models can rely on, they cannot learn the relationships that inform their predictions.
But why stick with a single corpus? A fascinating report from ABI Research estimates that the installed base of AI devices will grow from 2.69 billion in 2019 to 4.47 billion in 2024, but that comparatively few of them will be interoperable in the short term. Rather than combining the gigabytes to petabytes of data flowing through them into a single AI model or framework, these devices will work independently and heterogeneously to make sense of the data they're fed.
That's unfortunate, the ABI report argues, because there are gains to be had when these systems work together. As an alternative to this unimodality, the research firm proposes multimodal learning, which consolidates data from different sensors and inputs into a single system.
Multimodal learning can carry complementary information or trends that often become evident only when all the modalities are included in the learning process. Moreover, learning-based methods that combine signals from different modalities can generate more robust inferences than would be possible in a unimodal system.
Consider images and text captions. If different words are paired with similar images, those words are likely being used to describe the same things or objects. Conversely, if certain words appear alongside different images, that suggests those images represent the same object. Given this, an AI model should be able to predict image objects from text descriptions, and indeed, a body of academic literature has shown this to be the case.
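The intuition above can be sketched with a toy retrieval example. In a minimal sketch (not any specific model's method), images and captions are projected into one shared embedding space, and a caption's nearest image by cosine similarity is treated as its predicted object. The vectors and names below are hand-picked stand-ins, not outputs of a real encoder.

```python
import numpy as np

# Hypothetical embeddings in a shared image-text space.
image_embeddings = {
    "photo_of_dog": np.array([0.9, 0.1, 0.0]),
    "photo_of_car": np.array([0.1, 0.9, 0.1]),
}
text_embeddings = {
    "a dog in the park": np.array([0.8, 0.2, 0.1]),
    "a red sports car": np.array([0.0, 1.0, 0.2]),
}

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_image(caption):
    """Return the image whose embedding lies closest to the caption's."""
    query = text_embeddings[caption]
    return max(image_embeddings,
               key=lambda name: cosine(query, image_embeddings[name]))

print(best_image("a dog in the park"))  # photo_of_dog
```

Real systems learn these embeddings from co-occurring image–caption pairs; the retrieval step at the end is the same idea at scale.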
Despite the many benefits of multimodal machine learning approaches, the ABI report finds that most platform companies continue to focus on unimodal systems.
Fortunately, there is still hope for broad multimodal adoption. ABI Research expects the total number of devices shipped with multimodal learning to grow from 3.94 million in 2017 to 514.12 million in 2023, driven by uptake in robotics, consumer, health care, and media and entertainment. Companies like Waymo are using multimodal approaches to build hyperaware self-driving vehicles, while researchers like Omesh Tickoo, a principal engineer at Intel Labs, are studying how to glean useful sensor data from real-world environments.
"In a noisy scenario You may not be able to retrieve much information with your audio sensors, but if the lighting is good, you may be able to get a little better informed with a camera," explained Tickoo VentureBeat in a telephone interview. "We used techniques to determine the context, such as the time of day, to create a system that will tell you when the data from a sensor is not of the highest quality. Given this confidence, different sensors are balanced against each other at different intervals, and the right mixture is selected to obtain the response sought. "
Multimodal learning does not necessarily replace unimodal learning, which remains highly effective in applications such as image recognition and natural language processing. But as sensor-laden electronics become cheaper and more scalable, multimodal approaches will likely grow in importance.
"Classification, decision making and HMI systems will play a key role in introducing multimodal learning and will be a catalyst in refining and standardizing some of the technical approaches," said Stuart Carlaw, ABI Research's Chief Research Officer. in a statement. "There are impressive impulses for integrating multimodal applications into devices."
For AI coverage, send news tips to Khari Johnson and Kyle Wiggers, and be sure to subscribe to the weekly AI newsletter and bookmark our AI channel.
Senior AI Staff Writer
P.S. Enjoy this video of Bill Gates discussing AI at the Bloomberg New Economy Forum in Beijing, including his thoughts on climate change and nuclear energy.