Qualcomm brings on-device AI to mobile and PC
Qualcomm is no stranger to running artificial intelligence and machine learning systems on-device, without an internet connection. The company has been doing it with its camera chipsets for years. But on Tuesday at Snapdragon Summit 2023, it announced that on-device AI is finally coming to mobile devices and Windows 11 PCs as part of the new Snapdragon 8 Gen 3 and X Elite chips.
Both chipsets were built from the ground up with generative AI capabilities in mind and can support a variety of large language models (LLMs), language vision models (LVMs) and transformer network-based automatic speech recognition (ASR) models, up to 10 billion parameters for the SD8 Gen 3 and 13 billion parameters for the X Elite, entirely on-device. That means you’ll be able to run anything from Baidu’s ERNIE 3.5 to OpenAI’s Whisper, Meta’s Llama 2 or Google’s Gecko on your phone or laptop, without an internet connection. Qualcomm’s chips are optimized for voice, text and image inputs.
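For a rough sense of what those parameter limits mean for device memory, here is a back-of-the-envelope sketch. The bit widths (16, 8 and 4 bits per weight) are illustrative assumptions, not figures from Qualcomm, and the estimate covers the model weights only, not activations or any other runtime overhead.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in gigabytes (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Parameter limits quoted in the article; bit widths are assumed for illustration.
for chip, params in [("SD8 Gen 3", 10), ("X Elite", 13)]:
    for bits in (16, 8, 4):
        size = weight_footprint_gb(params, bits)
        print(f"{chip}: {params}B parameters at {bits}-bit is about {size:.1f} GB of weights")
```

Under these assumptions, lowering the precision of each weight is what brings a model of this size within reach of phone or laptop memory.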
“It's important to have a wide array of support underneath the hood for these models to be running and therefore heterogeneous compute is extremely important,” Durga Malladi, SVP & General Manager, Technology Planning & Edge Solutions at Qualcomm, told reporters at a prebriefing last week. “We have state-of-the-art CPU, GPU, and NPU (Neural Processing Unit) processors that are used concurrently, as multiple models are running at any given point in time.”
The Qualcomm AI Engine comprises the Oryon CPU, the Adreno GPU and the Hexagon NPU. Combined, they handle up to 45 TOPS (trillions of operations per second) and can crunch 30 tokens per second on laptops and 20 tokens per second on mobile devices, tokens being the basic units of text or data that LLMs process and generate. The chipsets use Samsung’s 4.8GHz LPDDR5x DRAM for their memory allocation.
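To put those throughput figures in everyday terms, the short sketch below converts tokens per second into waiting time for a reply. The 300-token reply length is a hypothetical example, not a Qualcomm number, and the calculation assumes the quoted rate holds steady for the whole response.

```python
# Throughput figures quoted in the article, in tokens per second.
TOKENS_PER_SECOND = {"laptop": 30, "mobile": 20}


def generation_seconds(num_tokens: int, device: str) -> float:
    """Time to generate num_tokens at the quoted steady-state rate."""
    return num_tokens / TOKENS_PER_SECOND[device]


reply_tokens = 300  # hypothetical reply length, chosen for illustration
for device in TOKENS_PER_SECOND:
    seconds = generation_seconds(reply_tokens, device)
    print(f"{device}: {reply_tokens} tokens in about {seconds:.0f} seconds")
```

At these rates, a reply of that assumed length would take about 10 seconds on a laptop and about 15 seconds on a phone.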
“Generative AI has demonstrated the ability to take very complex tasks, solve them and resolve them in a very efficient manner,” Malladi continued. Potential use cases could include meeting and document summarization or email drafting for consumers, and prompt-based computer code or music generation for enterprise applications, he noted.
Or you could just use it to take pretty pictures. Qualcomm is also integrating its previous work on edge AI, Cognitive ISP. Devices using these chipsets will be able to edit photos in real time and in as many as 12 layers. They’ll also be able to capture clearer images in low light, remove unwanted objects from photos (à la Google’s Magic Eraser) or expand image backgrounds. Users can even watermark their shots as being real and not AI generated, using Truepic photo capture.
Having an AI that lives primarily on your phone or mobile device, rather than in the cloud, will offer users myriad benefits over the current system. Much like enterprise AIs that take a general model (e.g. GPT-4) and tune it using a company’s internal data to provide more accurate and on-topic answers, a locally stored AI will “over time… gradually get personalized,” Malladi said, “in the sense that… the assistant gets smarter and better, running on the device in itself.”
What’s more, the inherent delay present when the model has to query the cloud for processing or information doesn’t exist when all of the assets are local. As such, both the X Elite and SD8 Gen 3 are capable of not only running Stable Diffusion on-device but generating images in less than 0.6 seconds.
The capacity to run bigger and more capable models, and interact with those models using our speaking words instead of our typing words, could ultimately prove the biggest boon to consumers. “There's a very unique way in which we start interfacing the devices and voice becomes a far more natural interface towards these devices — as well in addition to everything else,” Malladi said. “We believe that it has the potential to be a transformative moment, where we start interacting with devices in a very different way compared to what we've done before.”
Mobile devices and PCs are just the start for Qualcomm’s on-device AI plans. The 10-13 billion parameter limit is already moving towards 20 billion-plus parameters as the company develops new chip iterations. “These are very sophisticated models,” Malladi commented. “The use cases that you build on this are quite impressive.”
“When you start thinking about ADAS (Advanced Driver Assistance Systems) and you have multi-modality [data] coming in from multiple cameras, IR sensors, radar, lidar — in addition to voice, which is the human that is inside the vehicle in itself,” he continued. “The size of that model is pretty large, we're talking about 30 to 60 billion parameters already.” Eventually, these on-device models could approach 100 billion parameters or more, according to Qualcomm’s estimates.