Meta shrinks Llama AI for smartphones and low-powered devices

GEEK DESK
Oct 25

Meta Platforms is making its Llama 3.2 large language models even more accessible by introducing “quantized” versions specifically optimized for smartphones and other devices with limited processing power. This move aims to bring the power of generative AI to a wider range of hardware, opening up new possibilities for on-device AI applications.

The Need for Smaller AI Models

Large language models (LLMs) like Llama are typically resource-intensive, requiring significant computing power and memory to function. This can be a barrier to deploying them on devices with limited resources, such as smartphones and embedded systems. Quantized models address this challenge by reducing the model size and computational demands without sacrificing too much performance.

Quantization: Shrinking AI for Portability

Quantization is a technique that reduces the precision of the numerical values used to represent the model’s parameters. This shrinks the overall size of the model and allows for faster processing, making it suitable for devices with less memory and processing power.
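The idea can be sketched in a few lines. The snippet below shows simple 8-bit symmetric quantization of a small weight tensor; it is only an illustration of the precision-for-size trade-off, not Meta's actual pipeline (which uses QLoRA and SpinQuant):

```python
import numpy as np

# A handful of example float32 weights (4 bytes each).
weights = np.array([0.42, -1.37, 0.05, 2.11, -0.88], dtype=np.float32)

# Map the float range onto signed 8-bit integers via a single scale factor.
scale = np.abs(weights).max() / 127.0
q = np.round(weights / scale).astype(np.int8)  # 1 byte per value instead of 4

# At inference time, dequantize to approximate the original values.
restored = q.astype(np.float32) * scale

print(q)                                  # compact int8 storage
print(np.abs(weights - restored).max())   # small reconstruction error
```

Each stored value shrinks from 32 bits to 8, a 4x saving on this tensor, at the cost of a bounded rounding error; production methods add per-channel scales and calibration to keep that error from degrading model quality.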

Meta employed two quantization methods for its Llama 3.2 1B and 3B models:

  • QLoRA: This method prioritizes accuracy, ensuring the quantized model performs as closely as possible to the original, even with reduced precision.
  • SpinQuant: This method prioritizes portability, allowing for even greater model compression at the potential cost of some accuracy.

Performance on Low-Powered Devices

Meta’s testing showed that the quantized Llama models achieved an average model size reduction of 56% and a two- to four-times speedup in inference processing. On Android smartphones, the models used 41% less memory while maintaining performance comparable to the full-sized versions.
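To see why reductions of this magnitude matter on a phone, a back-of-the-envelope calculation helps. The figures below are illustrative only (they ignore scales, metadata, and activation memory, which is why Meta's measured 56% average reduction differs from the raw bit arithmetic):

```python
# Rough weight-storage footprint: parameters x bits per parameter.
def footprint_gb(params: float, bits_per_param: float) -> float:
    return params * bits_per_param / 8 / 1e9

for params, name in [(1e9, "Llama 3.2 1B"), (3e9, "Llama 3.2 3B")]:
    fp16 = footprint_gb(params, 16)   # 16-bit floats: 2 bytes per weight
    int4 = footprint_gb(params, 4)    # 4-bit quantized: 0.5 bytes per weight
    print(f"{name}: {fp16:.1f} GB (16-bit) -> {int4:.2f} GB (4-bit)")
```

A 3B-parameter model drops from roughly 6 GB of weights at 16-bit precision to about 1.5 GB at 4-bit, the difference between exceeding and fitting comfortably within a typical smartphone's available RAM.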

Partnerships and Optimizations

Meta collaborated with Qualcomm and MediaTek to optimize the quantized Llama models for their Arm-based mobile chips, and used Arm's KleidiAI kernels to accelerate inference on mobile CPUs. These optimizations enable developers to create AI experiences that run directly on users’ devices, enhancing privacy and responsiveness.

Expanding the Reach of Generative AI

The release of these quantized Llama models is part of Meta’s broader push to democratize access to generative AI. By enabling these models to run on a wider range of devices, Meta is empowering developers to create innovative AI applications for various platforms and use cases. This move could lead to a surge in AI-powered features on smartphones and other everyday devices, bringing the capabilities of LLMs to a wider audience.
