DeepSeek debuts Sparse Attention model to cut inference costs in half

GEEK STAFF
Sep 30, 2025

DeepSeek has unveiled a new experimental AI model, V3.2-exp, that uses a technique called Sparse Attention to reduce inference costs, particularly for long-context operations. Announced via Hugging Face and accompanied by an academic paper on GitHub, the model introduces a pair of mechanisms designed to streamline how transformer models process large amounts of text.

At the core of the system is what DeepSeek calls a “lightning indexer,” which scans the full context window and identifies the most relevant excerpts. A second component, the “fine-grained token selection system,” then narrows down specific tokens within those excerpts for the model to process. By spending compute only on the tokens that matter, the model can handle long-context workloads at a fraction of the cost of dense attention over the full window, as the sketch below illustrates.
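DeepSeek’s paper spells out the actual scoring mechanics; purely as an illustration of the two-stage idea, here is a minimal NumPy sketch for a single query vector. Everything in it is an assumption for demonstration: the function name, the mean-pooled dot-product excerpt scoring, the fixed excerpt size, and the token budget are ours, not DeepSeek’s design.

```python
import numpy as np

def sparse_attention_sketch(q, keys, values, n_excerpts=4, excerpt_len=64, k_tokens=128):
    """Two-stage selection: cheaply score excerpts, keep the top tokens
    inside the winning excerpts, then attend over that reduced set only."""
    seq_len, d = keys.shape

    # Stage 1 (stand-in for the "lightning indexer"): score fixed-size
    # excerpts with a cheap dot product and keep the highest-scoring ones.
    n_chunks = seq_len // excerpt_len
    chunk_scores = np.array([
        keys[i * excerpt_len:(i + 1) * excerpt_len].mean(axis=0) @ q
        for i in range(n_chunks)
    ])
    top_chunks = np.argsort(chunk_scores)[-n_excerpts:]

    # Stage 2 (stand-in for "fine-grained token selection"): within the
    # surviving excerpts, keep only the k individual tokens that score highest.
    cand = np.concatenate([
        np.arange(c * excerpt_len, (c + 1) * excerpt_len) for c in top_chunks
    ])
    token_scores = keys[cand] @ q
    keep = cand[np.argsort(token_scores)[-k_tokens:]]

    # Ordinary softmax attention, but only over the selected tokens.
    logits = keys[keep] @ q / np.sqrt(d)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return weights @ values[keep]

# Toy usage: a 4,096-token context, but attention touches only 128 tokens.
rng = np.random.default_rng(0)
q = rng.standard_normal(64)
K = rng.standard_normal((4096, 64))
V = rng.standard_normal((4096, 64))
out = sparse_attention_sketch(q, K, V)
```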

Early tests suggest this approach could cut the cost of API calls by up to half in long-context scenarios, though the company notes that further third-party evaluations are needed to validate the results. Because the model is open-weight and freely available, researchers and developers will be able to benchmark its performance independently in the coming weeks.

The release highlights a growing push across the AI industry to address inference costs — the ongoing expense of running large models once they’ve been trained. Unlike training, which is a one-time cost, inference requires continuous server power for every query and response, making efficiency crucial for commercial viability. Sparse Attention represents DeepSeek’s attempt to re-engineer parts of the transformer architecture to make it leaner.
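To see why this matters for the economics, consider a toy cost model (the numbers are illustrative assumptions, not DeepSeek’s published figures): with dense attention, every generated token pays a cost proportional to the full context length, while a sparse selector caps that cost at the tokens it keeps plus a cheap indexing pass.

```python
# Rough, illustrative arithmetic only. Attention cost per generated token
# scales with how many context tokens each query must attend to.
context_tokens = 128_000          # a long-context prompt
selected_tokens = 2_048           # tokens kept by a sparse selector (assumed)

full_cost = context_tokens        # dense attention reads every context token
sparse_cost = selected_tokens + context_tokens // 64  # plus a cheap indexing pass

print(f"relative attention work: {sparse_cost / full_cost:.1%} of dense")
# -> roughly 3.2% of the dense attention work per decoded token in this toy model
```

Attention is only one slice of total inference cost (feed-forward layers, memory bandwidth, and serving overhead are untouched), which is why attention-level savings this large would plausibly translate into the roughly halved API costs DeepSeek reports rather than a 30x reduction.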

DeepSeek, based in China, has positioned itself as a cost-conscious AI developer in a market dominated by U.S. firms. Earlier this year, its R1 model drew attention for being trained with reinforcement learning at a fraction of the cost of American rivals, though it fell short of sparking the disruption some predicted. While the V3.2-exp release is less likely to generate headlines on the same scale, it may have a more immediate impact by offering practical tools to reduce operating expenses for long-context AI applications.
