Absolute Geeks UAE

DeepSeek debuts Sparse Attention model to cut inference costs in half

GEEK DESK
Sep 30

DeepSeek has unveiled a new experimental AI model, V3.2-exp, that uses a technique called Sparse Attention to reduce inference costs, particularly for long-context operations. Announced via Hugging Face and accompanied by an academic paper on GitHub, the model introduces a pair of mechanisms designed to streamline how transformer models process large amounts of text.

At the core of the system is what DeepSeek calls a “lightning indexer,” which scans the full context window and identifies the most relevant excerpts. A second component, the “fine-grained token selection system,” then narrows down specific tokens within those excerpts for the model to process. By focusing computational resources only on the most meaningful sections, the model can handle long-context workloads without the same server strain.
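The two-stage idea can be sketched in a few lines of NumPy. This is an illustrative toy, not DeepSeek's actual algorithm: the block scoring, the `block_size`/`top_blocks`/`top_tokens` parameters, and the simple dot-product indexer are all assumptions standing in for the paper's "lightning indexer" and fine-grained token selector.

```python
import numpy as np

def sparse_attention(q, K, V, block_size=4, top_blocks=2, top_tokens=4):
    """Toy two-stage sparse attention:
    1) a cheap 'indexer' scores fixed-size blocks of the context,
    2) the highest-scoring tokens within the kept blocks are selected,
    3) full softmax attention runs only over those tokens.
    """
    n, d = K.shape
    # Stage 1: indexer stand-in -- score each block by the dot product
    # of the query with the block's mean key.
    n_blocks = n // block_size
    block_scores = np.array([
        q @ K[b * block_size:(b + 1) * block_size].mean(axis=0)
        for b in range(n_blocks)
    ])
    keep = np.argsort(block_scores)[-top_blocks:]
    candidates = np.concatenate([
        np.arange(b * block_size, (b + 1) * block_size) for b in keep
    ])
    # Stage 2: fine-grained token selection within the kept blocks.
    token_scores = K[candidates] @ q
    selected = candidates[np.argsort(token_scores)[-top_tokens:]]
    # Standard scaled-dot-product attention over the selected tokens only.
    logits = (K[selected] @ q) / np.sqrt(d)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return weights @ V[selected], selected

rng = np.random.default_rng(0)
K = rng.normal(size=(16, 8))
V = rng.normal(size=(16, 8))
q = rng.normal(size=8)
out, selected = sparse_attention(q, K, V)
print(out.shape, len(selected))  # attends to 4 of 16 context tokens
```

The efficiency argument is visible even in the toy: the expensive softmax step touches only `top_tokens` positions instead of the full context, so cost stops scaling with the raw context length.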

Early tests suggest this approach could cut the cost of API calls by up to half in long-context scenarios, though the company notes that further third-party evaluations are needed to validate the results. Because the model is open-weight and freely available, researchers and developers will be able to benchmark its performance independently in the coming weeks.

The release highlights a growing push across the AI industry to address inference costs — the ongoing expense of running large models once they’ve been trained. Unlike training, which is a one-time cost, inference requires continuous server power for every query and response, making efficiency crucial for commercial viability. Sparse Attention represents DeepSeek’s attempt to re-engineer parts of the transformer architecture to make it leaner.

DeepSeek, based in China, has positioned itself as a cost-conscious AI developer in a market dominated by U.S. firms. Earlier this year, its R1 model drew attention for being trained with reinforcement learning at a fraction of the cost of American rivals, though it fell short of sparking the disruption some predicted. While the V3.2-exp release is less likely to generate headlines on the same scale, it may have a more immediate impact by offering practical tools to reduce operating expenses for long-context AI applications.
