Dnotitia Unveils STAR-KV Compression Method Selected as ICML 2026 Spotlight Paper

July 02, 2026

Dnotitia has introduced STAR-KV, a key-value (KV) cache compression method that shrinks the cache by up to 20 times and raises generation throughput by more than three times. The research, developed with UC San Diego, has been selected as a Spotlight paper at ICML 2026 in Seoul.

Dnotitia Inc. has released the paper and source code for STAR-KV, a key-value cache compression method that achieves up to a 20-fold reduction in cache size along with faster inference, the company announced in a press release. The research was conducted in collaboration with UC San Diego's VVIP Lab.

The paper, titled "STAR-KV: Low-Rank KV Cache Compression via Soft Thresholding for Adaptive Rank Control," was selected as a Spotlight paper at ICML 2026, which will be held in Seoul from July 6 to 11. Of the 23,918 papers reviewed, 536 (about 2.2 percent) received this recognition.
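The paper's title names soft thresholding as the mechanism for adaptive rank control. The article does not describe the actual procedure, so the following is only a minimal sketch of the generic soft-thresholding operator applied to singular values, not Dnotitia's algorithm. The singular values, the threshold `tau`, and both function names are hypothetical and chosen purely for illustration.

```python
def soft_threshold(singular_values, tau):
    """Generic soft thresholding: shrink each value by tau and clip at zero."""
    return [max(s - tau, 0.0) for s in singular_values]


def effective_rank(singular_values):
    """Count the singular values that remain nonzero after thresholding."""
    return sum(1 for s in singular_values if s > 0.0)


# Hypothetical singular values of one cached key matrix, largest first.
sigma = [9.0, 4.0, 1.5, 0.4, 0.1]

shrunk = soft_threshold(sigma, tau=1.0)
rank = effective_rank(shrunk)

print("thresholded singular values:", shrunk)
print("retained rank:", rank)
```

With these made-up numbers, raising `tau` zeroes out more of the small singular values, so the retained rank falls as the threshold rises. That is the general sense in which a threshold can control rank without fixing it in advance.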

According to Dnotitia, STAR-KV reduces key-value cache size by up to 75 percent using low-rank compression alone, and by up to 20 times when combined with a mixed-precision quantization method. The technique also uses custom GPU kernels to speed up attention computation by up to 6.9 times and overall generation throughput by up to 3.1 times.
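As a rough check on how the two reported size figures relate, the arithmetic below converts the 75 percent reduction into a compression ratio and derives what the quantization stage would have to contribute to reach 20 times. It assumes that the two stages multiply and that the uncompressed cache stores 16-bit values. Neither assumption is stated by Dnotitia, and the variable names are illustrative.

```python
from fractions import Fraction

low_rank_reduction = Fraction(75, 100)  # reported: up to 75 percent smaller
combined_ratio = Fraction(20)           # reported: up to 20 times smaller
baseline_bits = 16                      # assumption: FP16/BF16 cache entries

# A 75 percent reduction leaves one quarter of the data, i.e. a 4x ratio.
low_rank_ratio = 1 / (1 - low_rank_reduction)

# Assumption: the stages compose multiplicatively.
implied_quant_ratio = combined_ratio / low_rank_ratio

# Average bits per stored value that would give that quantization ratio.
implied_avg_bits = baseline_bits / implied_quant_ratio

print(f"low-rank ratio: {float(low_rank_ratio)}x")
print(f"implied quantization ratio: {float(implied_quant_ratio)}x")
print(f"implied average bit width: {float(implied_avg_bits)} bits")
```

Under these assumptions, low-rank compression accounts for a 4x reduction, quantization for a further 5x, and the implied average precision is about 3.2 bits per value, which is consistent with a mixed-precision scheme rather than a single fixed bit width.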

Dnotitia plans to develop STAR-KV further for production AI service environments and to integrate it with open-source large language model inference frameworks such as vLLM.
