Dnotitia Unveils STAR-KV Compression Method Selected as ICML 2026 Spotlight Paper
Dnotitia Inc. has released the paper and source code for STAR-KV, a key-value (KV) cache compression method that the company says shrinks the cache by up to 20 times and speeds up inference, according to a company press release. The research was conducted in collaboration with UC San Diego's VVIP Lab.
The paper, titled "STAR-KV: Low-Rank KV Cache Compression via Soft Thresholding for Adaptive Rank Control," was selected as a Spotlight paper at ICML 2026, which will be held in Seoul from July 6 to 11. Out of 23,918 reviewed papers, 536 received this recognition.
According to Dnotitia, STAR-KV reduces key-value cache size by up to 75 percent using low-rank compression alone, and by up to 20 times when combined with a mixed-precision quantization method. The technique also uses custom GPU kernels, which the company says speed up attention computation by up to 6.9 times and raise overall generation throughput by up to 3.1 times.
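To make the figures above concrete, here is a minimal Python sketch of generic low-rank key caching with soft-thresholded singular values. It is an illustration only, not Dnotitia's implementation: the function name `soft_threshold_factorize`, the threshold `tau`, the matrix sizes, and the synthetic rank-32 projection are all assumptions chosen so that the cache shrinks by 75 percent. The final lines work through the arithmetic implied by the article's numbers, assuming both peak figures describe the same configuration, which the article does not confirm.

```python
import numpy as np

def soft_threshold_factorize(W, tau):
    """Factor W ~= A @ B, keeping singular values that survive soft thresholding.

    Hypothetical helper for illustration; not taken from the STAR-KV code.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)        # soft thresholding: shrink, then clip at 0
    rank = int(np.count_nonzero(s_shrunk))     # retained rank adapts to the spectrum
    A = U[:, :rank] * s_shrunk[:rank]          # shape (d_model, rank)
    B = Vt[:rank, :]                           # shape (rank, d_head)
    return A, B, rank

# Synthetic key projection with true rank 32 (an assumption for the demo).
rng = np.random.default_rng(0)
d_model, d_head, true_rank, n_tokens = 128, 128, 32, 10
W_k = rng.standard_normal((d_model, true_rank)) @ rng.standard_normal((true_rank, d_head))
X = rng.standard_normal((n_tokens, d_model))   # stand-in for token hidden states

A, B, rank = soft_threshold_factorize(W_k, tau=1e-6)

keys_full = X @ W_k          # what an uncompressed cache would store
latent_cache = X @ A         # what a low-rank cache stores instead
keys_rebuilt = latent_cache @ B

size_reduction = 1 - latent_cache.size / keys_full.size
print(f"retained rank: {rank}, cache size reduction: {size_reduction:.0%}")

# Arithmetic implied by the reported figures (same-configuration assumption).
low_rank_factor = 1 / (1 - 0.75)                # a 75 percent reduction is a 4x shrink
implied_quant_factor = 20 / low_rank_factor     # remaining factor left to quantization
print(f"low-rank factor: {low_rank_factor}, implied quantization factor: {implied_quant_factor}")
```

In this sketch the rank is not fixed in advance: it falls out of how many singular values exceed `tau`, which is the general idea behind using a soft threshold for rank control. How STAR-KV chooses or learns its thresholds is described in the paper, not here.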
Dnotitia plans to further develop STAR-KV for AI service environments and adapt it to open-source large language model (LLM) inference frameworks such as vLLM.