Verkor Introduces VerTQ, TurboQuant Accelerator for LLM Inference
Verkor has announced VerTQ in a press release, describing it as the industry's first TurboQuant accelerator silicon IP. The technology implements the TurboQuant algorithm developed by Google Research, which cuts key-value (KV) cache memory usage in large language models by a factor of 4.3 while maintaining or improving performance.
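To put the 4.3x figure in context, the short Python sketch below estimates the KV cache size for a hypothetical model and applies the reported reduction. The model dimensions, the 16-bit baseline, and the function names are assumptions chosen for illustration; the announcement does not state the baseline precision or name a specific model.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Estimate KV cache size for one sequence.

    The leading factor of 2 counts both keys and values.
    bytes_per_value=2 assumes a 16-bit baseline (an assumption, not from the article).
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


def compressed_kv_cache_bytes(baseline_bytes, reduction_factor=4.3):
    """Apply the reduction factor reported for TurboQuant to a baseline size."""
    return baseline_bytes / reduction_factor


# Hypothetical model: 32 layers, 8 KV heads, head dimension 128, 8192-token context.
baseline = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=8192)
compressed = compressed_kv_cache_bytes(baseline)

gib = 1024 ** 3
print(f"baseline:   {baseline / gib:.2f} GiB")    # 1.00 GiB
print(f"compressed: {compressed / gib:.2f} GiB")  # about 0.23 GiB
```

Under these assumptions, a 1 GiB cache per sequence shrinks to roughly a quarter of that size, which is the kind of saving that matters on memory-constrained devices.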
VerTQ compresses KV data and accelerates attention operations, including Flash Attention and online SoftMax, directly on-chip without decompressing data. This approach reduces memory bandwidth requirements and increases inference efficiency, especially for applications where memory is limited.
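For readers unfamiliar with online SoftMax, the following Python sketch shows the general idea as used in Flash Attention style kernels: attention for one query is computed in a single streaming pass over keys and values, keeping only a running maximum, a running normalizer, and a running weighted sum. This is a generic illustration, not Verkor's implementation; it does not model TurboQuant compression or any hardware detail, and the function names are hypothetical.

```python
import math


def streaming_attention(query, keys, values):
    """Single-query attention computed in one pass with online softmax.

    Assumes finite scores and at least one key/value pair.
    """
    scale = 1.0 / math.sqrt(len(query))
    running_max = float("-inf")   # largest score seen so far
    running_sum = 0.0             # softmax denominator, relative to running_max
    acc = [0.0] * len(values[0])  # unnormalized weighted sum of values

    for key, value in zip(keys, values):
        score = scale * sum(q * k for q, k in zip(query, key))
        new_max = max(running_max, score)
        # Rescale earlier contributions so everything is relative to new_max.
        correction = math.exp(running_max - new_max)
        weight = math.exp(score - new_max)
        running_sum = running_sum * correction + weight
        acc = [a * correction + weight * v for a, v in zip(acc, value)]
        running_max = new_max

    return [a / running_sum for a in acc]


def reference_attention(query, keys, values):
    """Conventional two-pass softmax attention, used only as a cross-check."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]


query = [0.5, -1.0, 2.0, 0.0]
keys = [[1.0, 0.0, 0.5, 2.0], [0.0, 1.0, -1.0, 0.5], [2.0, 2.0, 1.0, -1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

streamed = streaming_attention(query, keys, values)
expected = reference_attention(query, keys, values)
```

Because the streaming version never stores the full list of scores, each key/value pair can be consumed as it arrives, which is why this formulation suits on-chip processing with limited memory bandwidth.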
The design was produced autonomously by Verkor's Conductor 2.0 AI platform using standard electronic design automation tools. The process took about 80 hours from algorithm to a verified FPGA implementation. Mapped to a Xilinx FPGA running at 125 MHz, VerTQ supports between one and thirty-two attention decoders.
VerTQ is designed for edge AI systems such as autonomous vehicles, drones, and robots, where compact design, low power use, and cost efficiency are key. The VerTQ customer package includes specifications, verification IP, testbenches, and FPGA images, and the product is available now.