Apple's New Video Language Model Outperforms Larger Models

August 23, 2025
Apple has introduced SlowFast-LLaVA-1.5, a family of video large language models built for long-form video understanding that outperforms larger models on several video benchmarks while using fewer parameters.

Apple has introduced SlowFast-LLaVA-1.5, a new family of video large language models (LLMs) designed for efficient long-form video understanding, according to a study published on arXiv. The models use a two-stream SlowFast mechanism, which lets them cover long-range temporal context while keeping the computational cost low.
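
The article does not spell out how the two streams are built, so the following is only a rough sketch of the general SlowFast idea rather than Apple's implementation: a "slow" stream keeps a few frames at high spatial detail, while a "fast" stream covers every sampled frame with only a handful of tokens each. The function name, parameters, and all numbers below are hypothetical and chosen purely for illustration.

```python
def slowfast_plan(num_frames, slow_frames, slow_tokens_per_frame, fast_tokens_per_frame):
    """Plan a hypothetical two-stream token layout for a video.

    Returns (slow_indices, fast_indices, total_tokens).
    This is an illustrative sketch, not the SlowFast-LLaVA-1.5 code.
    """
    if not 0 < slow_frames <= num_frames:
        raise ValueError("slow_frames must be between 1 and num_frames")

    # Slow stream: a few uniformly spaced frames, many tokens per frame.
    step = num_frames / slow_frames
    slow_indices = [int(i * step) for i in range(slow_frames)]

    # Fast stream: every frame, but only a few (heavily pooled) tokens per frame.
    fast_indices = list(range(num_frames))

    total_tokens = (
        len(slow_indices) * slow_tokens_per_frame
        + len(fast_indices) * fast_tokens_per_frame
    )
    return slow_indices, fast_indices, total_tokens


# Assumed example values: 64 frames, 8 slow frames, 100 vs. 4 tokens per frame.
slow_idx, fast_idx, total = slowfast_plan(64, 8, 100, 4)
single_stream_total = 64 * 100  # every frame kept at full detail
print(slow_idx)                    # [0, 8, 16, 24, 32, 40, 48, 56]
print(total, single_stream_total)  # 1056 6400
```

Under these assumed numbers, the two-stream layout feeds the language model about a sixth of the tokens that a single full-detail stream would need, which is the kind of saving that makes long videos fit without a very long context window.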

The SlowFast-LLaVA-1.5 family ranges from 1 billion to 7 billion parameters and is trained with a streamlined pipeline that uses only publicly available datasets. Despite their size, the models outperform much larger models on a range of video tasks and achieve state-of-the-art results on benchmarks such as LongVideoBench and MLVU.

Apple's model addresses several limitations of existing video LLMs, such as reliance on long context windows and complex training pipelines. By integrating video perception into pre-trained LLMs, SlowFast-LLaVA-1.5 not only excels in video tasks but also performs well on image tasks, including benchmarks for knowledge and math reasoning. The model is open-source and available on platforms like GitHub and Hugging Face.

