AMD Unveils Instella-VL-1B Vision Language Model

AMD has introduced its first vision language model, Instella-VL-1B, trained on AMD GPUs to deliver competitive performance in visual language tasks.

AMD has announced its first vision language model, Instella-VL-1B, trained on the company's Instinct MI300X GPUs. The model belongs to the Instella family of language models that AMD introduced in March 2025. Instella-VL-1B is a multimodal model with 1.5 billion parameters, combining a 300-million-parameter vision encoder with a 1.2-billion-parameter language model.

The model was developed using datasets such as LLaVA, Cambrian, and PixMo, supplemented with document-related datasets like M-Paper and DocStruct4M. Using a new pre-training dataset of 7 million examples and a supervised fine-tuning dataset of 6 million examples, Instella-VL-1B outperforms similarly sized open-source models on general visual language tasks and OCR-related benchmarks, according to AMD.

AMD has made Instella-VL-1B fully open source, releasing not only the model weights but also detailed training configurations, datasets, and code. The release underscores AMD's commitment to advancing open-source multimodal AI.
