Cohere Unveils Aya Vision: A Multilingual, Multimodal AI Model

Cohere For AI has launched Aya Vision, a new open-weight AI model that integrates language and vision capabilities across 23 languages. The model is available for non-commercial use on platforms like Hugging Face and WhatsApp.

Cohere For AI has introduced Aya Vision, an open-weight AI model designed to strengthen multilingual and multimodal capabilities, the company announced on its website. Aya Vision supports 23 languages and aims to bridge the gap in AI performance across languages, particularly on tasks involving both text and images.

Aya Vision comes in two versions, with 8 billion and 32 billion parameters, and can perform tasks such as image captioning, visual question answering, and text translation. The model is available for non-commercial use on platforms like Hugging Face and WhatsApp, letting researchers and developers explore its capabilities without commercial constraints.
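For readers who want to try the model themselves, the sketch below shows one way to query it through Hugging Face's transformers library, using the standard image-text-to-text workflow. The model id "CohereForAI/aya-vision-8b", the example image URL, and the Spanish prompt are assumptions for illustration rather than details from this announcement; a recent transformers release is required, and the model card on Hugging Face is the authoritative reference for exact usage.

```python
# Minimal sketch: asking Aya Vision a visual question via transformers.
# The model id and message format below are assumptions based on the
# generic image-text-to-text workflow; check the model card for specifics.
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "CohereForAI/aya-vision-8b"  # assumed repo name; see model card
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Visual question answering in one of the 23 supported languages (Spanish).
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/photo.jpg"},
            {"type": "text", "text": "¿Qué se ve en esta imagen?"},
        ],
    }
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(
    processor.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
)
```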

Cohere emphasizes the model's efficiency, noting that Aya Vision outperforms larger models on a range of benchmarks. The company attributes this to innovations such as synthetic annotations and multimodal model merging, which improve the model's understanding and performance across languages. Aya Vision's release marks a significant step toward making advanced AI technologies more accessible to the global research community.
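The announcement does not detail Cohere's merging recipe, but the toy sketch below illustrates the general idea behind weight-space model merging: parameters of compatible checkpoints are interpolated rather than retrained. Everything here (the merge_state_dicts helper, the interpolation weight alpha, the tiny linear layers) is illustrative only and is not Cohere's actual method.

```python
# Toy illustration of weight-space model merging: linearly interpolate
# the parameters of two checkpoints with identical architectures.
# Generic technique sketch, not Cohere's recipe for Aya Vision.
import torch

def merge_state_dicts(sd_a, sd_b, alpha=0.5):
    """Return alpha * A + (1 - alpha) * B for two compatible state dicts."""
    return {k: alpha * sd_a[k] + (1 - alpha) * sd_b[k] for k in sd_a}

# Usage sketch with two small, identically shaped models.
a = torch.nn.Linear(4, 2)
b = torch.nn.Linear(4, 2)
merged = torch.nn.Linear(4, 2)
merged.load_state_dict(merge_state_dicts(a.state_dict(), b.state_dict()))
```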
