Ai2 Unveils OLMo 2 32B, Surpassing GPT-3.5 and GPT-4o Mini
The Allen Institute for AI (Ai2) has announced on its website the release of OLMo 2 32B, a fully open model that outperforms GPT-3.5-Turbo and GPT-4o mini on a range of benchmarks. The largest model in the OLMo 2 family, it was trained on 6 trillion tokens and post-trained with the Tulu 3.1 recipe.
OLMo 2 32B is notable for its efficiency, requiring only about one third of the training compute of comparable models such as Qwen 2.5 32B. It joins a series that includes 7B and 13B parameter versions, all of which can be fine-tuned on a single H100 GPU node. Ai2 has made all data, code, and model weights freely available, fostering open scientific research and development.
Training ran on Google Cloud Engine's Augusta cluster, which consists of 160 nodes equipped with H100 GPUs. Ai2's OLMo-core framework, designed for efficiency on modern hardware, played a crucial role in the model's development. OLMo 2 32B is already integrated into Hugging Face's Transformers library, giving researchers ready-made tools to explore and customize the model.