
OpenEuroLLM: Europe's Ambitious Open Source Language Model Initiative
OpenEuroLLM, a new initiative to develop open-source large language models (LLMs) for all European Union languages, has been launched as part of Europe's digital sovereignty roadmap, according to TechCrunch. The project aims to create models that are not only highly accurate but also transparent and mindful of the region's linguistic and cultural diversity.
The project is a collaborative effort involving around 20 organizations, co-led by Jan Hajič, a computational linguist from Charles University in Prague, and Peter Sarlin, CEO and co-founder of Finnish AI lab Silo AI. The initiative will cover the current 24 official EU languages, as well as languages from countries negotiating EU entry, such as Albania.
OpenEuroLLM is part of a broader European push for digital sovereignty, which also includes initiatives such as developing local cloud infrastructure to keep EU data within the region. The project has a budget of €37.4 million, of which €20 million comes from the EU's Digital Europe Programme, and it will use EuroHPC supercomputer centers for computational resources.
The project faces challenges stemming from the large number of participating organizations and the complexity of the task. However, it builds on the High Performance Language Technologies (HPLT) project, which has been developing datasets and models using high-performance computing since 2022. The first versions of the LLMs are expected by mid-2026, with final iterations by 2028.