OpenGPT-X has presented "Teuken-7B", a language model trained in all 24 official EU languages. With seven billion parameters and open source access, it aims to promote European AI research and improve data protection.

You've probably heard of ChatGPT, Gemini, Claude and co. These large language models are currently revolutionising the way we interact with computers. The majority of these AI language models originate from the USA. The OpenGPT-X research project has set itself the task of developing a European and, above all, more data protection-friendly alternative.

OpenGPT-X recently published a new open source AI language model called "Teuken-7B", as the Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS announced. It is now available on Hugging Face for download.

Training in 24 official EU languages

This model is special because it has been trained from the ground up in the 24 official languages of the European Union. This means that "Teuken-7B" can handle European languages particularly well. Particular emphasis was placed on the representation of non-English languages in order to set it apart from the models developed in the USA and China.

"Teuken-7B" has seven billion parameters. A model of this size can understand and generate complex texts, which makes it particularly useful for text processing, translation and answering questions. The developers of OpenGPT-X have emphasised that "Teuken-7B" is specifically designed to be used in various areas of AI research and application. The language model was trained on the "JUWELS" supercomputer at the Jülich Research Centre in Germany.
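
To give a rough sense of what seven billion parameters means in practice, the following sketch estimates how much memory is needed just to store the model's weights in common numeric formats. It is a back-of-envelope calculation only: the helper name `weight_memory_gb` and the selection of formats are illustrative assumptions, not something published by OpenGPT-X, and the estimate ignores the additional memory needed while the model is actually running.

```python
# Back-of-envelope estimate of the memory needed to hold model weights.
# The byte sizes are the standard widths of these numeric formats; the
# helper name and the choice of formats are illustrative assumptions.

BYTES_PER_PARAM = {
    "float32": 4.0,   # full precision
    "bfloat16": 2.0,  # half precision, common for inference
    "int8": 1.0,      # 8-bit quantisation
    "int4": 0.5,      # 4-bit quantisation
}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Return weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    num_params = 7_000_000_000  # "seven billion parameters", as stated above
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>8}: {weight_memory_gb(num_params, dtype):.1f} GB")
```

At half precision this works out to roughly 14 GB for the weights alone, which is why models of this size can still be run on a single well-equipped graphics card.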

One feature of "Teuken-7B" is its open source nature. This means that developers and researchers worldwide have free access to the model and can adapt and further develop it according to their needs. OpenGPT-X hopes that this will further drive innovation in AI research. According to the project, the open source approach also makes it easier to use and develop the model in an ethical and responsible way. The research project has already announced that it will continue to work on improving and expanding "Teuken-7B".

Who is behind OpenGPT-X?

The OpenGPT-X research and development project was founded in early 2022. The aim is to develop an AI language model based on European values and Europe's linguistic diversity. It is being led by the two German Fraunhofer Institutes for Intelligent Analysis and Information Systems (IAIS) and for Integrated Circuits (IIS). The Technical University of Dresden, the Jülich Research Centre and companies such as Aleph Alpha and Ionos are also involved.

Mistral AI: Another European pioneer in the field of AI

It is worth noting that "Teuken-7B" is not the only major AI language model from Europe. The French software company Mistral AI has developed several of its own open source language models. These include "Mistral 7B", "Mixtral 8x7B" and "Mixtral 8x22B". The company was founded in April 2023 by researchers Arthur Mensch, Timothée Lacroix and Guillaume Lample, who previously worked at Meta and Google DeepMind.