PLLuM: Polish Large Language Model – Artificial Intelligence

The Polish technology landscape has been enriched with a new, advanced language model – PLLuM (Polish Large Language Model). This open project, initiated by a consortium of six leading Polish scientific institutions, aims to support public administration, businesses, and academic communities in processing and generating Polish-language texts.

PLLuM: A New Era of Polish Artificial Intelligence

PLLuM was officially presented on February 24, 2025, by the Ministry of Digital Affairs, with its deployment announced on the gov.pl portal. The model family is flexible and scalable, with sizes ranging from 8 to 70 billion parameters, and is designed for precise text generation in Polish. Its foundation is a text corpus of approximately 150 billion tokens, selected and cleaned for linguistic accuracy and thematic diversity.
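To give a sense of what the 8 to 70 billion parameter range means in practice, the sketch below estimates how much memory the model weights alone would occupy. It is a back-of-the-envelope calculation, not a published hardware requirement for PLLuM: the storage formats and bytes-per-parameter values are assumptions, the function name `weight_memory_gb` is hypothetical, and activations, the KV cache, and runtime overhead are ignored.

```python
# Rough weight-only memory estimate for a language model.
# Assumption: each parameter is stored in one of the formats below.
# Activations, KV cache, and framework overhead are NOT included.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * BYTES_PER_PARAM[precision]
    return total_bytes / 1e9


if __name__ == "__main__":
    # 8 and 70 are the smallest and largest sizes mentioned in the article.
    for size in (8, 70):
        for precision in BYTES_PER_PARAM:
            gb = weight_memory_gb(size, precision)
            print(f"{size}B parameters @ {precision}: ~{gb:.0f} GB")
```

Under these assumptions, the smallest variant fits on a single consumer-grade GPU in 16-bit precision, while the largest needs either multiple accelerators or aggressive quantization.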

The PLLuM project is the result of a collaboration between the following institutions:

  • Wrocław University of Science and Technology (project leader) – responsible for developing algorithms for modern language models.
  • NASK National Research Institute
  • Institute of Computer Science, Polish Academy of Sciences – conducting research on the ethical aspects of AI development in Poland.
  • National Information Processing Institute
  • University of Łódź
  • Institute of Slavic Studies, Polish Academy of Sciences

The project’s goal is to create a tool that not only meets the needs of public administration but is also accessible to a wide range of users, fostering innovation in the private sector.

BielikAI: A Pioneer in Polish Language Models

Another significant Polish language model is BielikAI, developed by the SpeakLeash Foundation in collaboration with the Academic Computer Center Cyfronet AGH.

The first version of Bielik, based on the Mistral-7B architecture, was introduced in 2024 and featured 7 billion parameters. It was trained on a Polish language corpus comprising over 70 billion tokens.

In August 2024, the second version, Bielik v2, was released with improved natural language processing capabilities. The model was expanded to 11 billion parameters and supports a context window of up to 32,768 tokens, allowing it to process longer and more complex texts. This makes Bielik v2 one of the most capable language models developed in Poland to date.

Comparison of Key Features of Polish Language Models

Below is a comparison of the most important features and applications of PLLuM and BielikAI:

| Feature/Application | PLLuM | BielikAI |
|---|---|---|
| Number of Parameters | 8–70 billion | 11 billion |
| Training Data Scope | Approx. 150 billion tokens | Over 70 billion tokens |
| Main Applications | Public administration, business, science | Content generation, text analysis, and AI applications (e.g., ChatGPT-like tools) |
| Availability | Open license, available to all | Open-source, available on Hugging Face |
| Unique Features | Scalability, adaptation to Polish language and administrative terminology | Wide context window (32,768 tokens), ability to process longer and more complex texts |

Both models represent a significant step forward in the development of Polish artificial intelligence, offering advanced natural language processing tools and supporting various sectors of the economy and administration.

The Development of Artificial Intelligence in Poland

Polish language models like PLLuM and BielikAI mark a breakthrough in AI development for the Polish language. PLLuM, developed by a consortium of leading research institutions, offers a range of model sizes and precise Polish text generation, supporting public administration, businesses, and academia. BielikAI, created by the SpeakLeash Foundation and Cyfronet AGH, focuses on content generation and text analysis and is openly available to research and technology communities.

While these models differ in terms of parameters, training data scope, and applications, they share a common goal – developing innovative language tools that improve communication, process automation, and access to advanced AI technologies in Polish. The rise of such initiatives demonstrates Poland’s active participation in the global AI race, creating its own cutting-edge solutions tailored to unique linguistic and cultural needs.

Author of the article: Karolina, Account Manager
