In line with the Estonian government's action plan to promote the introduction of artificial intelligence (AI) in the public sector, a base component for AI-based applications has been made available for the first time in the Estonian state's source code repository: TEXTA Toolkit, an AI-based text analysis tool.
An open source code repository powered by the Estonian government
The use of open source is one of the goals of the Estonian National Action Plan to take digital government to the next level. In spring 2019, Estonia launched the first version of a government repository platform, koodivaramu.eesti.ee, where open source software solutions developed for the government are made public and freely accessible. The long-term goal of the platform is to build community-based solutions for public administrations, and the code repository will be a cornerstone of that project. To that end, the government plans to make all source code of these solutions open and freely accessible, unless restrictions are required for security reasons.
The government is combining its wider digital state and information society development goals with the use of artificial intelligence. To this end, Estonia aims to test, commission, and make available base components of AI-based standard applications in order to speed up the implementation and uptake of AI-based solutions in the public and private sectors.
A repository with the latest technologies: TEXTA, the first base component for AI-based solutions
The Estonian repository Koodivaramu recently welcomed a new addition: the Terminology EXtraction and Text Analytics (TEXTA) Toolkit, the first base component for AI-based solutions. TEXTA Toolkit is a set of tools developed by TEXTA OÜ to carry out text analytics tasks. The toolkit allows users to analyse data collected from vast and/or complex free text datasets. Its main components are a searcher application, a classification tool, a data extractor and a terminology analysis tool. The source code of TEXTA Toolkit is on GitHub. To develop the toolkit, TEXTA OÜ received grants from several public institutions, as well as from the European Commission Horizon 2020 Programme, the startup grant of the Enterprise Estonia association and the Estonian Language Technology Programme.
To match the objectives of the Estonian government, the repository was set up as a collaborative experiment between the public and private sectors. The development of TEXTA Toolkit was carried out through a technology sandbox. In a press release from the Ministry of Economic Affairs and Communications, the Estonian Government Chief Information Officer (CIO) and Deputy Secretary-General for IT and Telecom, Siim Sikkut, stated that “with the Sandbox Framework, we are opening up the opportunity of cooperative development whereby a company, university or individual developer can improve existing solutions or even create new solution and have a State as client, and the State receives the developed solution for free use. With the reference of Estonia as an advanced digital country, the creator of the solution can then go on to resell it to a wider market”.
Siim Sikkut also added that “relying on common solutions in places where there's no point in reinventing the wheel has been one of the mainstays of the Estonian digital state.” In recent years, several public administration bodies of the Estonian government have used TEXTA Toolkit. For instance, the Estonian Ministry of Education and Research used the software to audit its document management system, analysing more than 800,000 documents to determine which were not suitable to be publicly accessible. Another example is the Estonian judicial system, which used TEXTA Toolkit as an analysis engine to process the numerous documents in the registry of judicial decisions and identify the outcomes of the lawsuits.