Graph4Jan - Slovakian Linked Open Government Data 18/07

Published on: 07/09/2023 Last update: 22/07/2018 News Archived

1) Hackathon AllForJan

In April 2018, FIIT STU organized the AllForJan hackathon in reaction to the murder of the young investigative reporter Ján Kuciak and his fiancée Martina Kušnírová. One of the main topics of Ján's last, unfinished story was suspicious EU funding in the agricultural sector (Agricultural Paying Agency). Because data on agricultural public funding is published in the most minimal way and no machine-readable format is available, it is very difficult, if not impossible, to explore this data with existing applications. The main objective of the hackathon was therefore to provide tools for searching and analyzing agricultural public funding data.

I attended the AllForJan hackathon, but unfortunately I did not manage to develop a solution at the time; I was only able to present a linked data representation of several agricultural entities. Now, several months later, we are pleased to announce LODSlovakia 18/07 (code-named Graph4Jan), Slovakian linked open government data with a focus on agricultural funding.

2) Graph4Jan

2.1) Problem

Many registers - poorly related data

The state of Slovakian public data is far from "trend-setting". For example, even though the key large reference-register projects were completed recently (the Legal Subject Register, the Address Register and the Physical Person Register), these registers do not reference each other directly. For instance, the headquarters of a legal entity is stored separately in the Legal Subject Register, without any reference to the corresponding address record (reference data) in the Address Register. Moreover, these registers contain unresolved duplicates (the same legal subject recorded as several different ones) and duplicated identifiers (different legal subjects sharing the same identifier).

Agro Porúbka example

The company Agro Porúbka is not linked to the murder of Ján and Martina. Rather, it is a well-known example of many systematic failures in the handling of public resources. I will therefore also use this company to demonstrate the poor availability of public data.

So when I want to look at Agro Porúbka's public funding, I visit apa.sk (the Slovak Agricultural Paying Agency) and look up Agro Porúbka's requests for public funding. I get:

Agro Porúbka

It seems that Agro Porúbka submitted 3 requests, or perhaps 3 types of request? Either way, the information provided is very sparse. Below this form there is an additional form for searching for the related soil block, which is the object of the agricultural funding. But you must already know the code of the soil block, since it is a mandatory field, so you have to find it somewhere else. If you want to visualize the soil block, you have to use yet another separate information system, which uses a coordinate system dating from the former Czechoslovakia.

Soil Block

Since I want to learn more about the Agro Porúbka company, I visit the official Business Register, where I get some basic properties of the company, such as its name, headquarters address, identification number and others:

Agro Porúbka

If I want information about the headquarters address of Agro Porúbka, I can get only a simple record from the separate address register, as follows:

Address

So the data about a single entity (a company) is scattered across a large number of registers without referential integrity. It is not possible to work with all this data as one coherent whole.
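
To make the missing referential integrity concrete, here is a minimal Python sketch. The records, field names and values are invented for illustration and do not reproduce the real register schemas. They only show why a headquarters stored as free text cannot be joined reliably to an address record.

```python
# Hypothetical records: field names and values are invented for illustration
# and do not reproduce the real register schemas.
business_register = {
    "ico": "00000000",                             # made-up identification number
    "name": "Example Farm s.r.o.",
    "headquarters": "Hlavna 1, Example Village",   # free text, no address ID
}

address_register = [
    {"address_id": "A-1", "street": "Hlavná", "number": "1", "municipality": "Example Village"},
    {"address_id": "A-2", "street": "Hlavná", "number": "10", "municipality": "Example Village"},
]


def find_address_by_text(headquarters: str, addresses: list[dict]) -> list[dict]:
    """Naive join: compare the free-text headquarters with a rebuilt address string."""
    matches = []
    for record in addresses:
        rebuilt = f"{record['street']} {record['number']}, {record['municipality']}"
        if rebuilt == headquarters:
            matches.append(record)
    return matches


# The spelling differs ("Hlavna" vs "Hlavná"), so the exact-text join finds nothing.
matches = find_address_by_text(business_register["headquarters"], address_register)
print(matches)
```

If the company record carried the address identifier (or, in linked data terms, the URI of the address), the lookup would be an exact key match and spelling variants would not matter.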

2.2) Solution

LODSlovakia.EU - Slovakian Linked Open Government Data

Since 2014, our small group of linked data enthusiasts (lodslovakia.eu) has been proposing semantic web standards to the PS1 data standardization working group, so that Slovakian government data can be worked with as linked data. Eventually, several linked data standards for public data were approved; today they are packaged as the Slovakian Semantic Interoperability Framework. However, our most important product is the published open data set named LOD Slovakia (Linked Open Data), or LOD-SK. LOD-SK is a linked data representation of selected data from the Slovakian base registries. It has been published periodically since 2016.

Since Ján Kuciak's last articles concerned Slovakian agricultural public funding, and that data is published very poorly, the agricultural funding datasets were the latest to be processed and added to LOD Slovakia. The current version of LOD Slovakia is code-named Graph4Jan.

http://lodslovakia.eu

LOD Slovakia EU

Agro Porúbka as linked data

Linked Government Data

Data:

LODSK Data

Slovpedia.com

http://lod.slovpedia.com/index.html

Slovpedia is the first search engine for Slovakian linked government data. Slovpedia loads the full LODSlovakia dataset and provides multi-view search and exploratory capabilities over linked data. Slovpedia integrates the Tripleskop visual analytical tool, so it is possible to explore government data at the graph level.

Slovpedia

When I start searching for Agro Porúbka, the system adds semantics and a URI to each suggestion:

Agroporubka

After I choose the desired company, Agro Porúbka, the system shows me all of the company's existing properties: its headquarters address, all its requests for and received public funding, the data from the Business Register, and data from every register that holds anything about the company.

http://lod.slovpedia.com/#/https://data.gov.sk/id/legal-subject/36574058

Agro Porúbka
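
The Slovpedia link above appears to carry the entity URI in its fragment, after `#/`. The following Python sketch extracts that URI and rebuilds the corresponding Tripleskop link. The URL patterns are assumptions inferred from the links quoted in this article, not a documented Slovpedia API, and the function names are hypothetical.

```python
from urllib.parse import urlsplit

# Link quoted in this article.
slovpedia_url = "http://lod.slovpedia.com/#/https://data.gov.sk/id/legal-subject/36574058"


def entity_uri_from_slovpedia(url: str) -> str:
    """Return the entity URI carried in the fragment (assumed pattern: '#/' + URI)."""
    fragment = urlsplit(url).fragment   # everything after the first '#'
    return fragment.lstrip("/")


def tripleskop_url(entity_uri: str) -> str:
    """Build a Tripleskop link (assumed pattern: '?uri=' + URI, left unencoded)."""
    return "http://lod.slovpedia.com/tripleskop/?uri=" + entity_uri


entity_uri = entity_uri_from_slovpedia(slovpedia_url)
local_id = entity_uri.rsplit("/", 1)[-1]   # last path segment of the entity URI

print(entity_uri)
print(local_id)
print(tripleskop_url(entity_uri))
```

The point of the sketch is that the same URI identifies the company in every view, so switching between views only changes the address around it.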

Now I change the view to triples with Tripleskop. The following graph represents Agro Porúbka as linked data.

http://lod.slovpedia.com/tripleskop/?uri=https://data.gov.sk/id/legal-subject/36574058

Agro Porúbka triples

Tripleskop offers many RDF graph functions, such as exploring the graph or constructing data queries visually. It is also the native way to work with linked data, but a detailed explanation of this great technology is beyond the scope of this article. So I switch back to the classic view and select one of Agro Porúbka's public fundings. After opening it, I can see several properties of the request, but I am interested in the funding object, a soil block in this case:

http://lod.slovpedia.com/#/https://lodslovakia.eu/id/agro/soil-block-part/-1311533630-0202-1

Soil block part

When I view the soil block, I can see all its properties, including all related requests for public funding, so I can see all agricultural activities related to this piece of agricultural land.
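
Listing every request related to a soil block amounts to following links backwards in the graph: find all triples whose object is the soil block. Below is a minimal sketch in plain Python. The soil block URI is the one from the link above, while the request URIs and the `FUNDING_OBJECT` property are hypothetical placeholders, since this article does not name the actual property.

```python
# URI of the soil block part, taken from the link in this article.
SOIL_BLOCK = "https://lodslovakia.eu/id/agro/soil-block-part/-1311533630-0202-1"

# Hypothetical property and request URIs, used only for illustration.
FUNDING_OBJECT = "https://example.org/def/fundingObject"

triples = [
    ("https://example.org/id/funding-request/1", FUNDING_OBJECT, SOIL_BLOCK),
    ("https://example.org/id/funding-request/2", FUNDING_OBJECT, SOIL_BLOCK),
    ("https://example.org/id/funding-request/3", FUNDING_OBJECT,
     "https://example.org/id/soil-block-part/other"),
]


def subjects_pointing_to(graph, predicate, obj):
    """Reverse lookup: all subjects linked to `obj` through `predicate`."""
    return [s for s, p, o in graph if p == predicate and o == obj]


related_requests = subjects_pointing_to(triples, FUNDING_OBJECT, SOIL_BLOCK)
print(related_requests)
```

Because the link is stored once, on the request, the same triple serves both directions: from a request to its soil block and from a soil block to its requests.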

Public Funding Vocabulary

Finally, to increase not only transparency but also the machine processability of funding data, we have developed the Public Funding Vocabulary. It defines concepts such as pfund:PublicFunding, pfund:receiver, pfund:requestor, pfund:paidValue, pfund:CallForFundings and many others, with which it is possible to describe arbitrary data about public funding. The vocabulary is being checked against existing linked data ontologies for public funding. Of course, Slovpedia can be used to dereference the entities of the vocabulary. The following example shows the properties of the Public Funding class.

http://lod.slovpedia.com/#/https://data.gov.sk/def/ontology/public-funding/Funding

Public Funding Vocabulary
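
To illustrate how the vocabulary terms might be used, the sketch below writes one funding record as N-Triples using only the Python standard library. The namespace behind the `pfund:` prefix is an assumption inferred from the vocabulary URL above, and all `example.org` URIs and the amount are placeholders, not real data.

```python
# Assumed namespace for the pfund: prefix, inferred from the vocabulary URL in this article.
PFUND = "https://data.gov.sk/def/ontology/public-funding/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def decimal_literal(value: str) -> str:
    """Format a typed xsd:decimal literal for N-Triples."""
    return f'"{value}"^^{iri(XSD_DECIMAL)}'


def triple(subject: str, predicate: str, obj: str) -> str:
    """Join three already formatted terms into one N-Triples statement."""
    return f"{subject} {predicate} {obj} ."


# Hypothetical instance data: every example.org IRI and the amount are placeholders.
funding = iri("https://example.org/id/funding/1")
company = iri("https://example.org/id/legal-subject/A")
statements = [
    triple(funding, iri(RDF_TYPE), iri(PFUND + "PublicFunding")),
    triple(funding, iri(PFUND + "requestor"), company),
    triple(funding, iri(PFUND + "receiver"), company),
    triple(funding, iri(PFUND + "paidValue"), decimal_literal("1000.00")),
]

print("\n".join(statements))
```

In practice an RDF library would handle serialization and escaping; the hand-written version is used here only to keep the structure of the statements visible.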

Hence it is possible to define PFV-AP-SK, the Public Funding Application Profile for Slovakia, which defines the scope and structure of published data representing public funding, including which classes and properties are mandatory, recommended and optional. This profile is currently being built and will be published soon for public review.
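
As a rough idea of what checking data against such a profile could look like, here is a minimal Python sketch. Since PFV-AP-SK has not been published yet, the requirement levels assigned below are purely hypothetical, as are the function name and the sample record.

```python
# Hypothetical profile: the real PFV-AP-SK is not published yet, so these
# requirement levels are placeholders chosen only for illustration.
PROFILE = {
    "mandatory": {"pfund:receiver"},
    "recommended": {"pfund:paidValue"},
    "optional": {"pfund:requestor"},
}


def check_against_profile(record: dict, profile: dict) -> dict:
    """Report missing mandatory and recommended properties, plus unknown ones."""
    present = set(record)
    known = profile["mandatory"] | profile["recommended"] | profile["optional"]
    return {
        "errors": sorted(profile["mandatory"] - present),      # must be fixed
        "warnings": sorted(profile["recommended"] - present),  # should be added
        "unknown": sorted(present - known),                    # not in the profile
    }


# Placeholder record: the example.org URIs do not refer to real entities.
record = {
    "pfund:receiver": "https://example.org/id/legal-subject/A",
    "pfund:requestor": "https://example.org/id/legal-subject/A",
}

report = check_against_profile(record, PROFILE)
print(report)
```

A published profile would more likely be expressed in a constraint language and validated with dedicated tooling; the dictionary form is only meant to show the three requirement levels at work.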