Open source tool allows sharing of public sector datasets
Open data portals in Italy, Sweden and Belgium are working on validators for the EC’s DCAT-AP. Data portals that use the World Wide Web Consortium’s Data Catalog Vocabulary make it easier for others to search and use their datasets, including across borders.
By methodically listing where datasets can be downloaded and which formats are available, W3C’s DCAT makes it easier for others to discover these data collections. Instead of stockpiling data, DCAT-enabled repositories can be federated, with search results pointing to data available on other websites.
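To make the idea concrete, here is a minimal sketch of a dataset description that uses DCAT terms and is serialised as JSON-LD. The dataset title, the example.org URLs and the `build_dataset` helper are hypothetical and serve only as illustration; the `dcat:` and `dct:` property names come from the W3C vocabulary. A real portal would typically carry more properties than shown here.

```python
import json


def build_dataset(title, description, distributions):
    """Return a JSON-LD dict describing one dataset with DCAT terms.

    `distributions` is a list of (download_url, format_label) pairs.
    """
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        # Each distribution says where the data can be fetched and in which format.
        # The format is a plain label here for brevity (a simplification).
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:accessURL": {"@id": url},
                "dct:format": fmt,
            }
            for url, fmt in distributions
        ],
    }


# Hypothetical example data, not taken from any real portal.
example = build_dataset(
    "Air quality measurements",
    "Hourly readings from hypothetical monitoring stations.",
    [
        ("https://data.example.org/air.csv", "CSV"),
        ("https://data.example.org/air.json", "JSON"),
    ],
)
print(json.dumps(example, indent=2))
```

Because the description only points to the download locations, a federating portal can index it without copying the data itself.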
The DCAT-Application Profile for data portals in Europe (DCAT-AP) describes datasets created by European public administrations. Work on the DCAT-AP began in 2013. Initiated by the European Commission’s Directorate General for Communications Networks, Content & Technology (DG Connect), the EU Publications Office and the EC’s ISA Programme, the creation of this specification involved representatives from 16 European Member States.
“DCAT-AP is implemented, or being implemented, in Belgium, Italy, the Netherlands, Spain, Sweden and Switzerland”, says Athanasios Karalopoulos, one of the ISA project officers. “The European Data Portal uses it to harmonise the descriptions of more than 258,000 datasets, aggregated from 67 data portals in 34 countries.”
In February this year, the ISA Programme started a revision of the DCAT-AP. This summer, DG Connect added the DCAT-AP Validator. This software compares the metadata descriptions of datasets to the DCAT-AP. The DCAT-AP Validator can be downloaded from Joinup, the EC’s interoperability solutions platform. The software is available as open source, published under the European Union Public Licence.
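The following is a rough sketch of what comparing a metadata description against a profile can look like. It is not the DCAT-AP Validator's actual implementation. It assumes, for illustration, that a dataset must carry a title and a description and that a distribution must carry an access URL; the `REQUIRED` table, the `find_problems` function and the sample record are hypothetical, and the authoritative list of requirements is the DCAT-AP specification itself.

```python
# Assumed mandatory properties per class; consult the DCAT-AP specification
# for the authoritative list.
REQUIRED = {
    "dcat:Dataset": ["dct:title", "dct:description"],
    "dcat:Distribution": ["dcat:accessURL"],
}


def find_problems(node, path="dataset"):
    """Return a list of messages for required properties that are missing or empty."""
    problems = []
    for prop in REQUIRED.get(node.get("@type"), []):
        if not node.get(prop):
            problems.append(f"{path}: missing {prop}")
    # Check every distribution attached to the dataset as well.
    for i, dist in enumerate(node.get("dcat:distribution", [])):
        problems.extend(find_problems(dist, f"{path}.distribution[{i}]"))
    return problems


# Hypothetical record that lacks a description and a download location.
record = {
    "@type": "dcat:Dataset",
    "dct:title": "Air quality measurements",
    "dcat:distribution": [{"@type": "dcat:Distribution", "dct:format": "CSV"}],
}

problems = find_problems(record)
for message in problems:
    print(message)
```

A check like this only covers the presence of properties; a full validator would also need to verify value types and controlled vocabularies.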
The DCAT-AP Validator is already being reused in Italy. In September the Agenzia per l'Italia Digitale (AGID), the country’s Agency for the Digitalisation of the Public Sector, started customising the software to define and promote Italian datasets. One of AGID’s goals is to propagate dataset information to the EC’s European Open Data Portal.
Customisation and alternatives
“As far as we can see, we are the first European open data catalogue to have created and published a DCAT-AP in JSON-LD”, says Andrea Volpini, founder of Insideout10, the ICT company contracted by AGID to work on its DCAT project. The first version of AGID’s solution is live. “The software is already listing over 10,000 datasets”, Volpini says.
In August, the city of Umeå (Sweden) announced that it is using a customised version of DCAT-AP to publish its open data. “It is hard to know if it is good enough for search spiders”, commented Umeå’s DCAT project manager Thomas Kvist, “so feedback is welcome.”
An alternative to the DCAT-AP Validator is the one made available by the Belgian chapter of the Open Knowledge network. Its DCAT.BE website is supported by the country’s Federal ICT organisation Fedict and the Flanders Region. The validation library is open source under the ISC License.
Those interested in DCAT-AP implementations should register for the online workshop on 28 January. The webinar is organised by the ISA Programme’s Semantic Interoperability Community (SEMIC).