DI4: Input validation

Anonymous (not verified)

Published on: 09/02/2016 Discussion

How to validate input?

Component

Documentation

Comments

Bart HANSSENS Tue, 09/02/2016 - 18:18

I'd assume this means conformity to the DCAT-AP 1.1 spec?

Makx DEKKERS Wed, 10/02/2016 - 18:40

Yes, it's about conformance to DCAT-AP v1.1.

Let's say you have a catalogue that contains descriptions according to a national profile, or a catalogue that is based on some other specification. If you export data in DCAT-AP, you may want to verify that your export conforms to DCAT-AP.

In another scenario, you may run an aggregator and harvest data that is meant to conform to DCAT-AP, but you need to make sure that it actually does conform before loading it into your system.

It needs to be said that validating RDF data is an issue that is still very much under discussion in the RDF community. At W3C there is a group called RDF Data Shapes Working Group that is in the process of specifying the Shapes Constraint Language (SHACL), a language for describing and constraining the contents of RDF graphs.

That may be an interesting activity to follow.

Emidio STANI Thu, 11/02/2016 - 15:13

DG CONNECT has developed a DCAT-AP validator which should comply with the DCAT-AP 1.1 specification. It is released as open source software under the EUPL licence.

In practice, the validator stores the RDF file (which can come from different sources: file upload, URL, direct input) in the connected triple store and validates the resulting graph against a SPARQL query that is the union of the rules coming from the specification.
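To make the idea of "a union of rules applied to a graph" concrete, here is a minimal sketch in plain Python. It is not the validator's code: a set of (subject, predicate, object) tuples stands in for the triple store, small functions stand in for the SPARQL query, and the two rules and the `ex:` data are assumptions chosen for illustration only.

```python
# Simplified stand-in: a set of (subject, predicate, object) tuples plays the
# role of the triple store, and each rule is a function instead of SPARQL.
RDF_TYPE = "rdf:type"

# Hypothetical rule set (class, required property); the real validator
# derives its rules from the specification.
RULES = [
    ("dcat:Dataset", "dct:title"),
    ("dcat:Dataset", "dct:description"),
]


def missing_property(triples, class_name, prop):
    """Return the subjects of the given class that lack the given property."""
    instances = {s for s, p, o in triples if p == RDF_TYPE and o == class_name}
    having = {s for s, p, o in triples if p == prop}
    return sorted(instances - having)


def validate(triples):
    """Apply every rule and return the union of violations."""
    violations = []
    for class_name, prop in RULES:
        for subject in missing_property(triples, class_name, prop):
            violations.append((subject, f"{class_name} is missing {prop}"))
    return violations


# Made-up example data: ex:d2 has a title but no description.
graph = {
    ("ex:d1", RDF_TYPE, "dcat:Dataset"),
    ("ex:d1", "dct:title", "Air quality"),
    ("ex:d1", "dct:description", "Hourly measurements"),
    ("ex:d2", RDF_TYPE, "dcat:Dataset"),
    ("ex:d2", "dct:title", "Untitled draft"),
}

for subject, message in validate(graph):
    print(subject, message)
```

With a real triple store, each rule would more likely be a SPARQL pattern than a Python function, but the shape is the same: one check per rule, with the results combined into a single report.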

Currently the validator doesn't have an API that would allow RDF files to be validated programmatically during the harvesting phase.

Bart HANSSENS Wed, 24/02/2016 - 15:10

A library / command line validator would be nice (at the moment I'm just executing a series of SPARQL Update queries to correct / upgrade DCAT-ish input from various sources to DCAT-AP-ish output).
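As a rough illustration of what such a correct / upgrade step can look like, here is a sketch in plain Python rather than SPARQL Update. The mapping below is hypothetical and is not taken from the queries mentioned above; it simply rewrites two predicates on an in-memory set of triples.

```python
# Hypothetical corrections: old predicate -> predicate expected in the output.
# These two entries are assumptions for illustration only.
PREDICATE_FIXES = {
    "dc:title": "dct:title",
    "dc:description": "dct:description",
}


def upgrade(triples):
    """Return a new set of triples with the known predicate fixes applied."""
    return {(s, PREDICATE_FIXES.get(p, p), o) for s, p, o in triples}


# Made-up input using the older predicate for the title.
source = {
    ("ex:d1", "rdf:type", "dcat:Dataset"),
    ("ex:d1", "dc:title", "Air quality"),
}

for triple in sorted(upgrade(source)):
    print(triple)
```

Triples whose predicate is not in the mapping pass through unchanged, so the step can be run repeatedly on data that has already been corrected.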

Andrea PEREGO Thu, 24/03/2016 - 01:00

An issue worth mentioning is about the "granularity" of the validation.

I don't know if things have changed recently, but I remember that members of DCAT-AP and GeoDCAT-AP were reporting that the validation didn't pass if catalogue metadata were not provided.

This makes sense, e.g., when running the validator on the dump of a catalogue, but not in all cases.

In other words, validators should (also) be able to test metadata about a dataset without requiring that it always comes together with catalogue metadata.
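One way to picture this kind of granularity is a validator where the catalogue-level check is optional. The sketch below is plain Python over an in-memory set of triples. The function name, the `require_catalogue` parameter and the single dataset rule are all assumptions for illustration, not features of the existing validator.

```python
RDF_TYPE = "rdf:type"


def subjects_of_type(triples, class_name):
    """Return the sorted subjects declared as instances of class_name."""
    return sorted(s for s, p, o in triples if p == RDF_TYPE and o == class_name)


def validate(triples, require_catalogue=True):
    """Check datasets, and optionally insist that a catalogue is present."""
    problems = []
    if require_catalogue and not subjects_of_type(triples, "dcat:Catalog"):
        problems.append("no dcat:Catalog found")
    for dataset in subjects_of_type(triples, "dcat:Dataset"):
        has_title = any(s == dataset and p == "dct:title" for s, p, o in triples)
        if not has_title:
            problems.append(f"{dataset} has no dct:title")
    return problems


# Made-up metadata for a single dataset, with no catalogue description.
dataset_only = {
    ("ex:d1", RDF_TYPE, "dcat:Dataset"),
    ("ex:d1", "dct:title", "Air quality"),
}

print(validate(dataset_only))                           # catalogue check applied
print(validate(dataset_only, require_catalogue=False))  # dataset-level check only
```

The same dataset description fails when validated as a full catalogue dump and passes when only dataset-level rules are applied, which is the distinction raised above.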
