[Issue #06] Quality of Datasets

Published on: 29/05/2019 Discussion Archived

As mentioned during the webinar, and also proposed by Difi (via email), we may include metadata for quality of data sets/elements.

The W3C Data Quality Vocabulary (DQV) is the clear candidate for this purpose. The main question is which use cases this quality assessment of the data should cover:

  1. Express that a dataset fits in a quality classification (see example)
  2. Provide quality assessment with quality metrics —e.g., completeness— (see example)
  3. Express the conformance of a dataset's metadata with a standard (see example)
  4. Provenance of datasets (along with the W3C Provenance Ontology)
  5. Others...

DCAT-2 also includes some examples about quality.

This issue is related to [Issue #02] proposing the creation of the boolean property isAuthoritative.

Last update: 26/05/2021

Access to Base Registries

Comments

Jim J. Yang Fri, 12/07/2019 - 14:17

I know this issue is closed (cf. the webinar of 11 July). I just want to be precise about what we agreed upon.

The section "12. Quality information" of DCAT2 is non-normative, and the DCAT2 data model doesn't explicitly include DQV or PROV. Unless DCAT-AP2 is going to do so (which I don't know), we need to include the use of DQV and PROV explicitly in BRegDCAT-AP, so that there is a standardised way of describing quality information for the Base registries across the Member States. I mean not just mentioning them as examples, but explicitly including optional properties in BRegDCAT-AP (or DCAT-AP2), such as dqv:hasQualityMeasurement (range: dqv:QualityMeasurement), dqv:hasQualityAnnotation (range: dqv:QualityAnnotation), …
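
To make the proposal concrete, here is a minimal sketch of what a description using dqv:hasQualityMeasurement could look like. It uses only the Python standard library to print Turtle. The dqv, dcat and xsd namespaces are the published W3C ones; the ex: namespace and the names ex:populationRegister, ex:completenessMeasurement and ex:completenessMetric are placeholders invented for illustration, and nothing here is taken from BRegDCAT-AP itself.

```python
# dcat, dqv and xsd are published W3C namespaces.
# "ex" is a placeholder namespace assumed for this sketch only.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dqv": "http://www.w3.org/ns/dqv#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ex": "http://example.org/",
}


def quality_measurement_turtle(dataset, measurement, metric, value):
    """Return Turtle text linking a dataset to one DQV quality measurement."""
    lines = [f"@prefix {prefix}: <{iri}> ." for prefix, iri in PREFIXES.items()]
    lines += [
        "",
        # The dataset points at its measurement via the proposed optional property.
        f"{dataset} a dcat:Dataset ;",
        f"    dqv:hasQualityMeasurement {measurement} .",
        "",
        # The measurement records what was measured, with which metric, and the result.
        f"{measurement} a dqv:QualityMeasurement ;",
        f"    dqv:computedOn {dataset} ;",
        f"    dqv:isMeasurementOf {metric} ;",
        f'    dqv:value "{value}"^^xsd:decimal .',
    ]
    return "\n".join(lines)


# Hypothetical identifiers and value, chosen only to show the shape of the output.
turtle = quality_measurement_turtle(
    "ex:populationRegister",
    "ex:completenessMeasurement",
    "ex:completenessMetric",
    0.95,
)
print(turtle)
```

A real profile would also need to state cardinalities and whether the metric must be typed as dqv:Metric; those choices are left open here.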

Furthermore, since DQV doesn't define any quality dimensions, I would suggest that EU/ISA2 define a set of quality dimensions as a controlled vocabulary (e.g. by reusing existing (ISO) definitions where applicable), in order to achieve semantic interoperability across the descriptions of the Base registries.
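
The interoperability argument can be illustrated with a small sketch: if a shared list of dimension IRIs existed, any national description could be checked against it. Everything below is assumed for illustration. The IRIs are placeholders under example.org, the dimension names are not taken from any published EU vocabulary, and the function name is hypothetical.

```python
# Hypothetical shared controlled vocabulary of quality dimensions.
# These IRIs are placeholders, not published terms.
SHARED_DIMENSIONS = {
    "http://example.org/dimension/completeness",
    "http://example.org/dimension/accuracy",
}


def unknown_dimensions(metric_to_dimension):
    """Return the metrics whose dimension is not in the shared vocabulary."""
    return {
        metric: dimension
        for metric, dimension in metric_to_dimension.items()
        if dimension not in SHARED_DIMENSIONS
    }


# Two invented national metrics: one reuses a shared dimension,
# the other points at a nationally defined one.
national_metrics = {
    "http://example.org/no/metric/completeness": "http://example.org/dimension/completeness",
    "http://example.org/dk/metric/freshness": "http://example.org/dk/dimension/freshness",
}

mismatches = unknown_dimensions(national_metrics)
print(mismatches)
```

Metrics that appear in the result are the ones that would not be comparable across Member States without an extra mapping step.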

Otherwise, each Member State will have to handle this at the national level by including DQV/PROV as national extensions to DCAT-AP/BRegDCAT-AP, potentially with different definitions of the same quality dimensions. As far as I know, at least Denmark, Czechia and Norway are using DQV and have defined, or are defining, some quality dimensions.