DCAT-AP

15/06/2016

The specification should be transformed so that it becomes a DCAT AP.

 

Reasons:

  • increase interoperability with users outside Germany
  • improve adherence to Linked Data principles
  • simplify the specification

See:

https://joinup.ec.europa.eu/asset/dcat-ap_implementation_guidelines/ass…

Component

Documentation

Category

improvement

Comments

Fri, 17/06/2016 - 07:41

The same note from my side:
I do not know the development history of OGD, but I would propose creating a German profile of (Geo)DCAT-AP, which could also include specific German extensions (while staying interoperable with the (Geo)DCAT-AP foundation). In that case interoperability at the European level would be easier and the metadata model would be based on an international standard.
The same is done by other countries such as Sweden, Norway, Italy, the Netherlands, ...
e.g.:
https://docs.google.com/document/d/17-vEfZXlu9kykcmjXZo1_Z8QKkr7-Prgwd6…
https://docs.google.com/spreadsheets/d/1SPfA6fA9im_6EtxLDBtCMyBpt1q1aXl…

It would then be possible to attach a document providing further usage conventions. We have done this for ISO 19115 in the context of the German spatial data infrastructure (GDI-DE).

Fri, 17/06/2016 - 19:45

Of course we checked DCAT-AP 1.0.1 and DCAT-AP 1.1, but we encountered non-resolvable semantic conflicts, gaps in granularity, missing concepts, non-matching vocabularies and poor information on future governance. In the end, a German DCAT-AP that meets the requirements at the level OGD 2.0 aims for today would contradict, in several respects, what "application profile" means.

The OGD 2.0 draft (Entwurf) as presented here is rather the "best of both worlds" of CKAN and DCAT-AP, reusing about 15 years of German XOEV methodology and technology (https://www.w3.org/2013/share-psi/workshop/berlin/XOEV).

 

 

The OGD 2.0 draft specification outlines several issues marked in yellow and red as "DCAT-AP not mappable".

 

So beside

1:the language issue (DCAT-AP is in English only) and beside

 

2:organisational aspects like no one being able to guarantee content of future IDA / IDABC / ISA / ISA² initiatives (those programmes are defined on a 4-year basis; no one in the EC can tell whether DCAT-AP will remain a focus area of DIGIT in the next decade). See the GovData Kooperationsvereinbarung (in German) for more information on the OGD publisher's current stable legal basis.

and

3: beside the fact that when using DCAT-AP one has to have an eye and follow the governance processes of the underlying standard set (Dublin Core, DCAT, ADMS, DCAT-AP, DCAT-AP Implementation guidelines) 

and

4: the lack of information how to restrict the too large VCARD and FOAF range used by DCAT-AP

and

5: furthermore the wish to be able to control the "functional distance" between the current standard OGD 1.1 and the future OGD 2.0 for those portal today having implemented CKAN / OGD 1.1 

and beside that

6: DCAT is a data catalog and OGD needs to express "documents" and "apps" also

 

there are also severe functional aspects that hindered us from reusing DCAT-AP "in a native way as an application profile" 

 

How would you solve the following conflicts without losing information or giving wrong information in the communication from CKAN to DCAT? (From DCAT to CKAN is even worse, as we step down in granularity)

 

granularity conflict:

7: License at distribution level (DCAT) vs. License at dataset level (CKAN)

 

8: Status at distribution level (DCAT) vs. status at dataset level (CKAN)?

 

Not existing concepts:

9:Accessibility class (Zugaenglichkeit)

 

10: sophisticated concept for roles, other than publisher or creator

 

11: inner media or mimetype for zip-containers

 

12: to express real relationships 

 

Not existing vocabularies:

13: Codelist of German Länder, codelist for Official Municipality Keys AGS

14: checksum algorithms other than the old SHA-1 (see spdx for the non-mature current status!)

15: Codelist of specific licenses, as compiled by Govdata.de

and

16: by the way the not existing possibility to tell the name and version of a codelist at data exchange time only between two communication partners, as suggested by Joinup:

https://joinup.ec.europa.eu/asset/page/practice_aids/code-list-recommendations-version-10

https://joinup.ec.europa.eu/asset/page/practice_aids/code-lists

 

and so on...

 

It is not that we haven't tried or that we are ignorant, as you can see in the chat log of the working group meeting on the DCAT-AP implementation guidelines: https://joinup.ec.europa.eu/sites/default/files/event/attachment/working_group_meeting_on_dcat-ap_implementation_guidelines_28-01-2016_0.docx

 

Significant effort also went into the DCAT-AP implementation guidelines in January 2016 (https://joinup.ec.europa.eu/solution/dcat-application-profile-implementation-guidelines); without it we would have far more than the 16 issues presented here.

 

In short:

In my opinion, DCAT-AP, as a data catalog standard federating the 28 Member State portals and serving the EDP, cannot also be the best-fit standard for an inner-German "dataset, document and app" federation, as scope, scenario and governance are so different.

To conclude, I am convinced that we need two different standards here, and in my view this is neither terrible nor a tragedy as long as they stay "interoperable" and mappable.

Losing information when we go from the German GovData federation using OGD towards the DCAT-AP European data portal "space" might be acceptable, but it is not acceptable for the inner-German federation, where we go from municipality level to regions to the federal level (note that we even have horizontal communication and a few communications "downwards" in the GovData federation).

Fri, 01/07/2016 - 22:25

Thanks a lot for the 16 items and the summary. I think this is a crucial issue and I will attempt to reply in detail. I am convinced that there are no non-resolvable semantic conflicts etc. Gaps in granularity and missing concepts can also be solved or filled.

 

1: Both the DCAT-AP specification text and the identifiers used in it are in English. But that does not prevent creating a DCAT-AP_DE specification document using German for text and terms. Only the technical identifiers (class and property names) need to be in English. Examples are skos:Concept and dct:language. DCAT-AP_NL added a Dutch namespace and properties such as overheid:grondslagCiteertitel, which I think is suboptimal regarding the language but is perfectly compatible with DCAT and DCAT-AP. Code lists implemented using SKOS can be multilingual (e.g. EuroVoc and the European MDR Data Themes). See also this issue: https://joinup.ec.europa.eu/asset/ogd2_0/issue/choice-topics-category/auswahl-der-themen-f%C3%BCr-kategorie

2: True, nobody can guarantee that the European Union will still exist in five years. But DCAT does not depend on the EU. It is a global standard issued by the W3C. Also note that the W3C Working Groups creating Recommendations usually no longer exist after the W3C has issued a Recommendation. That is one reason why transparency of standardisation processes, including long public review processes, is so important. The DCAT-AP 1.1 specification can be used independently of the existence of the EU (I agree that the specification can and should be improved and GovData should continue to be involved in that). There is also no guarantee that funding of GovData will be sufficient in five years - I doubt that it is sufficient now. BTW: the Project Open Data Metadata Schema v1.1 created by the US Government is also based on DCAT.

3: Yes, it makes a lot of sense to track other standards when one is using them. That is something GovData, KoSIT, IT-Planungsrat etc. need to do (Open Data communities also should do that instead of reinventing incompatible wheels again and again). If their funding is inadequate to do that then that is a critical issue. The German constitution was changed to enable the IT-Planungsrat to issue standards which are legally binding for the whole German public administration. That indicates that such standards should be state of the art - especially when there are users outside the public administration and the specification is about Open Government and Open Data.
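
To illustrate point 1 above, here is a minimal Python sketch of a code-list entry that keeps a language-neutral identifier while carrying labels in several languages. The concept URI, the dictionary layout and the helper name `label_for` are assumptions made for this illustration; they are not taken from DCAT-AP, SKOS tooling or any published code list.

```python
# Illustrative only: a SKOS-style concept modelled as a plain dictionary.
# The URI and the labels are hypothetical values chosen for this sketch.
CONCEPT = {
    "uri": "http://example.org/def/theme/ENVI",
    "prefLabel": {"en": "Environment", "de": "Umwelt"},
}


def label_for(concept, lang, fallback="en"):
    """Return the preferred label in `lang`, or the `fallback` label if missing."""
    labels = concept["prefLabel"]
    return labels.get(lang, labels[fallback])


print(label_for(CONCEPT, "de"))  # Umwelt
print(label_for(CONCEPT, "fr"))  # Environment (no French label, falls back)
```

The identifier stays the same for every consumer; only the label shown to a human depends on the language.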

Sat, 18/06/2016 - 15:27

4: What are the requirements regarding restricting VCARD and FOAF ranges ?

5: Organising support for implementing https://github.com/ckan/ckanext-dcat/issues/53 might help a lot to move to the future.

6: Such requirements should be made explicit. As far as I am aware neither the original requirements for OGD nor the requirements stated in the current draft include expressing metadata for "documents" or "apps". Maybe these should better be dealt with in separate specifications anyway ? (Not every specification needs to be 100 pages long.)

Sun, 03/07/2016 - 14:45

7 and 8: These granularity conflicts are worth a separate issue: https://joinup.ec.europa.eu/asset/ogd2_0/issue/granularity-conflicts-li…

9: Regarding "Zugaenglichkeit" I created https://joinup.ec.europa.eu/asset/ogd2_0/issue/extending-dcat-ap-expres…

10: Roles justify a separate issue: https://joinup.ec.europa.eu/asset/ogd2_0/issue/kontakt-rollecode-and-dc…

11: For "inner media" a separate issue was already created: https://joinup.ec.europa.eu/asset/dcat_application_profile/issue/questi… That issue is solvable by extending DCAT-AP.

12: Expressing "real relationships": separate issue. Done: https://joinup.ec.europa.eu/asset/ogd2_0/issue/relationships

13: Codelist for "Bundesländer" (the states in Germany) is worth a separate issue: https://joinup.ec.europa.eu/asset/ogd2_0/issue/codelists-german-l%C3%A4… An issue regarding AGS was already created (https://joinup.ec.europa.eu/asset/ogd2_0/issue/property-ags-amtlicher-g…).

14: Checksums and algorithms are worth a separate issue. Discussed in https://joinup.ec.europa.eu/discussion/format

15: The range of dct:license is worth a separate issue: https://joinup.ec.europa.eu/discussion/license-restrictions-codelist-specific-licenses

16: I don't understand this issue. It is possible "to tell the name and version of a codelist at data exchange time" using Linked Data.

I definitely agree that information loss should be avoided. And I am convinced that this requirement can be fulfilled while basing the specification on DCAT-AP.

Mon, 20/06/2016 - 10:12

Hello,

I just did a quick analysis of the need for DCAT-AP by comparing with the implementation of DCAT-AP_IT. I think it could be useful to trigger discussion at a high level.

1) There is a concept of contact for "an element of a catalog" (OGD 2.0) and "organization as point of contact for a dataset" (DCAT-AP_IT):

  OGD 2.0                          DCAT-AP_IT
  name [0..1]                      name [1..1]
  e-mail [0..1]                    e-mail [1..1]
  phone [0..1]                     phone [0..1]
  website [0..1]                   website [0..1]
  mail address [0..1]              -
  role [1..1]                      -

2) Both need a sub-theme for a dataset; OGD 2.0 would also like to have a sub-theme for a catalog:

  OGD 2.0                          DCAT-AP_IT
  sub theme [0..*] (for dataset)   sub theme [0..N] (for dataset)
  sub theme [0..*] (for catalog)   -

3) OGD 2.0 uses a checksum for a "resource", which has the following properties:

  OGD 2.0
  date [0..1]
  checksum algorithm [0..1]

but in DCAT-AP we also have the "checksum value", so we can propose that they add the value and that we, on our side, add a date.

4) Geometry has a different structure:

  OGD 2.0                          DCAT-AP_IT
  free_text [1..1]                 -
  coordinate 1 [0..N]              coordinate [1..1]
  coordinate 2 [0..N]              -
  coordinate system [0..1]         coordinate system [1..1]
  geometry type [1..1]             -

5) OGD 2.0 needs an "identifier for a catalog and for a distribution", while for DCAT-AP_IT there is no need for it.

6) OGD 2.0 needs a "mime type, version, version note for a distribution", while for DCAT-AP_IT there is no need for it.

Mon, 20/06/2016 - 18:40

Hello Emidio,

Many thanks, but the analysis was perhaps a little too quick, which is why I have to make a few corrections :)

 

Please note the following corrections to the analysis (but thank you, every contribution is more than welcome!)

 

OGD 2.0 has its own checksum class featuring algorithm AND value (same as DCAT-AP, but more algorithms are allowed here).

GeometryType (where do you see it in the spec?) is no longer part of AbdeckungInKoordinaten, as there is no need to distinguish polygons, lines and points if you know that all you need is a bounding box.

The bounding box is what we want, as it makes no sense to duplicate all the geodata with its complexity (which is fine in the data itself) in the metadata standard.
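
As a rough illustration of why a bounding box is enough for the metadata, the following Python sketch reduces an arbitrary list of coordinate pairs to its extent. The function name `bounding_box` and the sample coordinates are assumptions for this example; they do not come from the OGD 2.0 or DCAT-AP specifications.

```python
def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) for a non-empty list of (x, y) pairs."""
    if not points:
        raise ValueError("at least one coordinate pair is required")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical polygon outline as (longitude, latitude) pairs.
outline = [(13.1, 52.3), (13.8, 52.4), (13.5, 52.7), (13.2, 52.6)]
print(bounding_box(outline))  # (13.1, 52.3, 13.8, 52.7)
```

Whether the source geometry is a point, a line or a polygon makes no difference to the result, which is why the geometry type is not needed once only the box is stored.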

Wed, 22/06/2016 - 14:19

At the very least, OGD 2.0 should be interoperable with DCAT-AP, which it claims to be but apparently is not: https://joinup.ec.europa.eu/asset/ogd2_0/issue/interoperability-dcat-ap-interoperabilit%C3%A4t-zu-dcat-ap

Mon, 27/06/2016 - 07:49

+1. A new standard will not only have its own problems but also the additional cost of maintaining a separate standard including tools for its use and for providing interoperability with existing solutions like DCAT-AP. If you're not satisfied with the current state of DCAT-AP then please work towards improving it instead of adding an incompatible alternative.

 

Some arguments that haven't been mentioned, yet:

 

"1: the language issue (DCAT-AP is in English only)"

In my opinion, this is an advantage rather than a disadvantage. English is the de facto standard language in technical fields; using it for standards and their documentation ensures a global level of understanding that is impossible to reach with "local-only" languages like German.

 

So, even if you're not adopting DCAT-AP, please use English for the identifiers and the standard of OGD 2.0. Feel free to offer additional, non-normative translations.

 

See also Replace German by English class and attribute identifiers

 

 

"5: furthermore the wish to be able to control the "functional distance" between the current standard OGD 1.1 and the future OGD 2.0 for those portal today having implemented CKAN / OGD 1.1"

Obviously, moving from OGD 1.1 to DCAT-AP is more demanding than moving to a custom standard that is tailored towards existing uses of OGD 1.1. However, the long-term effort of maintaining both a separate standard and the tools necessary for its use (harvesters, converters, CKAN extensions, etc.) is much higher.

 

 

"6: DCAT is a data catalog and OGD needs to express "documents" and "apps" also"

Does it really? Where do you draw the line between data and documents? In my experience, that is often a rather difficult and subjective decision.

 

But more important: What about using DCAT-AP's Dataset.type?

 

 

"14: checksum algorithms other than the old SHA-1"

As far as I understand, you are using checksums to protect against accidental corruption. SHA-1 is a suitable choice for that (in contrast to cryptographic use cases, where you have higher demands on collision resistance, which SHA-1 no longer satisfies). There are obviously better choices available now, but that's hardly a reason not to use DCAT-AP. It's like saying that you're not using HTTP because it doesn't use a checksum at all.
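
A minimal sketch of that use case in Python, using only the standard library's `hashlib`: the publisher computes a digest, the consumer recomputes it after download and compares. The helper names `checksum` and `is_intact` and the sample payload are assumptions for this illustration, not part of DCAT-AP or OGD.

```python
import hashlib


def checksum(data: bytes, algorithm: str = "sha1") -> str:
    """Return the hex digest of `data` for a hashlib algorithm name."""
    return hashlib.new(algorithm, data).hexdigest()


def is_intact(data: bytes, expected_hex: str, algorithm: str = "sha1") -> bool:
    """Compare downloaded bytes against the checksum published in the metadata."""
    return checksum(data, algorithm) == expected_hex


payload = b"id;name\n1;Berlin\n"   # hypothetical file content
published = checksum(payload)      # value a catalog might publish

print(is_intact(payload, published))         # True
print(is_intact(payload + b"x", published))  # False, content was altered
print(len(checksum(payload, "sha256")))      # 64 hex characters
```

Switching to a stronger algorithm only changes the `algorithm` argument, which suggests the choice of algorithm is a vocabulary question rather than a structural one.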

Sat, 23/07/2016 - 07:52

Mon, 25/07/2016 - 07:45

Thanks a lot for your input. We will now take some time to review all posted issues. Afterwards you will receive our feedback on this website.

Thu, 27/07/2017 - 07:36

Thank you very much for your feedback.

 

>>The specification should be transformed so that it becomes a DCAT AP.

>> Reasons:

>> increase interoperability with users outside Germany

>> improve adherence to Linked Data principles

>> simplify the specification

 

We have now decided to create a DCAT-AP-conformant derivation and to restrict it further for our GovData purposes in a conventions handbook (Konventionenhandbuch), while preserving the conformance criteria and thus EU compatibility.

 

The specification of the future standard is available here in version 1.0: http://www.dcat-ap.de/def/dcatde/1_0/spec/specification.pdf

 

In conclusion, we have addressed the objections listed above as follows and taken them into account in the design of DCAT-AP.de:

 

>> So beside 1:the language issue (DCAT-AP is in English only) and beside

We now use a German extension of the English-language DCAT-AP.

 

>> 2:organisational aspects like no one being able to guarantee content of future IDA / IDABC / ISA / ISA² initiatives

 

The namespace of the German-language derivation, http://dcat-ap.de/def/dcatde/, is hosted persistently and versioned in our own domain dcat-ap.de, independently of DIGIT.

 

>>and

>>3: beside the fact that when using DCAT-AP one has to have an eye and follow the governance processes >>of the underlying standard set (Dublin Core, DCAT, ADMS, DCAT-AP, DCAT-AP Implementation guidelines) 

We are currently drawing up a maintenance concept that addresses the channels for change requests to these standards, and we are in contact with ISA² regarding future developments.

 

>> and 4: the lack of information how to restrict the too large VCARD and FOAF range used by DCAT-AP

 

We use the relevant information from the Implementation Guidelines (https://joinup.ec.europa.eu/node/150343/) and, in conformance with DCAT-AP, use foaf:name and the publisher type given with dct:type for foaf:Agent.

For VCARD, phone and/or e-mail are recommended in the dcat-ap.de conventions handbook (http://www.dcat-ap.de/def/dcatde/1_0/implRules.pdf).

>>and 5: furthermore the wish to be able to control the "functional distance" between the current standard >> OGD 1.1 and the future OGD 2.0 for those portal today having implemented CKAN / OGD 1.1 

 

The jump from OGD 1.1 to DCAT-AP.de, instead of from OGD 1.1 to OGD 2.0 as originally planned, brings considerably larger structural changes. For this reason, a long-term implementation horizon until the end of 2018 is planned for the switch to DCAT-AP.de as the sole delivery format for GovData.

>> and beside that 6: DCAT is a data catalog and OGD needs to express "documents" and "apps" also

DCAT-AP.de adds the national extension "legalBasisText", which is intended for documents.

 

 

>> there are also severe functional aspects that hindered us from reusing DCAT-AP "in a native way as an application profile" :

>> How would you solve the following conflicts without losing information or giving wrong information in the communication from CKAN to DCAT? (From DCAT to CKAN is even worse, as we step down in granularity)

>> granularity conflict: 7: License at distribution level (DCAT) vs. License at dataset level (CKAN)

The GovData expert group (Fachgruppe GovData) is the responsible "expert body" (Fachgremium) within the meaning of the standardisation process and, as such, has decided to create a standard that conforms to DCAT-AP. The granularity of the license information required to preserve DCAT-AP conformance is therefore moved, for GovData, from the dataset to the distribution (dct:license).

Furthermore, this field becomes mandatory in DCAT-AP.de, and data structures from the dcatde namespace must reference a (dynamic) list of specific licenses in the dct:license property.

 

>> 8: Status at distribution level (DCAT) vs. status at dataset level (CKAN)?

In dcat-ap.de the ADMS statuses "completed", "withdrawn" and "deprecated" are allowed, now at distribution level. The granularity conflict is therefore accepted here, and a mapping appears possible.
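
One possible shape of such a mapping for points 7 and 8, sketched in Python: dataset-level values are copied down onto every distribution that does not already carry its own. The dictionary keys (`license`, `status`, `distributions`), the function name and the sample values are simplified, hypothetical stand-ins; they are not the actual CKAN, OGD or DCAT-AP.de field names.

```python
def push_down_to_distributions(dataset, keys=("license", "status")):
    """Copy dataset-level values onto each distribution that lacks its own.

    Returns a new dictionary; the input dataset is not modified.
    """
    distributions = []
    for dist in dataset.get("distributions", []):
        merged = dict(dist)
        for key in keys:
            if key in dataset and key not in merged:
                merged[key] = dataset[key]
        distributions.append(merged)
    # Drop the dataset-level copies so the value lives in one place only.
    result = {k: v for k, v in dataset.items() if k not in keys}
    result["distributions"] = distributions
    return result


# Hypothetical record with dataset-level license and status.
dataset = {
    "title": "Air quality",
    "license": "cc-by",
    "status": "completed",
    "distributions": [
        {"url": "http://example.org/air.csv"},
        {"url": "http://example.org/air.json", "license": "cc-zero"},
    ],
}

converted = push_down_to_distributions(dataset)
print(converted["distributions"][0]["license"])  # cc-by
print(converted["distributions"][1]["license"])  # cc-zero
```

This direction loses nothing. The reverse direction, from distribution level back to a single dataset-level value, only works without loss when all distributions agree, which is the "step down in granularity" raised in the original objection.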

 

>>Not existing concepts: 9:Accessibility class (Zugaenglichkeit)

DCAT-AP.de currently contains no such class, but statements about planned availability are possible with DCAT-AP.de (dcatde:plannedAvailability, http://dcat-ap.de/def/plannedAvailability/).

 

>> 10: sophisticated concept for roles, other than publisher or creator

Here we have extended DCAT-AP with our own roles "dcatde:maintainer" and "dcatde:originator", reintroduced the property "dct:creator", which was previously present in DCAT-AP 0.5, and added the property "dct:contributor".

 

>> 11: inner media or mimetype for zip-containers

This property is under discussion for future DCAT-AP.de versions.

 

>> 12: to express real relationships

Generic (non-semantic) relationships can now be expressed in DCAT-AP.de with "dct:relation" and "dct:hasVersion". An extension with further semantic relationships such as "is successor of" or "part of" is under discussion for future DCAT-AP.de versions and depends on further developments in DCAT and DCAT-AP.

 

>> Not existing vocabularies 13: Codelist of German Länder, codelist for Official Municipality Keys AGS

In the DCAT-AP.de V1.0 draft we now use our own URI-based list of the German federal states (Bundesländer): http://www.dcat-ap.de/def/politicalGeocoding/stateKey/

 

>> 14: checksum algorithms other than the old SHA-1 (see spdx for the non-mature current status!)

In DCAT-AP.de we have provided our own extensible namespace with a versioning scheme (example for MD5: http://dcat-ap.de/def/hashAlgorithms/md/5) as well as further algorithms.

 

>> 15: Codelist of specific licenses, as compiled by Govdata.de

In addition to the mandatory field dct:license, the conventions handbook requires portals delivering to GovData to reference this code list by URI. Attributions are expressed through a dedicated property, dcatde:LicenseAttributionByText.

 

>> and 16: by the way the not existing possibility to tell the name and version of a codelist at data exchange time only between two communication partners, as suggested by Joinup:

The URI referencing mechanisms for code lists introduced with DCAT-AP as an RDF vocabulary are now used, at the expense of the XÖV code list mechanisms.

 
