[Issue #01] Content of a Base Registry: Datasets, Data Services, other resource?

15/05/2019

Analysing the use cases presented during the webinar on ABR in April 2019 (i.e., Denmark, Norway, Malta), we can see that National Registries contain not only datasets (file downloads) but also services (REST, SOAP).

As Peter W. mentioned, DCAT 2.0 enables the description of those contents in detail, using the following classes:

  • dcat:Resource
    • dcat:Dataset 
    • dcat:DataService (including: endpointDescription, endpointURL, license, accessRights)
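
To make the class list above more concrete, here is a minimal sketch in plain Python that prints Turtle for one dcat:Dataset and one dcat:DataService serving it. It only illustrates how the DCAT 2.0 terms fit together; it is not a proposed implementation. The example.org IRIs, the titles and the Python class and field names are hypothetical. The sketch also uses dcat:servesDataset, which is a DCAT 2.0 property but is not listed above, and it assumes every value is an IRI and that titles contain no quotes.

```python
from dataclasses import dataclass, field

PREFIXES = (
    "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
    "@prefix dct: <http://purl.org/dc/terms/> .\n"
)


@dataclass
class Resource:
    """Common base, mirroring dcat:Resource."""
    iri: str
    title: str
    rdf_type: str = "dcat:Resource"

    def triples(self):
        # (predicate, object) pairs describing this resource
        return [("a", self.rdf_type), ("dct:title", f'"{self.title}"')]

    def to_turtle(self):
        body = " ;\n    ".join(f"{p} {o}" for p, o in self.triples())
        return f"<{self.iri}>\n    {body} .\n"


@dataclass
class Dataset(Resource):
    rdf_type: str = "dcat:Dataset"


@dataclass
class DataService(Resource):
    rdf_type: str = "dcat:DataService"
    endpoint_url: str = ""
    endpoint_description: str = ""
    license: str = ""
    access_rights: str = ""
    serves: list = field(default_factory=list)  # Dataset objects

    def triples(self):
        out = super().triples()
        optional = [
            ("dcat:endpointURL", self.endpoint_url),
            ("dcat:endpointDescription", self.endpoint_description),
            ("dct:license", self.license),
            ("dct:accessRights", self.access_rights),
        ]
        # Emit only the properties that were actually filled in
        out += [(p, f"<{v}>") for p, v in optional if v]
        out += [("dcat:servesDataset", f"<{d.iri}>") for d in self.serves]
        return out


# Hypothetical registry content, for illustration only
registry = Dataset(
    iri="https://example.org/registry/persons",
    title="Persons register",
)
service = DataService(
    iri="https://example.org/registry/persons-api",
    title="Persons register API",
    endpoint_url="https://example.org/api/persons",
    endpoint_description="https://example.org/api/persons/openapi.json",
    serves=[registry],
)

print(PREFIXES)
print(registry.to_turtle())
print(service.to_turtle())
```

Both classes share the same base, as in DCAT, so a registry catalogue can list downloadable datasets and access services side by side without introducing an extra subclass.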

Another example, mentioned by Bart Hanssens, is the federal model of Belgium. It is based on a specific class named Data Source (a subclass of dcat:Dataset).

 

Proposal: To include in the specification both dcat:DataService and dcat:Dataset.

Should dcat:Dataset be refined as DataSource or similar, as the Belgian approach did?


Comments

Wed, 22/05/2019 - 11:34

Which criteria should be used to classify National Registries datasets?

Thu, 23/05/2019 - 15:34

Thanks for your comment.

National Registries may include Datasets (e.g., a downloadable file) but also DataServices as an intermediate mechanism to access the data (e.g., REST endpoints).

As for the classification criteria, there is an open discussion in [Issue #04] Subject of Datasets (thematic taxonomies to classify them) and in [Issue #05] Content of Datasets (DataElement) (composition of datasets).

Sun, 26/05/2019 - 15:06

The point about DCAT using the dcat:Resource as the main class is that (as with the FRBR approach to cataloguing publications) the same definitive dataset can have a variety of representations - some might be linked to services, whilst others might be within other representations (which might include printed documents for reference or archive).

 

So, rather than complicating issues and using a subclass of dcat:Dataset  for the data source (which in itself is a bit ambiguous as the 'source' could be a service or a file), I think that keeping close to the DCAT model is the simplest and least ambiguous approach.  It also has a similarity to the IFLA FRBR model, and that too is a benefit.

Thu, 30/05/2019 - 16:21

In our case, we have services that access even subsets of datasets, because we use these services for OOP and we try to provide only the data relevant to the purposes of the public services. I don't yet see how to represent these subsets... but I vote +1 for the proposal.

Fri, 07/06/2019 - 12:39

I think the DCAT2 approach is the simplest and covers all the cases. It will be the standard, and perhaps the DCAT-AP will support it as well. So +1 to Peter's proposal. 

Mon, 10/06/2019 - 14:52

Makes sense, +1 on the DCAT2 approach
