PR21 - Add new property to Distribution to refer to a sample of the data

10/03/2015

Description

From: http://joinup.ec.europa.eu/mailman/archives/dcat_application_profile/2015-March/000126.html

In our use cases, we often need only a part of the dataset. The dcat:Distribution implies that the whole dataset will be retrieved, while a sample is enough to assess the dataset's content. In this regard, we would propose also including a sample property, such as the adms:sample property. (I am not aware whether there was a particular reason it was not included in the first place.)

Proposed solution

Add property adms:sample to Distribution

Component

Documentation

Category

improvement

Comments

Wed, 25/03/2015 - 08:19

Proposed resolution: add adms:sample to Dataset

Wed, 08/04/2015 - 07:37

I believe this is an important property. However, it will have a drastic impact on today's dataset catalogues. As the requester writes, today this distinction is not made, and Distribution is being used for both full and partial datasets.

Wed, 08/04/2015 - 12:31

I am not sure how adding the property would have a "drastic impact". If existing data catalogues allow linking of both full and partial distributions to datasets, they are technically not in violation of DCAT, because DCAT is silent on whether "a specific available form of a dataset" (http://www.w3.org/TR/vocab-dcat/#class-distribution) implies that it is all the data or not.

The questions are, given that situation: (a) is there an urgent problem that we need to solve, and (b) does adding the adms:sample property change the situation? In other words, will catalogue owners be prepared to rebuild their data to use the sample property, or will they continue doing what they do?

Wed, 08/04/2015 - 14:34

I think that we are discussing a different data provisioning paradigm here.

Currently, the vast majority of data catalogues provide access to datasets and their distributions in the form of a yellow-pages directory.

Adding such a property would mean that the owners of the datasets (who are usually different from the owners of the catalogues) would have to make an extra effort to prepare the samples. I do not find that realistic, unless we are discussing only cases where data is provided as a service, e.g. as linked data.

Thu, 09/04/2015 - 18:30

Nikos, data owners do not have to do extra work if this is an optional property. It can be used if a sample happens to be available. However, if Bert is right that samples are already being published as Distributions, there might not be a need for a separate property.

Fri, 17/04/2015 - 10:40

Improving semantics is always good and is one of the main purposes of metadata. If publishers are currently using Distributions to provide data samples as well, that looks like a workaround for a real need for which no more suitable option was available.

In my view, and from a data reuser's perspective, one should be able to distinguish between data samples or snippets and full distributions of data. That distinction is not possible when the same property is used for both.

On impact, I think it is minor as well. Nothing will break, and improved semantics will be available for those willing to take advantage of them.

Sun, 03/05/2015 - 17:32

Optional property adms:sample for Dataset was already in Draft 1, so this issue is fixed if there are no strong objections.
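
To illustrate what the optional property would give a data reuser, here is a minimal sketch in Python. It assumes a catalogue in which full distributions are linked with dcat:distribution and samples with adms:sample. The example.org URIs, the TRIPLES list and the split_distributions helper are hypothetical and only serve the illustration; a real catalogue would be queried with an RDF library or SPARQL.

```python
DCAT = "http://www.w3.org/ns/dcat#"
ADMS = "http://www.w3.org/ns/adms#"

# Hypothetical catalogue content as (subject, predicate, object) triples.
TRIPLES = [
    ("http://example.org/dataset/1", DCAT + "distribution",
     "http://example.org/dataset/1/full.csv"),
    ("http://example.org/dataset/1", ADMS + "sample",
     "http://example.org/dataset/1/sample.csv"),
    # Dataset 2 has no sample; the property is optional.
    ("http://example.org/dataset/2", DCAT + "distribution",
     "http://example.org/dataset/2/full.csv"),
]


def split_distributions(triples, dataset):
    """Return (full, samples) for one dataset, based on the linking property."""
    full = [o for s, p, o in triples
            if s == dataset and p == DCAT + "distribution"]
    samples = [o for s, p, o in triples
               if s == dataset and p == ADMS + "sample"]
    return full, samples


if __name__ == "__main__":
    for ds in ("http://example.org/dataset/1", "http://example.org/dataset/2"):
        full, samples = split_distributions(TRIPLES, ds)
        print(ds, "full:", full, "samples:", samples)
```

Catalogues that do not publish samples are unaffected: the sample list is simply empty for their datasets.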

Sun, 07/06/2015 - 10:35
