PR17 - Add new property to Catalog to express the size of the catalogue




Proposed solution

Add an optional property :catalogSize to Catalog, with range dct:SizeOrDuration, whose value is an integer together with the equivalent textual representation.






Tue, 24/03/2015 - 20:00

Proposed resolution: add property dct:extent to Catalog

Thu, 26/03/2015 - 10:51

I assume that what is being asked for here is the number of datasets. This property can be calculated from the catalog itself, and if it is required before any further tasks, every store usually provides an easy way to ask for it (e.g. via SPARQL). The only situation where it would make sense to have it is when the whole catalog is presented textually to a human, and then hopefully the property is at the very beginning or end of the "huge file".

And finally, it is a potential source of inconsistency, especially when humans are involved ;-) Any non-human should be able to just quickly count the datasets in the catalog.
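
As a rough illustration of counting instead of storing the number, the sketch below derives the count from the catalog's own triples. The catalog and dataset URIs are made-up examples, the triples are plain Python tuples rather than a real store, and the SPARQL string is only one plausible form of the query a store could be asked.

```python
# Illustrative only: every example.org URI below is hypothetical.
DCAT_DATASET = "http://www.w3.org/ns/dcat#dataset"

# One plausible SPARQL form of the same count (an assumption, shown for comparison).
COUNT_QUERY = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
SELECT (COUNT(DISTINCT ?dataset) AS ?size)
WHERE { <http://example.org/catalog> dcat:dataset ?dataset }
"""

def count_datasets(triples, catalog):
    """Count the distinct datasets linked from `catalog` via dcat:dataset."""
    return len({o for s, p, o in triples if s == catalog and p == DCAT_DATASET})

triples = [
    ("http://example.org/catalog", DCAT_DATASET, "http://example.org/dataset/1"),
    ("http://example.org/catalog", DCAT_DATASET, "http://example.org/dataset/2"),
    # A repeated statement must not be counted twice.
    ("http://example.org/catalog", DCAT_DATASET, "http://example.org/dataset/2"),
    # A dataset belonging to a different catalog is ignored.
    ("http://example.org/other", DCAT_DATASET, "http://example.org/dataset/3"),
]

size = count_datasets(triples, "http://example.org/catalog")
print(size)
```

Because the number is recomputed from the data each time, it cannot drift away from the actual content the way a hand-maintained property could.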

Wed, 08/04/2015 - 07:27

I agree with Simon. The number is actually subject to many interpretations. E.g. in CKAN it is by default the number of 'active datasets', ignoring records that have a different 'status'.
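
To make the ambiguity concrete, here is a small sketch in which the same set of records yields different "sizes" depending on the interpretation. The records, the 'status' key, and its values are hypothetical and only mimic the situation described above; this is not CKAN's actual API.

```python
# Hypothetical records; the 'status' key and its values are assumptions for illustration.
records = [
    {"name": "a", "status": "active"},
    {"name": "b", "status": "active"},
    {"name": "c", "status": "deleted"},
    {"name": "d", "status": "draft"},
]

def catalog_size(records, statuses=None):
    """Count records; if `statuses` is given, count only records with one of those statuses."""
    if statuses is None:
        return len(records)
    return sum(1 for r in records if r["status"] in statuses)

size_all = catalog_size(records)                # every record, whatever its status
size_active = catalog_size(records, {"active"})  # only records marked 'active'
print(size_all, size_active)
```

A single stored number cannot say which of these interpretations it follows unless that is documented separately.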

For me this is more a matter for the catalog service API, which should make sure that the number is equal to the data being returned. Larger catalogs could also consider providing a dedicated statistics API request.

In this respect, one could think of associating a catalog with catalog statistics (e.g. a DataCube) that reflect the situation at some point in time (number of datasets, number of distributions, etc.). This goes beyond the simple size request, but it makes it possible to provide precalculated statistics associated with a timestamp (when the dataset catalog was in a consistent state).
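
A minimal sketch of such a timestamped statistics record might look as follows. It is a plain Python stand-in, not an RDF Data Cube; the class name, field names, input layout, and example date are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class CatalogStatistics:
    """Hypothetical snapshot of precalculated catalog figures."""
    measured_at: datetime  # when the catalog was in a consistent state
    datasets: int
    distributions: int

def snapshot(datasets, measured_at=None):
    """Build a snapshot from a mapping of dataset id -> list of distribution ids."""
    if measured_at is None:
        measured_at = datetime.now(timezone.utc)
    return CatalogStatistics(
        measured_at=measured_at,
        datasets=len(datasets),
        distributions=sum(len(d) for d in datasets.values()),
    )

# Example input and date are made up.
stats = snapshot(
    {"dataset-1": ["csv", "json"], "dataset-2": ["csv"]},
    measured_at=datetime(2015, 4, 8, tzinfo=timezone.utc),
)
print(stats)
```

Keeping the timestamp next to the figures lets a consumer judge how stale the numbers are, which a bare size property would not allow.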


Wed, 22/04/2015 - 22:23

Only negative opinions received. No change will be made.

Mon, 27/04/2015 - 16:51
