OP1 - How to discover DCAT-AP data on websites

10/03/2015

Description

From http://joinup.ec.europa.eu/mailman/archives/dcat_application_profile/2015-March/000130.html:


How about finding /discovering DCAT-AP on websites (not necessarily portals)?

 

Proposed solution


There should be some mechanism / convention to export a dcat-ap file to a specific location
(similar to favicon.ico or sitemap.xml on websites, perhaps a "link rel='dcat-ap'" will do).

Component

Miscellaneous

Category

improvement

Comments

Fri, 13/03/2015 - 08:22

Useful for harvesting DCAT-AP data from various websites, so that website owners do not have to enter the metadata manually on their own website _and_ in various portals (regional, EU-level...).

This would benefit smaller and/or decentralised institutions and public services (or, why not, companies) that are unwilling or unable to deploy or integrate a full data portal software stack. If their web CMS could simply export some metadata to a file and publish it, and if open data portals could discover this DCAT file automatically, the complexity and cost of keeping the information up to date and in sync would drop considerably.

 

Some suggestions:

  • add "DCAT:" entry to robots.txt (not unlike the "Sitemap:" entry pointing to sitemap.xml files) 
  • use link rel="alternate" or link rel="dcat"
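
As an illustration of the robots.txt suggestion, here is a minimal Python sketch of how a harvester might pick up such an entry. The "DCAT:" field name is only the hypothetical convention suggested above (robots.txt defines no such field), and the function name and sample content are made up for this example.

```python
def find_dcat_entries(robots_txt):
    """Return the URLs listed on hypothetical 'DCAT:' lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Split only on the first colon, so the colon inside "http://" is kept.
        key, sep, value = line.strip().partition(":")
        if sep and key.strip().lower() == "dcat" and value.strip():
            urls.append(value.strip())
    return urls


# Made-up robots.txt content, modelled on the existing "Sitemap:" convention.
sample = """\
User-agent: *
Disallow: /private/
Sitemap: http://example.gov.be/sitemap.xml
DCAT: http://example.gov.be/catalog.ttl
"""

print(find_dcat_entries(sample))
```

A portal could run this against the robots.txt of each site it already crawls, and fall back to other discovery mechanisms when no entry is found.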

Fri, 13/03/2015 - 08:56

I wonder whether this issue is related to embedding machine-readable metadata in the pages of data portals, using RDFa, Microformats or Microdata. This would improve data discovery through popular search engines.

On this, see also a use case of the W3C/OGC Spatial Data on the Web WG. 

Wed, 25/03/2015 - 12:52

Hi, we've done quite a bit of work on autodiscovery of open datasets for UK universities. See opd.data.ac.uk

 

The technique will work for any dataset; you just need to define the pattern of discovery. Auto-discoverable open data profiles have already been deployed by 21 UK institutions including Oxford, Cambridge, Manchester, Queen Mary & Southampton.

 

 

Thu, 26/03/2015 - 18:06

I quite like Christopher's plan of using a <link> header in the institution's home page, or a well-known place elsewhere on the site.

 

I guess it's sensible to offer people a choice of whether they configure it in server config or HTML code. Christopher, did you find it useful to offer both?

 

The Americans have an example we could follow: they have succeeded in getting hundreds of public organizations to put their metadata at the known location /data. It lets small organizations with 5 datasets put up a static file, while those with thousands of datasets in a catalog can serve them in paged chunks, using a parameter.

Tue, 07/04/2015 - 14:08

Hi all,

 

Christopher's proposal seems to be the shared opinion. If we are going to recommend it, shall we discuss concrete suggestions?

 

1. Link header

    (proposal: point to catalog instead of a single dataset)

<link rel="dcat-catalog" href="http://example.gov.be/catalog.ttl" />
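
To make the harvesting side concrete, here is a rough Python sketch of how a client could extract such a link from a homepage. It uses only the standard library; the class and function names are invented for this example, and "dcat-catalog" is simply the rel value proposed above, not a registered link relation.

```python
from html.parser import HTMLParser


class CatalogLinkFinder(HTMLParser):
    """Collect href values of <link> elements carrying a given rel value."""

    def __init__(self, rel="dcat-catalog"):
        super().__init__()
        self.rel = rel
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated values.
        rels = (attr.get("rel") or "").lower().split()
        if self.rel in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_catalog_links(html, rel="dcat-catalog"):
    finder = CatalogLinkFinder(rel)
    finder.feed(html)
    return finder.hrefs


# Made-up homepage markup containing the proposed element.
page = (
    '<html><head><title>Example</title>'
    '<link rel="dcat-catalog" href="http://example.gov.be/catalog.ttl" />'
    '</head><body></body></html>'
)

print(find_catalog_links(page))
```

A relative href would still need to be resolved against the page URL before fetching; that step is left out here.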

 

2. Well-known

If your homepage is http://example.gov.be then http://example.gov.be/.well-known/data-catalog should serve (or redirect to) your catalog document.
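
For illustration, a harvester could derive that location from any page URL on the site roughly as follows. This is a sketch only: the "data-catalog" suffix is the one proposed in this thread, and the function and constant names are invented for this example.

```python
from urllib.parse import urlsplit, urlunsplit

# Path proposed in this thread; an assumption, not an established convention.
WELL_KNOWN_PATH = "/.well-known/data-catalog"


def well_known_catalog_url(homepage):
    """Build the proposed well-known catalog URL for the host of `homepage`."""
    parts = urlsplit(homepage)
    # Keep scheme and host; drop any path, query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN_PATH, "", ""))


print(well_known_catalog_url("http://example.gov.be"))
print(well_known_catalog_url("http://example.gov.be/some/page?lang=nl"))
```

The sketch only builds the URL and always resolves it against the root of the host; a real client would then fetch it and follow any redirect to the catalog document.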

 

    Question: what if the publisher does not own the top-level homepage?

 

kind regards,

 

Bert

Wed, 22/04/2015 - 22:11

Out of scope for revision of the specification. May be considered for future activity.

Mon, 27/04/2015 - 17:04
