How to use accessURL and downloadURL?

4 years ago

Issue

In some cases accessURL may not be needed, and the information in downloadURL and accessURL is duplicated. downloadURL is treated as a specific form of accessURL, but it is not defined as a sub-property of it. Because accessURL is mandatory, the same URL must in certain cases be given twice. Would it be better to change the requirement so that any dcat:Distribution must provide either an accessURL or a downloadURL? The previous discussions regarding this issue are available here.

Current situation

At the moment, DCAT-AP v1.1 defines:

accessURL: this property contains a URL that gives access to a Distribution of the Dataset. The resource at the access URL may contain information about how to get the Dataset.

downloadURL: this property contains a URL that is a direct link to a downloadable file in a given format.

Currently, DCAT specifies that a distribution can point, for example, to an API, but it is not clear about how this should be done. This lack of guidance is explained by the fact that no standard exists.

Recommendation

  • dcat:accessURL should be used to give access either directly to a file or to a page containing further instructions. It is mandatory, so implementations that receive a distribution description can rely on it being present. dcat:downloadURL, by contrast, is a direct link to a file, which allows software programs to use the link to retrieve the file.
  • If only direct download access can be provided, the URL of the data should be duplicated in both accessURL and downloadURL.
  • A Named Authority List (NAL) of distribution types was developed by the Publications Office of the European Union. The list is available at http://publications.europa.eu/mdr/authority/distribution-type/ and can be extended.

Rationale

Access to data is always conveyed in dcat:accessURL, irrespective of whether access is directly to a file or to a page where further instructions are given, together with more user-oriented information about the data. As the provision of dcat:accessURL is mandatory, implementations that receive descriptions of distributions can rely on the information to be there, which can then for example be shown to the user.

In addition, where it is possible to link directly to a file, dcat:downloadURL can be used, as several implementations already do. This allows software programs to use the link to retrieve the file.

Please note that in the case where only the direct download access can be provided, the URL of the data should be duplicated in both accessURL and downloadURL.
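
The rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name distribution_urls, its parameters, and the example.org URL are hypothetical and are not part of DCAT-AP or of any library.

```python
def distribution_urls(access_page=None, download_file=None):
    """Return the URL properties to publish for one dcat:Distribution.

    Hypothetical helper: the name and parameters are assumptions made
    for this sketch, not part of DCAT-AP.
    """
    if access_page is None and download_file is None:
        raise ValueError("a distribution needs at least one URL")

    props = {}
    # accessURL is mandatory: fall back to the file URL when no page exists.
    props["dcat:accessURL"] = (
        access_page if access_page is not None else download_file
    )
    # downloadURL is optional and only set for a direct link to a file.
    if download_file is not None:
        props["dcat:downloadURL"] = download_file
    return props


# Only a direct download exists: the URL is duplicated in both properties.
only_file = distribution_urls(
    download_file="http://api.nobelprize.org/v1/laureate.csv"
)

# Only a landing page exists (hypothetical URL): accessURL alone is published.
only_page = distribution_urls(access_page="http://example.org/dataset-page")
```

When both a page and a file are available, the page goes into accessURL and the file into downloadURL, so no duplication is needed.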

Example

The example is based on the Nobel Prize catalogue available via http://www.nobelprize.org/datasets/dcat. Some modifications were made in order to clarify the guideline.

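<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ns1="http://www.w3.org/ns/dcat#">
    <!-- Only a direct download exists, so the same URL appears in both properties. -->
    <rdf:Description rdf:about="https://dcat-editor.com/store/17/resource/6">
        <ns1:downloadURL rdf:resource="http://api.nobelprize.org/v1/laureate.csv"/>
        <rdf:type rdf:resource="http://www.w3.org/ns/dcat#Distribution"/>
        <ns1:accessURL rdf:resource="http://api.nobelprize.org/v1/laureate.csv"/>
    </rdf:Description>
</rdf:RDF>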
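
A consumer of such descriptions might want to check that every distribution carries the mandatory accessURL and to see where downloadURL merely repeats it. The following is a minimal sketch using only the Python standard library. The function name check_distributions and the shape of its report are assumptions made for illustration, and the sketch only recognises distributions written as rdf:Description elements with an explicit rdf:type, as in the example above. A real implementation would more likely use a full RDF parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCAT = "http://www.w3.org/ns/dcat#"

# The example distribution, wrapped in rdf:RDF so that it parses on its own.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcat="http://www.w3.org/ns/dcat#">
  <rdf:Description rdf:about="https://dcat-editor.com/store/17/resource/6">
    <rdf:type rdf:resource="http://www.w3.org/ns/dcat#Distribution"/>
    <dcat:accessURL rdf:resource="http://api.nobelprize.org/v1/laureate.csv"/>
    <dcat:downloadURL rdf:resource="http://api.nobelprize.org/v1/laureate.csv"/>
  </rdf:Description>
</rdf:RDF>"""


def resources(node, prop):
    """Collect the rdf:resource values of all child elements named prop."""
    return [child.get(f"{{{RDF}}}resource") for child in node.findall(prop)]


def check_distributions(rdf_xml):
    """Report, per distribution URI, whether accessURL is present and
    whether every downloadURL duplicates an accessURL (hypothetical helper)."""
    root = ET.fromstring(rdf_xml)
    report = {}
    for node in root.findall(f"{{{RDF}}}Description"):
        # Skip resources that are not typed as dcat:Distribution.
        if DCAT + "Distribution" not in resources(node, f"{{{RDF}}}type"):
            continue
        access = resources(node, f"{{{DCAT}}}accessURL")
        download = resources(node, f"{{{DCAT}}}downloadURL")
        report[node.get(f"{{{RDF}}}about")] = {
            "has_access_url": bool(access),
            "download_duplicated": bool(download) and set(download) <= set(access),
        }
    return report


report = check_distributions(SAMPLE)
```

For the example distribution, both flags are true: accessURL is present, and the single downloadURL repeats it.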