BIG DATA, OPEN DATA AND PUBLIC SECTOR INFORMATION

(A.) Policy and legislation

(A.1) Policy objectives

With the continuously growing amount of data (often referred to as ‘big data’) and the increasing amount of open data, interoperability is increasingly a key issue in exploiting the value of this data.

Standardisation at different levels (such as metadata schemata, data representation formats and licensing conditions of open data) is essential to enable broad data integration, data exchange and interoperability with the overall goal of fostering innovation based on data. This refers to all types of (multilingual) data, including both structured and unstructured data, and data from different domains as diverse as geospatial data, statistical data, weather data, public sector information (PSI) and research data (see also the rolling plan contribution on ‘e-Infrastructures for data and computing-intensive science’), to name just a few.

Overall, the application of standard and shared formats and protocols for gathering and processing data from different sources in a coherent and interoperable manner across sectors and vertical markets should be encouraged, for example in R&D&I projects and in the EU open data portal (https://data.europa.eu/euodp) and the European data portal (https://data.europa.eu/europeandataportal).

Studies conducted for the European Commission showed that businesses and citizens were facing difficulties in finding and re-using public sector information. The Communication on Open data states that “the availability of the information in a machine-readable format and a thin layer of commonly agreed metadata could facilitate data cross-reference and interoperability and therefore considerably enhance its value for reuse”.

A common standard for the referencing of open data in the European open data portals would be useful. Candidates for a common standard in this area are the DCAT Application Profile for data portals in Europe (DCAT-AP) and the FIWARE open stack-based specification and open standard APIs. The FIWARE solution has now been integrated into the Connecting Europe Facility (CEF) “Context Broker” building block (https://ec.europa.eu/cefdigital/wiki/display/CEFDIGITAL/Context+Broker). The CEF has meanwhile agreed to upgrade the “Context Broker” to use the ETSI NGSI-LD specification (ETSI GS CIM 009 V1.3.1 of the NGSI-LD API), and the FIWARE Foundation is also evolving its API to the same ETSI standard for the exchange of open data. Further effort is now needed to demonstrate good examples of proper usage of NGSI-LD. This has been promoted within the EC Large Scale Pilot project SynchroniCity, but more dissemination and training are required (as recognised by CEF efforts to promote training webinars).

The DCAT Application Profile (DCAT-AP) has been developed as a common project of the ISA2 programme, the Publications Office (PO) and CNECT to describe public-sector data catalogues and datasets and to promote the specification for use by data portals across Europe. Agreeing on a common application profile and promoting it among the Member States is substantially improving the interoperability among data catalogues and the data exchange between Member States. DCAT-AP is the specification used by the European Data Portal, which is part of the Connecting Europe Facility infrastructure, as well as by a growing number of Member States' open data portals. The DCAT-AP related work, including its extensions to geospatial data (GeoDCAT-AP) and statistical data (StatDCAT-AP), also highlights the need for further work on the core standard. These were topics for the W3C workshop on Smart Descriptions & Smarter Vocabularies (SDSVoc), held under the VRE4EIC project (https://www.w3.org/2016/11/sdsvoc/). The Core Vocabularies (i.e. Core Person, Core Organization, Core Location, Core Public Event, Core Criterion and Core Evidence), the Core Public Service Application Profile and the Asset Description Metadata Schema (for describing reusable solutions), implemented by the ISA2 programme, address the problem of data exchange and interoperability by using uniform data representation formats. They are currently used in the TOOP (The Once-Only Principle) project, which acts as a forerunner for the Single Digital Gateway Regulation.

The concept of the Once-Only Principle (OOP) focuses on reducing the administrative burden for individuals and businesses by re-organising public sector internal processes, instead of making citizens and business users adjust to existing procedures. In view of its contribution to the realisation of the Digital Single Market in Europe, the European Commission is strongly promoting the implementation of the OOP across borders. Once-only is therefore one of the underlying principles stated in the European Union’s “eGovernment Action Plan 2016-2020” and is part of several initiatives related to the European Digital Single Market. These include the following three pilot projects: SCOOP4C, The Once-Only Principle Project (TOOP), and Digital Europe for All (DE4A).

Furthermore, the Single Digital Gateway Regulation (EU) 2018/1724 includes a technical system for the exchange of evidence based on OOP concepts and input from the TOOP project. This system will be supported by a new CEF Once-Only Principle building block.

The mapping of existing relevant standards for a number of big data areas would be beneficial. Moreover, it might be useful to identify European clusters of industries with sufficiently similar activities to develop data standards. Especially for open data, the topics of data provenance and licensing (for example the potential of machine-readable licences) need to be addressed, as encouraged in the current and proposed revision of the PSI Directive (see section B.1).

The PSI Directive encourages the use of standard licences which must be available in digital format and be processed electronically (Article 8(2)). Furthermore, the Directive encourages the use of open licences available online, which should eventually become common practice across the EU (Recital 26). In addition, to help Member States transpose the revised provisions, the Commission adopted guidelines which recommend the use of such standard open licences for the reuse of PSI.

Currently, ISA2 vocabularies and solutions are used for the implementation of the Single Digital Gateway Regulation (SDGR) in the design of common data models for the evidence that is going to be exchanged between Member States. The SDGR highlights the need to ensure functional, technical and semantic interoperability in the exchange of evidence, which can only be assured by using standard and shared data representation formats.

On 25 April 2018, the Commission adopted the ‘data package’, a set of measures to improve the availability and re-usability of data, in particular publicly held or publicly funded data, including government data and publicly funded research results, and to foster data sharing in business-to-business (B2B) and business-to-government (B2G) settings. The availability of data is essential so that companies can leverage the potential of data-driven innovation or develop solutions using artificial intelligence.

Key elements of the package are:

1. The adoption of the Directive on open data and the re-use of public sector information (recast of Directive 2003/98/EC amended by Directive 2013/37/EU)

  • enhancing access to and re-use of real-time data notably with the help of Application Programming Interfaces (APIs);
  • lowering charges for the re-use of public sector information by limiting exceptions to the default upper limit of marginal cost of dissemination and by specifying certain high-value data sets which should be made available for free (via implementing acts);
  • allowing for the re-use of new types of data, including data held by public undertakings in the transport and utilities sector and data resulting from publicly funded research;
  • minimising the risk of excessive first-mover advantage in regard to certain data, which could benefit large companies and thereby limit the number of potential re-users of the data in question;
  • defining through an implementing act a list of “high value datasets” belonging to six thematic categories (geospatial, earth observation and environment, meteorological, statistics, companies and company ownership, mobility) to be made available mandatorily free of charge, in machine readable format and through APIs.

The new Directive (EU) 2019/1024 of the European Parliament and of the Council of 20 June 2019 on open data and the re-use of public sector information was adopted and published on 26 June 2019. It must be transposed into national legislation by 17 July 2021.

Article 14 of the Open data Directive empowers the Commission to adopt implementing acts laying down a list of specific high value datasets belonging to the six thematic categories set out in the Annex and held by public sector bodies and public undertakings. In order to make the reuse of these datasets more efficient, the Directive provides that they shall be available for free, machine-readable, provided via APIs and, where relevant, as a bulk download. The implementing acts may also specify the arrangements for the publication and re-use of high value datasets, which shall be compatible with open standard licences. They may include terms applicable to re-use, formats of data and metadata and technical arrangements for dissemination. Work on the definition of high value datasets, including an impact assessment study, stakeholder consultations and a public consultation and hearing, took place in 2020. An implementing regulation on high value datasets will be adopted in the first half of 2021.

2. Review of the 2012 Recommendation on access to and preservation of scientific information, focusing on:

  • evaluating the uptake of the 2012 Recommendation as well as its effectiveness in creating a level playing field for Member States, researchers and academic institutions;
  • updating and reinforcing the overall policy with the development of guidelines on opening up research data and the creation of incentive schemes for researchers sharing data;
  • ensuring coherence with the European Open Science Cloud.

3. Development of guidance on private sector data sharing

The Commission has proposed guidance to companies that wish to make data available to other companies or to public authorities, which lays down principles of fair data sharing practices and includes guidance on legal, business and technical aspects of B2B and B2G data sharing.

Following an open selection process, in November 2018 the Commission appointed 23 experts to an Expert Group on Business-to-Government Data Sharing. The conclusions and recommendations of the Expert Group were published in a report released at the end of 2019; the report's findings are used as input for possible future Commission initiatives on B2G data sharing (https://ec.europa.eu/digital-single-market/en/news/experts-say-privately-held-data-available-european-union-should-be-used-better-and-more).

The report underlines several times the importance of standardisation for facilitating data sharing. “The expert group recommends that the Digital Europe Programme invests in the development of common standards for data, metadata, representation and standardised transfer protocols. Building on existing EU programmes, initiatives and working groups, such as the CEF and ISA2 programmes and the “Multi-stakeholder platform for ICT standardisation”, the expert group recommends prioritising those standards that are most generally used over creating new ones. The chosen standards should then be further developed, possibly in cooperation and with the support of a European standardisation body. Agreeing on a (set of) common standard(s) and promoting this among the Member States will substantially improve the interoperability among data catalogues and the data exchange between Member States and private companies and civil-society organisations.”

On 19 February 2020, the Commission adopted the Communication on “A European strategy for data” — a set of measures aiming at making the EU a leader in a data-driven society. Creating a single market for data will allow it to flow freely within the EU and across sectors for the benefit of businesses, researchers and public administrations.

A first priority for operationalising its vision is to put in place an enabling legislative framework for the governance of common European data spaces. Such governance structures should support decisions on what data can be used in which situations, facilitate cross-border data use, and prioritise interoperability requirements and standards within and across sectors, while taking into account the need for sectoral authorities to specify sectoral requirements. The framework will reinforce the necessary structures in the Member States and at EU level to facilitate the use of data for innovative business ideas, both at sector- or domain-specific level and from a cross-sector perspective.

On 25 November 2020, the Commission adopted the proposal for a Regulation on European data governance (COM(2020) 767 final). It aims to foster the availability of data for use by increasing trust in data intermediaries and by strengthening data-sharing mechanisms across the EU. The Regulation will facilitate data sharing across the EU and between sectors to create wealth for society, increase the control and trust of both citizens and companies regarding their data, and offer an alternative European model to the data-handling practices of major tech platforms.

(A.3) References 
  • Proposal for a Regulation on European data governance (Data Governance Act), COM(2020) 767 final
  • Directive (EU) 2019/1024 of the European Parliament and of the Council of 20 June 2019 on open data and the re-use of public sector information (recast)
  • Regulation (EU) 2018/1807 of the European Parliament and of the Council of 14 November 2018 on a framework for the free flow of non-personal data in the European Union
  • COM(2020) 66 final “A European strategy for data”
  • COM(2018) 232 final
  • COM(2014) 442 Towards a thriving data-driven economy
  • COM(2016) 176 ICT Standardisation Priorities for the Digital Single Market
  • COM(2017) 9 final Building a European Data Economy: A Communication on Building a European Data Economy was adopted on 10 January 2017. This Communication explores the following issues: free flow of data; access and transfer in relation to machine generated data; liability and safety in the context of emerging technologies; and portability of non-personal data, interoperability and standards. Together with the Communication the Commission has launched a public consultation.
  • Decision (EU) 2015/2240 on interoperability solutions and common frameworks for European public administrations, businesses and citizens (ISA2 programme) as a means for modernising the public sector (ISA2)
  • The PSI Directive (2013/37/EU) on the re-use of public sector information (Public Sector Information Directive) was published in the Official Journal on 27 June 2013. The Directive calls for PSI to be made available for reuse by default, preferably in machine-readable formats. All Member States transposed it into national legislation.
  • COM(2011) 882 on Open data
  • COM(2011) 833 on the reuse of Commission documents
  • COM(2015)192 “A Digital single market strategy for Europe”
  • COM(2018)234 “Proposal for a Directive on the re-use of public sector information (recast)”
  • C(2018) 2375 final “Recommendation on access to and preservation of scientific information”

(B.) Requested actions

The Communication on ICT Standardisation Priorities for the Digital Single Market proposes priority actions in the domain of big data. The actions listed below reflect some of them.

Action 1 Invite CEN to support and assist the DCAT-AP standardisation process. DCAT-AP contains specifications for metadata records to meet the specific application needs of data portals in Europe while providing semantic interoperability with other applications on the basis of reuse of established controlled vocabularies (e.g. EuroVoc) and mappings to existing metadata vocabularies (e.g. SDMX, INSPIRE metadata, Dublin Core, etc.). DCAT-AP and its extensions have been developed by multi-sectorial expert groups. Experts from international standardisation organisations participated in the group together with open data portal owners to ensure the interoperability of the resulting specification and to assist in its standardisation. These mappings have already provided a DCAT-AP extension to cover geospatial datasets, called GeoDCAT-AP. The specification was developed under the coordination of the JRC team working on the implementation of the INSPIRE Directive. Another extension to describe statistical datasets, called StatDCAT-AP, was published at the end of 2016. This work has been coordinated by EUROSTAT and the Publications Office.

Action 2 Promote standardisation in/via the open data infrastructure, especially the European Data Portal being deployed in 2015-2020 as part of the digital service infrastructure under the Connecting Europe Facility programme.

Action 3 Support standardisation activities at different levels: H2020 R&D&I activities; support for internationalisation of standardisation, in particular for the DCAT-AP specifications developed in the ISA2 programme (see also action 2 under the eGovernment chapter), and for specifications developed under the Future Internet public-private partnership, such as FIWARE NGSI-LD and FIWARE CKAN. Standardisation can also be enhanced by using the Core Vocabularies, as well as the Core Public Service Application Profile implemented by the ISA2 programme; and new activities launched by the first implementations of the Digital Europe Programme and the legal framework progressively put in place following the Commission Communication on “A European strategy for data”.

Action 4 Bring the European data community together, including through the H2020 Big Data Value public-private partnership, to identify missing standards and design options for a big data reference architecture, taking into account existing international approaches, in particular the work in ISO/IEC JTC 1 SC 42. In general, attention should be given to the four pillars of (semantic) discovery, privacy-by-design, accountability for data usage (licensing), and exchange of data together with its metadata, through the use of the Asset Description Metadata Schema (for describing reusable solutions) implemented by the ISA2 programme.

Action 5 CEN to coordinate with the relevant W3C groups on preventing incompatible changes and on the conditions for availability of the standard(s), in order to standardise DCAT-AP as well as the other vocabularies provided by the ISA2 programme.

Action 6 The European Commission, together with EU-funded pilots and projects that develop technical specifications for the provision of cross-border services (e.g. from ISA2 and CEF/DEP pilots) which need to be referenced in public procurement, to liaise with SDOs to consider how to address their possible standardisation.

Action 7 The European Commission to initiate broad exchanges with SDOs and all stakeholders, in particular those involved in the European Data Spaces, on the role of standards and open source in improving data interoperability within and across sectors and in the data economy, with particular consideration of the twin transitions. This should, inter alia, aim to identify possible functional or standardisation gaps and promote further coordination amongst different SDOs.

Action 8 SDOs to look into possible standardisation needs arising from the proposed Regulation on European Data Governance.

(C.) Activities and additional information  

(C.1) Related standardisation activities

CEN CENELEC

The CEN-CLC/Focus Group on Artificial Intelligence (AI) has been set up to mirror relevant international standardisation activities in ISO and IEC and to address specific European needs. The Focus Group also addresses Big Data. In this context, the Focus Group has published a response to the EC White Paper on AI and the CEN and CENELEC roadmap for AI standardisation (https://www.cencenelec.eu/standards/Topics/Documents/CEN-CLC%20AI%20FG_White%20Paper%20Response_Final%20Version_June%202020.pdf).

The CEN/WS (Workshop) ISAEN “Unique Identifier for Personal Data Usage Control in Big Data” seeks to operationalise the burgeoning policy initiatives related to big data, in particular in relation to personal data management and the protection of individuals’ fundamental rights. The unique identifier described in the CEN/CWA serves as a measurement tool to empower individuals, help them take control of their data, and make their fundamental right to privacy more actionable.

ETSI

ETSI’s oneM2M Partnership Project has specified the oneM2M Base Ontology (oneM2M TS-0012, ETSI TS 118 112) to enable syntactic and semantic interoperability for IoT data.

ETSI TC SmartM2M is developing a set of reference ontologies, mapped onto the oneM2M Base Ontology. This work has commenced with the SAREF ontology, for Smart Appliances, but is being extended to add semantic models for data associated with smart cities, industry and manufacturing, smart agriculture and the food chain, water, automotive, eHealth/aging well and wearables.

ETSI’s ISG for cross-cutting Context Information Management (CIM) has developed the NGSI-LD API (GS CIM 004 and GS CIM 009) which builds upon the work done by OMA Specworks and FIWARE. NGSI-LD is an open framework for exchange of contextual information for smart services, aligned with best practice in linked open data. Ongoing activities involve increased interoperability with oneM2M data sources and features to attest provenance of information as well as options for fine-grained encryption of information.
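As an illustration of the kind of contextual information exchanged through NGSI-LD, the following Python sketch builds a single entity in the JSON-LD shape used by the API, with one Property and one Relationship. The entity identifiers, the entity type and the attribute names are hypothetical; only the core @context URL is the one published by ETSI. The sketch makes no network calls.

```python
import json

# Core @context published by ETSI for NGSI-LD.
NGSI_LD_CORE_CONTEXT = "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"


def make_entity(entity_id, entity_type, properties, relationships=None):
    """Build an NGSI-LD entity as a plain dict.

    properties:    attribute name -> value (wrapped as an NGSI-LD "Property")
    relationships: attribute name -> target entity URN (wrapped as a "Relationship")
    """
    entity = {"id": entity_id, "type": entity_type}
    for name, value in properties.items():
        entity[name] = {"type": "Property", "value": value}
    for name, target in (relationships or {}).items():
        entity[name] = {"type": "Relationship", "object": target}
    entity["@context"] = [NGSI_LD_CORE_CONTEXT]
    return entity


# Hypothetical example: an air quality observation linked to a device.
sensor = make_entity(
    "urn:ngsi-ld:AirQualityObserved:example-001",
    "AirQualityObserved",
    {"no2": 22.5},
    {"refDevice": "urn:ngsi-ld:Device:example-042"},
)
payload = json.dumps(sensor, indent=2)
print(payload)
```

A context broker would typically receive such a document through an HTTP POST to its /ngsi-ld/v1/entities resource with the content type application/ld+json.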

ETSI’s ISG MEC is developing a set of standardized Application Programming Interfaces (APIs) for Multi-Access Edge Computing (MEC). MEC technology offers IT service and Cloud computing capabilities at the edge of the network. Shifting processing power away from remote data centres and closer to the end user, it enables an environment that is characterised by proximity and ultra-low latency, and provides exposure to real-time network and context information.

ETSI’s TC ATTM committee has specified a set of KPIs for energy management for data centres (ETSI ES 205 200-2-1). These have been combined into a single global KPI for data centres, called DCEM, by ETSI’s ISG on Operational energy Efficiency for Users (OEU), in ETSI GS OEU 001.

SC USER has produced a set of documents related to the “User-Centric approach in the digital ecosystem”. Note: this body of work also applies to several other sections of the ICT rolling plan, such as IoT, eHealth, cybersecurity, e-privacy and accessibility, but is documented only once.

ETSI TR 103 438 User Group; User centric approach in Digital Ecosystem

ETSI EG 203 602 User Group; User Centric Approach: Guidance for users; Best practices to interact in the Digital Ecosystem

ETSI TR 103 603 User Group; User Centric Approach; Guidance for providers and standardisation makers

ETSI TR 103 604 User Group; User centric approach; Qualification of the interaction with the digital ecosystem

ETSI TR 103 437 Quality of ICT services; New QoS approach in a digital ecosystem (publication expected by the end of September 2020)

SC USER plans to finalise the project by defining and implementing a proof of concept of a “Smart interface for digital ecosystem”: a user interface that meets the needs and expectations of the user at the user's request, offering “intelligent”, “highly contextualised” personalisation in an agile and proactive interface with integrated QoS.

ISO/IEC JTC1

In 2018 JTC 1/SC 42 was formed; its WG 2 is responsible for the big data work programme.

SC 42 has published the following big data standards:

  • ISO/IEC 20546:2019 Information technology -- Big Data -- Overview and Vocabulary (https://www.iso.org/standard/68305.html?browse=tc)

  • ISO/IEC TR 20547-2:2018 Information technology -- Big data reference architecture -- Part 2: Use cases and derived requirements (https://www.iso.org/standard/71276.html?browse=tc)

  • ISO/IEC TR 20547-5:2018 Information technology -- Big data reference architecture -- Part 5: Standards roadmap (https://www.iso.org/standard/72826.html?browse=tc)

SC 42 is progressing the following big data projects, which are expected to be completed within the next year:

  • ISO/IEC 20547-1: Information technology -- Big Data reference architecture -- Part 1: Framework and application process

  • ISO/IEC 20547-3: Information technology -- Big Data reference architecture -- Part 3: Reference architecture

  • ISO/IEC 24668: Information technology -- Artificial Intelligence -- Process management framework for Big data analytics

Building on its foundation standard, ISO/IEC 38500 (Information technology - Governance of IT for the Organization), JTC 1/SC 40 has developed or is developing the following standards on the governance of data:

  • ISO/IEC 38505-1: Information technology - Governance of IT - Part 1: Application of ISO/IEC 38500 to the governance of data

  • ISO/IEC 38505-2: Information technology - Governance of IT - Part 2: Implications of ISO/IEC 38505-1 for Data Management

  • ISO/IEC 38505-3: Information technology - Governance of Data - Part 3: Guidelines for Data Classification

For further information, see https://www.iso.org/committee/5013818.html

ISO/IEC JTC1 SC32 on “Data management and interchange” works on standards for data management within and among local and distributed information systems environments. SC 32 provides enabling technologies to promote harmonization of data management facilities across sector-specific areas. https://www.iso.org/committee/45342.html

ITU-T

ITU-T SG13 Recommendation ITU-T Y.3600 “Big data - Cloud computing based requirements and capabilities” covers use cases of cloud computing based big data to collect, store, analyse, visualise and manage varieties of large-volume datasets: https://www.itu.int/rec/T-REC-Y.3600/en

Also, SG13 published Y.3600-series Supplement 40 “Big Data Standardisation Roadmap” which will be revised in 2022: https://www.itu.int/rec/T-REC-Y.Sup40/en

SG13 is working on big data functional requirements for data integration (Y.bdi-reqts) and the functional architecture of big data-driven networking (Y.bDDN-FunArch).

Recently approved ITU-T Recommendations on big data include Y.3605 (09/2020), which covers the big data reference architecture.

See a flipbook “Big Data - Concept and application for telecommunications”: https://www.itu.int/en/publications/Documents/tsb/2019-Big-data/mobile/index.html.

Work Programme of SG13 is available at: http://itu.int/itu-t/workprog/wp_search.aspx?sg=13

More info: https://www.itu.int/en/ITU-T/studygroups/2017-2020/13

ITU-T SG20 “Internet of things (IoT) and smart cities & communities (SC&C)” is studying big data aspects of IoT and SC&C. Recommendation ITU-T Y.4114 “Specific requirements and capabilities of the IoT for big data” complements the developments on common requirements of the IoT described in Recommendation ITU-T Y.4100/Y.2066 and the functional framework and capabilities of the IoT described in Recommendation ITU-T Y.4401/ Y.2068 in terms of the specific requirements and capabilities that the IoT is expected to support in order to address the challenges related to big data. This Recommendation also constitutes a basis for further standardisation work such as functional entities, application programming interfaces (APIs) and protocols concerning big data in the IoT.

ITU-T SG20 also published Recommendation ITU-T Y.4461 “Framework of open data in smart cities” that clarifies the concept, analyses the benefits, identifies the key phases, roles and activities and describes the framework and general requirements of open data in smart cities. 

Work programme of SG20 is available at: https://www.itu.int/ITU-T/workprog/wp_search.aspx?sg=20

More info: https://itu.int/go/tsg20

The ITU-T Focus Group on Data Processing and Management (FG-DPM) to support IoT and Smart Cities & Communities was set up in 2017. The Focus Group provided a platform to share views, develop a series of deliverables, and showcase initiatives, projects and standards activities linked to data processing and management and to the establishment of IoT ecosystem solutions for data-focused cities. The Focus Group concluded its work in July 2019 with the development of 10 Technical Specifications and 5 Technical Reports. The complete list of deliverables is available at https://itu.int/en/ITU-T/focusgroups/dpm

ITU-T SG 17 has approved several standards on big data and open data, including Recommendations ITU-T X.1147 “Security requirements and framework for big data analytics in mobile internet services” and ITU-T X.1603 “Data security requirements for the monitoring service of cloud computing”. It is in the process of approving Recommendations ITU-T X.1750 “Guidelines on security of big data as a service for Big Data Service Providers”, ITU-T X.1751 “Security guidelines on big data lifecycle management for telecom operators” (X.sgtBD) and ITU-T X.1376 “Security-related misbehaviour detection mechanism based on big data analysis for connected vehicles”, along with ongoing standardisation relating to “Security guidelines for big data infrastructure and platform” (X.sgBDIP).

More info: https://www.itu.int/en/ITU-T/studygroups/2017-2020/17

IEEE

IEEE has a series of standards projects related to Big Data (mobile health, energy efficient processing, personal agency and privacy) as well as pre-standardisation activities on Big Data and open data:

https://ieeesa.io/rp-open-big-data

OASIS

The OASIS Open Data Protocol (OData) TC works to simplify the querying and sharing of data across disparate applications and multiple stakeholders for re-use in the enterprise, cloud and mobile devices. A REST-based protocol, OData builds on HTTP and JSON, using URIs to address and access data feed resources. The OASIS OData standards have been approved as ISO/IEC 20802-1:2016 and ISO/IEC 20802-2:2016.
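To illustrate how OData uses URIs to address and query resources, the following Python sketch composes a request URL from the system query options $filter, $select and $top. The service root, the entity set name "Datasets" and the property names are hypothetical and stand in for whatever a real service would expose; no request is sent.

```python
from urllib.parse import quote, urlencode

# Hypothetical service root; not a real endpoint.
SERVICE_ROOT = "https://example.org/odata"


def odata_url(entity_set, **options):
    """Compose an OData request URL from system query options.

    Keyword names map to system query options, e.g. filter -> $filter.
    """
    params = {"$" + name: value for name, value in options.items()}
    # Keep "$" and "," literal; encode spaces as %20 rather than "+".
    query = urlencode(params, safe="$,", quote_via=quote)
    return f"{SERVICE_ROOT}/{entity_set}?{query}"


url = odata_url("Datasets", filter="Year ge 2019", select="Title,Year", top=5)
print(url)
# https://example.org/odata/Datasets?$filter=Year%20ge%202019&$select=Title,Year&$top=5
```

Because the whole query is carried in the URI, any HTTP client can issue it with a plain GET request and receive the matching entities as JSON.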

The OASIS ebCore TC maintains the ebXML RegRep standard that defines the service interfaces, protocols and information model for an integrated registry and repository. The repository stores digital content while the registry stores metadata that describes the content in the repository. RegRep is used in the EU TOOP project.

OGC

The Open Geospatial Consortium (OGC) defines and maintains standards for location-based, spatio-temporal data and services. The work includes, for instance, schemas for describing spatio-temporal sensor, image, simulation and statistics data (such as “datacubes”), and a modular suite of standards for Web services allowing ingestion, extraction, fusion and (with the Web Coverage Processing Service (WCPS) component standard) analytics of massive spatio-temporal data such as satellite and climate archives. OGC also contributes to the INSPIRE project. http://www.opengeospatial.org

W3C

The DCAT vocabulary was developed in the W3C Government Linked Data Working Group:

http://www.w3.org/TR/vocab-dcat/

After a successful Workshop on Smart Descriptions & Smarter Vocabularies (SDSVoc) (https://www.w3.org/2016/11/sdsvoc/), W3C created the Dataset Exchange Working Group (https://www.w3.org/2017/dxwg) to revise DCAT, to provide a test suite for content negotiation by application profile and to develop additional relevant vocabularies in response to community demand.

Work on expressing licences in ODRL continues and has reached a very mature state: https://www.w3.org/TR/odrl-model/ and https://www.w3.org/TR/vocab-odrl/
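As a rough illustration of what a machine-readable licence can look like, the following Python sketch expresses a simple open-licence policy in the JSON-LD form of ODRL and reads the permitted actions back out of it. The policy and dataset URIs are hypothetical, and the helper function is a naive lookup for illustration only, not an ODRL policy evaluator.

```python
import json

# JSON-LD context published by W3C for ODRL.
ODRL_CONTEXT = "http://www.w3.org/ns/odrl.jsonld"

# Hypothetical dataset and policy identifiers.
DATASET = "https://example.org/dataset/air-quality"

# Anyone may use and distribute the dataset, with a duty to attribute.
policy = {
    "@context": ODRL_CONTEXT,
    "@type": "Set",
    "uid": "https://example.org/policy/open-licence-1",
    "permission": [
        {
            "target": DATASET,
            "action": action,
            "duty": [{"action": "attribute"}],
        }
        for action in ("use", "distribute")
    ],
}


def permitted_actions(policy, target):
    """Return the set of actions the policy permits on the given target."""
    return {
        rule["action"]
        for rule in policy.get("permission", [])
        if rule.get("target") == target
    }


print(json.dumps(policy, indent=2))
print(sorted(permitted_actions(policy, DATASET)))
```

A portal could publish such a policy alongside a dataset description so that re-users' software can check the licence terms without a human reading the licence text.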

The Data on the Web Best Practices WG has finished its work successfully (https://www.w3.org/TR/dwbp) and has also issued the Data Quality and Dataset Usage vocabularies (https://www.w3.org/TR/vocab-dqv; https://www.w3.org/TR/vocab-duv).

Other activities related to standardisation

ISA and ISA2 programmes of the European Commission

The DCAT application profile (DCAT-AP) has been defined. DCAT-AP is a specification based on DCAT (an RDF vocabulary designed to facilitate interoperability between data catalogues published on the web) to enable interoperability between data portals, for example to allow metasearches in the European Data Portal, which harvests data from national open data portals.
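To make the idea of a harvestable catalogue record more concrete, the following sketch builds a minimal dataset description with DCAT and Dublin Core terms, serialised as JSON-LD using only the standard library. The dataset URI, title and URLs are invented for illustration, and the list of required properties in missing_properties is a simplifying assumption; the DCAT-AP release itself should be consulted for the actual mandatory, recommended and optional properties.

```python
import json

# Namespace prefixes for the DCAT and Dublin Core Terms vocabularies.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def make_dataset(uri, title, description, access_url, license_uri=None):
    """Build a JSON-LD description of one dataset with one distribution."""
    distribution = {
        "@type": "dcat:Distribution",
        "dcat:accessURL": {"@id": access_url},
    }
    if license_uri:
        distribution["dct:license"] = {"@id": license_uri}
    return {
        "@context": CONTEXT,
        "@id": uri,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:distribution": [distribution],
    }


def missing_properties(dataset, required=("dct:title", "dct:description")):
    """Return the required properties that are absent or empty.

    The default 'required' tuple is an assumption made for this sketch.
    """
    return [prop for prop in required if not dataset.get(prop)]


# All identifiers below are hypothetical examples.
record = make_dataset(
    uri="https://example.org/dataset/air-quality-2020",
    title="Air quality measurements 2020",
    description="Hourly air quality measurements from municipal stations.",
    access_url="https://example.org/files/air-quality-2020.csv",
    license_uri="https://creativecommons.org/licenses/by/4.0/",
)
print(json.dumps(record, indent=2))
print(missing_properties(record))
```

A harvesting portal could run a check of this kind on each incoming record before indexing it, so that incomplete descriptions are reported back to the publishing portal rather than silently accepted.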

Extensions of the DCAT-AP to spatial (GeoDCAT-AP: https://joinup.ec.europa.eu/node/139283) and statistical information (StatDCAT-AP: https://joinup.ec.europa.eu/asset/stat_dcat_application_profile/home) have also been developed.

https://joinup.ec.europa.eu/asset/dcat_application_profile/description

https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-ap-v11

Core Vocabularies can be used and extended in the following contexts:

  • Development of new systems: the Core Vocabularies can be used as a default starting point for designing the conceptual and logical data models in newly developed information systems.
  • Information exchange between systems: the Core Vocabularies can become the basis of a context-specific data model used to exchange data among existing information systems.
  • Data integration: the Core Vocabularies can be used to integrate data that comes from disparate data sources and create a data mash-up.
  • Open data publishing: the Core Vocabularies can be used as the foundation of a common export format for data in base registries like cadastres, business registers and public service portals. The Core Public Service Vocabulary Application Profile provides harmonised ways and common data models to represent life events, business events and public services across borders and sectors to facilitate access.

ADMS is a standardised vocabulary that helps publishers of semantic assets to document what their assets are about (their name, status, theme, version, etc.) and where they can be found on the web. ADMS descriptions can then be published on different websites while the asset itself remains on the website of its publisher.

More info can be found in the following links:

https://joinup.ec.europa.eu/collection/semantic-interoperability-community-semic/core-vocabularies

https://ec.europa.eu/isa2/solutions/core-public-service-vocabulary-application-profile-cpsv-ap_en

https://joinup.ec.europa.eu/collection/semantic-interoperability-community-semic/adms

CEF

Under the framework of the Connecting Europe Facility programme, support for the interoperability of metadata and data at national and EU level is being developed through dedicated calls for proposals. The CEF group is also promoting training and webinars on using the “Context Broker”, in collaboration as appropriate with the NGSI-LD standards group ETSI ISG CIM.

AquaSmart

AquaSmart enables aquaculture companies to perform data mining at the local level and get actionable results.

The project contributes to standardisation of open data in aquaculture. Results are exploited through the Aquaknowhow business portal. www.aquaknowhow.com

AutoMat

The main objective of the AutoMat project is to establish a novel and open ecosystem in the form of a cross-border Vehicle Big Data Marketplace that leverages currently unused information gathered from a large number of vehicles from various brands.

This project has contributed to standardisation of brand-independent vehicle data. www.automat-project.eu

BodyPass

BodyPass aims to break down barriers between the health sector and the consumer goods sector and to eliminate the current data silos.

The main objective of BodyPass is to foster exchange, linking and re-use, as well as to integrate 3D data assets from the two sectors. For this, BodyPass adapts and creates tools that allow a secure exchange of information between data owners, companies and subjects (patients and customers).

The project aims at standardising 3D data. www.bodypass.eu

EU Commission

A smart open data project by DG ENV led directly to the establishment of the Spatial Data on the Web Working Group, a collaboration between W3C and the OGC.

G8 Open Data Charter

In 2013, the EU endorsed the G8 Open Data Charter and, with other G8 members, committed to implementing a number of open data activities in the G8 members’ collective action plan (publication of core and high-quality datasets held at EU level, publication of data on the EU open data portal and the sharing of experiences of open data work).

Future Internet Public Private Partnership programme

Specifications developed under the Future Internet public-private-partnership programme (FP7):

FIWARE NGSI extends the OMA Specworks NGSI API for context information management that provides a lightweight and simple means to gather, publish, query and subscribe to context information. FIWARE NGSI can be used for real-time open data management. ETSI’s ISG for cross-cutting Context Information Management (CIM) has developed the NGSI-LD API (GS CIM 004 and GS CIM 009) which builds upon the work done by OMA Specworks and FIWARE. The latest FIWARE software implements the newest ETSI NGSI-LD specification.
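As a rough illustration of the kind of context information exchanged through the NGSI-LD API, the sketch below assembles an NGSI-LD entity as a JSON-LD document: an identifier, a type, Property and Relationship attributes, and a reference to the ETSI core context. The entity type, attribute names and identifiers are hypothetical, and make_entity is a helper written for this example rather than part of any FIWARE or ETSI library.

```python
import json

# JSON-LD core context published by ETSI for NGSI-LD.
CORE_CONTEXT = "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"


def make_entity(entity_id, entity_type, properties, relationships=None):
    """Build an NGSI-LD entity from plain values and links to other entities."""
    entity = {"id": entity_id, "type": entity_type, "@context": [CORE_CONTEXT]}
    for name, value in properties.items():
        # A Property carries a value.
        entity[name] = {"type": "Property", "value": value}
    for name, target in (relationships or {}).items():
        # A Relationship points at another entity by its identifier.
        entity[name] = {"type": "Relationship", "object": target}
    return entity


# Hypothetical entity type, attribute names and identifiers.
entity = make_entity(
    "urn:ngsi-ld:AirQualityObserved:station-001",
    "AirQualityObserved",
    properties={"no2": 22, "temperature": 12.5},
    relationships={"observedBy": "urn:ngsi-ld:Device:sensor-17"},
)
print(json.dumps(entity, indent=2))
```

In a deployment, a document of this shape would typically be sent to a context broker with an HTTP POST to its /ngsi-ld/v1/entities resource using the application/ld+json content type; the broker address depends on the installation and is therefore left out here.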

FIWARE CKAN: Open Data publication Generic Enabler. FIWARE CKAN is an open source solution for the publication, management and consumption of open data, usually, but not only, through static datasets. FIWARE CKAN allows its users to catalogue, upload and manage open datasets and data sources. It supports searching, browsing, visualising and accessing open data.

Big Data Value cPPP TF6 SG6 on big data standardisation

In the Big Data Value contractual public-private partnership, a dedicated subgroup (SG6) of Task Force 6 (Technical) deals with big data standardisation.

11 http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=uriserv:OJ.C_.2014.240.01.0001.01.ENG

12 https://op.europa.eu/en/web/eu-vocabularies/dataset/-/resource?uri=http://publications.europa.eu/resource/dataset/eurovoc

13 https://joinup.ec.europa.eu/asset/stat_dcat_application_profile/home

14 http://www.bdva.eu/