Big data, open data and public sector information (RP2023)

(A.) Policy and legislation

(A.1) Policy objectives

With the continuously growing amount of data (including open data), interoperability is increasingly a key issue in exploiting the value of big data. Interoperability is essential so that cooperation, development, integration and the rendering of services take place in the best possible way. It also significantly facilitates the accomplishment of public policies, especially through collaboration between different applications to enable the development of new services. It therefore allows for the development of e-governance and the information society.

In its Data Strategy (COM(2020) 66 final), the Commission described the vision of a common European data space: a Single Market for data in which data could be used irrespective of its physical location of storage in the Union, in compliance with applicable law. To turn that vision into reality, the EU proposes to establish domain-specific common European data spaces as the concrete arrangements in which data sharing and data pooling can happen. As foreseen in that strategy, such common European data spaces can cover areas such as health, mobility, manufacturing, financial services, energy or agriculture, or thematic areas such as the European Green Deal or European data spaces for public administration or skills. The same document emphasises data interoperability and quality. More specifically, data interoperability and quality, as well as data structure, authenticity and integrity, are key for the exploitation of the data value, especially in the context of AI deployment. Further, under the Digital Europe Programme, a Data Spaces Support Centre will be set up to coordinate all relevant actions on sectoral data spaces and to make available technologies, processes, standards and tools that will allow reuse of data across sectors by the public sector and European businesses, notably SMEs.

In November 2020, the proposal for a Regulation on European data governance (Data Governance Act) was published. The proposal highlights the need to enhance the interoperability of data and of data sharing services between different sectors and domains, as an enabler of seamless and secure cross-border electronic communication, building on existing European, international or national standards. It proposes setting up a European Data Innovation Board, in the form of a Commission expert group, in order to successfully implement the data governance framework. This Board should support the Commission in coordinating national practices and policies on the topics covered by the Regulation, and in supporting cross-sector data use by adhering to the European Interoperability Framework (EIF) principles and through the use of standards and specifications (such as the Core Vocabularies and the CEF Building Blocks), without prejudice to standardisation work taking place in specific sectors or domains. The following are foreseen among the Board’s tasks: (a) to advise the Commission on the prioritisation of cross-sector standards to be used and developed for data use and cross-sector data sharing, cross-sectoral comparison and exchange of best practices with regard to sectoral requirements for security and access procedures, while taking into account sector-specific standardisation activities; (b) to assist the Commission in enhancing the interoperability of data as well as data sharing services between different sectors and domains, building on existing European, international or national standards.

Standardisation at different levels (such as metadata schemata, data representation formats and licensing conditions of open data) is essential to enable broad data integration, data exchange and interoperability with the overall goal of fostering innovation based on data. This refers to all types of (multilingual) data, including both structured and unstructured data, and data from different domains as diverse as geospatial data, statistical data, weather data, public sector information (PSI) and research data, to name just a few.

(A.2) EC perspective and progress report

Overall, the application of standard and shared formats and protocols for gathering and processing data from different sources in a coherent and interoperable manner across sectors and vertical markets should be encouraged, for example in R&D&I projects and in the data.europa.eu portal (https://data.europa.eu).

Studies conducted for the European Commission showed that businesses and citizens were facing difficulties in finding and re-using public sector information. The Communication on Open data states that "the availability of the information in a machine-readable format and a thin layer of commonly agreed metadata could facilitate data cross-reference and interoperability and therefore considerably enhance its value for reuse".

Open public sector data should be Findable, Accessible, Interoperable and Reusable (FAIR).  FAIR data principles (https://www.go-fair.org/fair-principles/) originating from the research community should be used as a guide to identify the standardisation needs. FAIR data principles are not standards themselves, but rather provide a set of criteria against which standards can be provided to make data Findable, Accessible, Interoperable and Reusable.

A common standard for the referencing of open data in the European open data portals would be useful. Candidates for a common standard in this area are the Application Profile for data portals in Europe (DCAT-AP) and the FIWARE (https://www.fiware.org/) open stack-based specification and open standard APIs. The FIWARE solution has now been integrated into the Connecting Europe Facility “Context Broker” building block (https://ec.europa.eu/cefdigital/wiki/display/CEFDIGITAL/Context+Broker). The CEF has meanwhile agreed to upgrade the “Context Broker” to use the ETSI NGSI-LD specification (ETSI GS CIM 009 V1.3.1 of the NGSI-LD API), and the FIWARE Foundation is also evolving its API to the same ETSI standard for the exchange of open data. Further effort is now needed to demonstrate good examples of proper usage of NGSI-LD. This has been promoted within the EC Large Scale Pilot project SynchroniCity; however, more dissemination and training are required (as recognised by CEF efforts to promote training webinars).

The DCAT Application Profile is being developed as a common project of the former ISA2 programme, currently Interoperable Europe (https://joinup.ec.europa.eu/collection/interoperable-europe/interoperable-europe), the Publications Office (PO) and CNECT to describe public-sector data catalogues and datasets and to promote the specification for use by data portals across Europe. Agreeing on a common application profile and promoting it among the Member States is substantially improving the interoperability among data catalogues and the data exchange between Member States. DCAT-AP is one of the standards which could make public sector data FAIR. DCAT-AP (or, if needed, a domain-specific extension of it) can also be used in forthcoming data spaces such as the European Health Data Space, to ensure data interoperability and quality. DCAT-AP is the specification used by the data.europa.eu portal, which is part of the Connecting Europe Facility infrastructure, as well as by a growing number of Member States' open data portals. The DCAT-AP related work, including its extensions to geospatial data (GeoDCAT-AP) and statistical data (StatDCAT-AP), also highlights the need for further work on the core standard. These are topics for the W3C smart descriptions & smarter vocabularies (SDSVoc) under the VRE4EIC Project https://www.w3.org/2016/11/sdsvoc/.

Core Vocabularies (i.e., Core Person, Core Organization, Core Location, Core Public Event, Core Criterion and Core Evidence), the Core Public Service Application Profile and the Asset Description Metadata Schema (for describing reusable solutions), implemented by the former ISA2 programme, currently Interoperable Europe, address the problem of data exchange and interoperability by using uniform data representation formats. They were used in the TOOP (The Once-Only Principle) project and are currently used in the Once Only Principle (OOP) Technical System under the scope of the Single Digital Gateway Regulation EU 2018/1724.
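To make the idea of a common application profile for dataset descriptions more concrete, the following is a minimal Python sketch of a catalogue record using terms from the W3C DCAT vocabulary and Dublin Core, serialised in a JSON-LD style. It is an illustration only, not an authoritative DCAT-AP record: the dataset, the URLs under example.org and the short list of "required" properties are assumptions made for this example, and the DCAT-AP specification remains the reference for which properties are mandatory.

```python
import json

# Namespace prefixes of the W3C DCAT vocabulary and Dublin Core terms.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}

# Assumption: a simplified subset of properties to check for. The DCAT-AP
# specification defines the authoritative mandatory and recommended lists.
REQUIRED_DATASET_PROPERTIES = ("dct:title", "dct:description")


def make_dataset(title, description, access_url, licence_url):
    """Return a JSON-LD style dict for one dataset with one distribution."""
    return {
        "@context": CONTEXT,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:accessURL": {"@id": access_url},
                "dct:license": {"@id": licence_url},
            }
        ],
    }


def missing_properties(dataset, required=REQUIRED_DATASET_PROPERTIES):
    """List the required properties that are absent or empty."""
    return [name for name in required if not dataset.get(name)]


# Hypothetical dataset; all values are placeholders.
dataset = make_dataset(
    title="Air quality measurements (hypothetical)",
    description="Hourly readings from a fictional monitoring network.",
    access_url="https://example.org/data/air-quality.csv",
    licence_url="https://creativecommons.org/licenses/by/4.0/",
)

print(json.dumps(dataset, indent=2))
print("Missing properties:", missing_properties(dataset))
```

Because every portal would describe its datasets with the same property names, a harvester could run a check such as `missing_properties` on records from any catalogue without portal-specific code.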

The concept of the Once-Only Principle (OOP) focuses on reducing administrative burden for individuals and businesses by re-organising public sector internal processes, instead of making citizens and business users adjust to existing procedures. In view of its contribution to the realisation of the Digital Single Market in Europe, the European Commission is strongly promoting the implementation of the OOP across borders. Therefore, once-only is one of the underlying principles stated in the European Union’s “eGovernment Action Plan 2016-2020” and is part of several initiatives related to the European Digital Single Market. This includes the following three pilot projects: SCOOP4C, The Once-Only Principle Project (TOOP), and Digital Europe for all (DE4A).

Furthermore, the Single Digital Gateway Regulation EU 2018/1724 includes a technical system for exchange of evidences based on OOP concepts and input from the TOOP project. This system will be supported by a new CEF Once Only Principle Building Block.

The mapping of existing relevant standards for a number of big data areas would be beneficial. Moreover, it might be useful to identify European clusters of industries with sufficiently similar activities to develop data standards. Especially for open data, the topics of data provenance and licensing (for example the potential of machine-readable licenses) need to be addressed, as encouraged in the current and proposed revision of the PSI Directive (see section B.1).

The new Open Data Directive encourages the use of standard licenses which must be available in digital format and be processed electronically (Article 8(2)). Furthermore, the Directive encourages the use of open licenses available online, which should eventually become common practice across the EU (Recital 44). In addition, to help Member States transpose the revised provisions, the Commission adopted guidelines (http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=uriserv:OJ.C_.2014.240.01.0001.01.ENG) which recommend the use of such standard open licenses for the reuse of PSI.

Currently, Interoperable Europe vocabularies and solutions are used for the implementation of the Single Digital Gateway Regulation (SDGR) in the design of common data models for the evidences that are going to be exchanged between Member States. The SDGR highlights the need to ensure functional, technical and semantic interoperability in the exchange of evidences, which can only be assured by using standard and shared data representation formats.

On 25 April 2018, the Commission adopted the 'data package' — a set of measures to improve the availability and re-usability of data, in particular publicly held or publicly funded data, including government data and publicly funded research results, and to foster data sharing in business-to-business (B2B) and business-to-government (B2G) settings. The availability of data is essential so that companies can leverage on the potential of data-driven innovation or develop solutions using artificial intelligence.

Key elements of the package are:

1.  The Open Data Directive (EU) 2019/1024 of the European Parliament and of the Council of 20 June 2019 on open data and the re-use of public sector information was adopted and published on 26 June 2019. It must be transposed into national legislation by 17 July 2021. It aims at:

  • enhancing access to and re-use of real-time data notably with the help of Application Programming Interfaces (APIs);
  • lowering charges for the re-use of public sector information by limiting exceptions to the default upper limit of marginal cost of dissemination and by specifying certain high-value data sets which should be made available for free (via implementing acts);
  • allowing for the re-use of new types of data, including data held by public undertakings in the transport and utilities sector and data resulting from publicly funded research;
  • minimising the risk of excessive first-mover advantage in regard to certain data, which could benefit large companies and thereby limit the number of potential re-users of the data in question;
  • defining through an implementing act a list of "high value datasets" belonging to six thematic categories (geospatial, earth observation and environment, meteorological, statistics, companies and company ownership, mobility) to be made available mandatorily free of charge, in machine-readable format and through APIs.

Article 14 of the open data Directive empowers the Commission to adopt implementing acts laying down a list of specific high value datasets belonging to the six thematic categories set out in the Annex and held by public sector bodies and public undertakings. In order to make the reuse of these datasets more efficient, the Directive provides that they shall be available free of charge, machine-readable, provided via APIs and, where relevant, as a bulk download. The implementing acts may also specify the arrangements for the publication and re-use of high value datasets, which shall be compatible with open standard licences. They may include terms applicable to re-use, formats of data and metadata and technical arrangements for dissemination. Work on the definition of high value datasets, including an impact assessment study, stakeholder consultations and a public consultation and hearing, took place in 2020. An implementing regulation on high value datasets will be adopted in the second half of 2021.

2. Review of the 2012 Recommendation on access to and preservation of scientific information, focusing on:

  • evaluating the uptake of the 2012 Recommendation as well as its effectiveness in creating a level playing field for Member States, researchers and academic institutions;
  • updating and reinforcing the overall policy with the development of guidelines on opening up research data and the creation of incentive schemes for researchers sharing data;
  • ensuring coherence with the European Open Science Cloud.

3.  Development of guidance on private sector data sharing

The Commission has proposed guidance to companies that wish to make data available to other companies or to public authorities, which lays down principles of fair data sharing practices and includes guidance on legal, business and technical aspects of B2B and B2G data sharing.

Following an open selection process, in November 2018 the Commission appointed 23 experts to an Expert Group on Business-to-Government Data Sharing. The conclusions and recommendations of the Expert Group were published in a report released at the end of 2019; the report findings are used as input for possible future Commission initiatives on B2G data sharing (https://digital-strategy.ec.europa.eu/en/news/experts-say-privately-held-data-available-european-union-should-be-used-better-and-more).

The report underlines several times the importance of standardisation for facilitating data sharing. “The expert group recommends that the Digital Europe Programme invests in the development of common standards for data, metadata, representation and standardised transfer protocols. Building on existing EU programmes, initiatives and working groups, such as the CEF program and Interoperable Europe and the "Multi-stakeholder platform for ICT standardisation", the expert group recommends prioritising those standards that are most generally used over creating new ones. The chosen standards should then be further developed, possibly in cooperation and with the support of a European standardisation body. Agreeing on a (set of) common standard(s) and promoting this among the Member States will substantially improve the interoperability among data catalogues and the data exchange between Member States and private companies and civil-society organisations.”.

On 19 February 2020, the Commission adopted the Communication on “A European strategy for data” — a set of measures aiming at making the EU a leader in a data-driven society. Creating a single market for data will allow it to flow freely within the EU and across sectors for the benefit of businesses, researchers and public administrations.

A first priority for operationalising its vision is to put in place an enabling legislative framework for the governance of common European data spaces. Such governance structures should support decisions on what data can be used in which situations, facilitate cross-border data use, and prioritise interoperability requirements and standards within and across sectors, while taking into account the need for sectoral authorities to specify sectoral requirements. The framework will reinforce the necessary structures in the Member States and at EU level to facilitate the use of data for innovative business ideas, both at sector- or domain-specific level and from a cross-sector perspective.

On 25 November 2020 the Commission adopted the proposal for a Regulation on European data governance (COM(2020) 767 final). It aims to foster the availability of data for use by increasing trust in data intermediaries and by strengthening data-sharing mechanisms across the EU. The Regulation will facilitate data sharing across the EU and between sectors to create wealth for society, increase control and trust of both citizens and companies regarding their data, and offer an alternative European model to the data handling practices of major tech platforms.

(A.3) References 

(B.) Requested actions

The Communication on ICT Standardisation Priorities for the Digital Single Market proposes priority actions in the domain of Big Data. Actions mentioned herein below reflect some of them.

Action 1: SDOs to undertake the DCAT-AP standardisation process, considering:

Action 2: Promote standardisation in/via the open data infrastructure, especially the European Data Portal being deployed in 2015-2020 as part of the digital service infrastructure under the Connecting Europe Facility programme.

Action 3: Support of standardisation activities at different levels: H2020 R&D&I activities; support for internationalisation of standardisation, in particular for the DCAT-AP specifications developed in the former ISA2 programme, currently Interoperable Europe (see also action 2 under eGovernment chapter), and for specifications developed under the Future Internet public-private-partnership, such as FIWARE NGSI-LD and FIWARE CKAN. Standardisation can also be enhanced by SDOs creating guidelines for using existing standards and for using Core Vocabularies, as well as Core Public Service Application Profile implemented by the former ISA2 programme, currently Interoperable Europe; new activities launched by the first implementations of the Digital Europe Programme and the legal framework progressively put in place following the Commission Communication on “A European strategy for data”.

Action 4: Bring the European data community together, including through the H2020 Big Data Value public-private partnership (http://www.bdva.eu/), to identify missing standards and design options for a big data reference architecture, taking into account existing international approaches, in particular the work in ISO/IEC JTC 1 SC 42. In general attention should be given to the four pillars of (semantic) discovery, privacy-by-design, accountability for data usage (licensing), and exchange of data together with its metadata, through the use of Asset Description Metadata Schema (for describing reusable solutions) implemented by the former ISA2 programme, currently Interoperable Europe.

Action 5: CEN to coordinate with W3C on standardising the DCAT-AP and the other vocabularies provided by the former ISA2 programme, currently Interoperable Europe, on preventing incompatible changes, and on the conditions for availability of the standard(s).

Action 6: The European Commission together with EU funded pilots and projects that develop technical specifications for the provision of cross-border services (e.g., from  , CEF/DEP pilots), which need to be referenced in public procurement, to liaise with SDOs to consider how to address their possible standardisation.

Action 7: The European Commission to initiate broad exchanges with SDOs and all stakeholders, in particular with those related to the European Data Spaces, on the role of standards and open source in improving data interoperability within and across sectors and in the data economy, including in particular consideration of the twin transitions. This should inter alia aim to identify possible functional or standardisation gaps and to promote further coordination amongst different SDOs.

Action 8: SDOs to look into possible standardisation needs arising from the proposed European Data Governance Act.

Action 9: SDOs to look into possible standardisation needs arising from the proposed EU draft Data Act.

(C.) Activities and additional information 

(C.1) Related standardisation activities

CEN & CENELEC

CEN/TC 468 ‘Preservation of digital information’ works on the functional and technical aspects of the preservation of digital information. In this field, the committee will develop a structured set of standards, specifications and reports, addressing business requirements, including compliance with the European legislative and regulatory framework (e.g. GDPR, eIDAS).

ETSI

ETSI TC SmartM2M is developing a set of reference ontologies, mapped onto the oneM2M Base Ontology. This work has commenced with the SAREF ontology, for Smart Appliances, but is being extended to add semantic models for data associated with smart cities, industry and manufacturing, smart agriculture and the food chain, water, automotive, eHealth/aging well and wearables (https://saref.etsi.org/).

ETSI ISG CIM (cross-cutting Context Information Management) has developed the NGSI-LD API (GS CIM 009 v1.6.1) which builds upon the work done by OMA Specworks and FIWARE. NGSI-LD is an open framework for exchange of contextual information for smart services, aligned with best practice in linked open data. It is now capable of attesting provenance of information as well as supporting fine-grained encryption (GS CIM 019). Ongoing activities involve increased interoperability with oneM2M data sources. Applications and use cases are extended to Digital Twins, eHealth, analytics for government services, federated data spaces, GDPR-compatible data sharing.
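As a rough illustration of what "contextual information aligned with linked open data" looks like in practice, the sketch below builds an NGSI-LD style entity as a Python dict and serialises it to JSON. It follows the general NGSI-LD information model of entities with an `id`, a `type` and attributes typed as Property or Relationship. The entity type, attribute names and identifiers are hypothetical, and the `@context` URL is assumed to be the core context published by ETSI; the current version of GS CIM 009 should be consulted for the normative details.

```python
import json

# Assumption: the core @context URL published by ETSI. Check the current
# version of GS CIM 009 for the exact URL that applies.
CORE_CONTEXT = "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"


def make_property(value, observed_at=None):
    """Build an attribute of type Property, optionally timestamped."""
    prop = {"type": "Property", "value": value}
    if observed_at is not None:
        prop["observedAt"] = observed_at
    return prop


def make_relationship(target_id):
    """Build an attribute of type Relationship pointing at another entity."""
    return {"type": "Relationship", "object": target_id}


def make_entity(entity_id, entity_type, attributes):
    """Combine identifier, type, context and attributes into one entity."""
    entity = {"id": entity_id, "type": entity_type, "@context": [CORE_CONTEXT]}
    entity.update(attributes)
    return entity


# Hypothetical air-quality observation linked to a hypothetical station.
entity = make_entity(
    "urn:ngsi-ld:AirQualityObserved:station-001",
    "AirQualityObserved",
    {
        "no2": make_property(22.5, observed_at="2021-04-01T10:00:00Z"),
        "refStation": make_relationship("urn:ngsi-ld:Station:001"),
    },
)

print(json.dumps(entity, indent=2))
```

The Relationship attribute is what gives the data its linked character: a consumer can follow `refStation` to another entity instead of relying on an embedded, application-specific copy of the station data.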

ETSI’s ISG MEC is developing a set of standardized Application Programming Interfaces (APIs) for Multi-Access Edge Computing (MEC). MEC technology offers IT service and Cloud computing capabilities at the edge of the network. Shifting processing power away from remote data centres and closer to the end user, it enables an environment that is characterized by proximity and ultra-low latency, and provides exposure to real-time network and context information.

ETSI’s TC ATTM committee has specified a set of KPIs for energy management for data centres (ETSI ES 205 200-2-1). These have been combined into a single global KPI for data centres, called DCEM, by ETSI’s ISG on Operational energy Efficiency for Users (OEU), in ETSI GS OEU 001.

SC USER has produced a set of documents related to the “User-Centric approach in the digital ecosystem”.
Note: this body of work also applies to several other sections of the ICT rolling plan, such as IoT, eHealth, cybersecurity, e-privacy and accessibility, but is documented only once.

ETSI TR 103 438  User Group; User centric approach in Digital Ecosystem
ETSI EG 203 602  User Group; User Centric Approach: Guidance for users; Best practices to interact in the Digital Ecosystem
ETSI TR 103 603  User Group; User Centric Approach; Guidance for providers and standardisation makers
ETSI TR 103 604  User Group; User centric approach; Qualification of the interaction with the digital ecosystem
ETSI TR 103 437  Quality of ICT services; New QoS approach in a digital ecosystem

SC USER has initiated an action to finalise the project by defining and implementing a proof of concept of a “Smart interface for digital ecosystem”: a user interface that meets the needs and expectations of the user at the user's request, and that is an "intelligent", "highly contextualised", personalised, agile and proactive interface with integrated QoS. This project will be based on the Smart Identity concept.

ISO

ISO/TC 46/SC 4 Technical interoperability

  • ISO 15836 Information and documentation — The Dublin Core metadata element set
ISO/IEC JTC1

In 2018 JTC 1/SC 42 Artificial Intelligence was formed, and contains a WG 2 which is responsible for the Big Data work program.

SC 42 has published the following big data standards:

ISO/IEC 20546:2019 Information technology -- Big Data -- Overview and Vocabulary (https://www.iso.org/standard/68305.html?browse=tc)

ISO/IEC TR 20547-2:2018 Information technology -- Big data reference architecture -- Part 2: Use cases and derived requirements (https://www.iso.org/standard/71276.html?browse=tc)

ISO/IEC TR 20547-5:2018 Information technology -- Big data reference architecture -- Part 5: Standards roadmap (https://www.iso.org/standard/72826.html?browse=tc)

ISO/IEC 20547-1: Information technology -- Big Data reference architecture -- Part 1: Framework and application process

ISO/IEC 20547-3: Information technology -- Big Data reference architecture -- Part 3: Reference architecture

SC 42 is progressing the following current big data projects, which are expected to complete in the next year:

ISO/IEC 24688: Information technology -- Artificial Intelligence -- Process management framework for Big data analytics

See for further information: https://www.iso.org/committee/6794475.html 

Built on its foundation standard that is ISO/IEC 38500 (Information technology - Governance of IT for the Organization), JTC 1/SC 40 IT service management and IT governance has developed or is developing the following standards on Governance of Data:

38505-1: Information technology - Governance of IT - Part 1: Application of ISO/IEC 38500 to the governance of data

38505-2: Information technology - Governance of IT - Part 2: Implications of ISO/IEC 38505-1 for Data Management

38505-3: Information technology - Governance of Data - Part 3: Guidelines for Data Classification

See for further information https://www.iso.org/committee/5013818.html

ISO/IEC JTC1 SC32 on "Data management and interchange" works on standards for data management within and among local and distributed information systems environments. SC 32 provides enabling technologies to promote harmonization of data management facilities across sector-specific areas. https://www.iso.org/committee/45342.html

ITU-T

ITU-T SG13 Recommendation ITU-T Y.3600 “Big data - Cloud computing based requirements and capabilities” covers use cases of cloud computing based big data to collect, store, analyse, visualize and manage varieties of large volume datasets:
https://www.itu.int/rec/T-REC-Y.3600/en

SG13 also published Y.3600-series Supplement 40 “Big Data Standardisation Roadmap”, which will be revised in 2022:
https://www.itu.int/rec/T-REC-Y.Sup40/en

SG13 has 10 ongoing work items on big data; in particular, it is working on big data functional requirements for data integration (Y.bdi-reqts).

Recently approved ITU-T Recommendations on big data include Y.3605 (09/2020), on the big data reference architecture, and Y.3653 (04/2021), on the functional architecture of big data-driven networking.

See a flipbook “Big Data - Concept and application for telecommunications":
https://www.itu.int/en/publications/Documents/tsb/2019-Big-data/mobile/index.html.

The work programme of SG13 is available at: http://itu.int/itu-t/workprog/wp_search.aspx?sg=13

More info: https://www.itu.int/en/ITU-T/studygroups/2017-2020/13

ITU-T SG20 “Internet of things (IoT) and smart cities & communities (SC&C)” is studying big data aspects of IoT and SC&C. ITU-T Study Group 20 developed Recommendation ITU-T Y.4114 “Specific requirements and capabilities of the IoT for big data” which complements the developments on common requirements of the IoT described in Recommendation ITU-T Y.4100/Y.2066 and the functional framework and capabilities of the IoT described in Recommendation ITU-T Y.4401/ Y.2068 in terms of the specific requirements and capabilities that the IoT is expected to support in order to address the challenges related to big data. This Recommendation also constitutes a basis for further standardization work such as functional entities, application programming interfaces (APIs) and protocols concerning big data in the IoT.

ITU-T SG20 also published:

  • Recommendation ITU-T Y.4461 “Framework of open data in smart cities”, which clarifies the concept, analyses the benefits, identifies the key phases, roles and activities, and describes the framework and general requirements of open data in smart cities;
  • Recommendation ITU-T Y.4473 “SensorThings API - Sensing”, which specifies the SensorThings application programming interface (API), an open standard-based and geospatial-enabled framework to interconnect Internet of things (IoT) devices, data and applications over the Web;
  • Recommendation ITU-T Y.4472 “Open data application programming interface (APIs) for IoT data in smart cities and communities”, which presents a complete set of open APIs dedicated to smart cities, offering different features that cover the needs of interoperable smart city framework development;
  • Supplement ITU-T Y.Suppl.61 “Features of application programming interface (APIs) for IoT data in smart cities and communities”, which studies the concept and potential of developing secure, open and interoperable APIs in the context of IoT deployment and open data management in smart cities.

The work programme of SG20 is available at: https://www.itu.int/ITU-T/workprog/wp_search.aspx?sg=20

More info: https://itu.int/go/tsg20

The ITU-T Focus Group on Data Processing and Management (FG-DPM) to support IoT and Smart Cities & Communities was set up in 2017. The Focus Group provided a platform to share views, to develop a series of deliverables, and to showcase initiatives, projects and standards activities linked to data processing and management and to the establishment of IoT ecosystem solutions for data-focused cities. The Focus Group concluded its work in July 2019 with the development of 10 Technical Specifications and 5 Technical Reports. The complete list of deliverables is available at https://itu.int/en/ITU-T/focusgroups/dpm

ITU-T SG17 has approved six standards on big data and open data security:

ITU-T X.1147 “Security requirements and framework for big data analytics in mobile internet services”

ITU-T X.1376 “Security-related misbehaviour detection mechanism based on big data analysis for connected vehicles”

ITU-T X.1603 “Data security requirements for the monitoring service of cloud computing”

ITU-T X.1750 “Guidelines on security of big data as a service for Big Data Service Providers”

ITU-T X.1751 “Security guidelines on big data lifecycle management for telecom operators”

ITU-T X.1752 “Security guidelines for big data infrastructure and platform” (under approval as of Sept 2021).

More info: https://www.itu.int/en/ITU-T/studygroups/2017-2020/17

The ITU-T Focus Group on Artificial Intelligence for Health (FG-AI4H), established as a partnership between ITU and WHO, is working towards a standardized assessment framework for the evaluation of AI-based methods for health, diagnosis, triage or treatment decisions.

https://www.itu.int/en/ITU-T/focusgroups/ai4h/

IEEE

IEEE has a series of standards projects related to Big Data (mobile health, energy efficient processing, personal agency and privacy) as well as pre-standardisation activities on Big Data and open data. Some relevant standards activities include:

IEEE 1752 Series of standards on mobile health data,
IEEE 3652.1-2020, IEEE Guide for Architectural Framework and Application of Federated Machine Learning,
IEEE P3800, Data Trading System: Overview, Terminology, and Reference Model,
IEEE P7002, IEEE Draft Standard for Data Privacy Process,
IEEE P7004, IEEE Draft Standard for Child and Student Data Governance,
IEEE P7005, IEEE Draft Standard for Transparent Employer Data Governance,
IEEE P7015, Standard for Data and Artificial Intelligence (AI) Literacy, Skills, and Readiness.

There are also pre-standards programs. For more information, see: https://ieeesa.io/eu-rolling-plan

OASIS

The OASIS Open Data Protocol (OData) TC works to simplify the querying and sharing of data across disparate applications and multiple stakeholders for re-use in the enterprise, cloud, and mobile devices. A REST-based protocol, OData builds on HTTP and JSON, using URIs to address and access data feed resources. OASIS OData standards have been approved as ISO/IEC 20802-1:2016 and ISO/IEC 20802-2:2016.
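As a rough illustration of how OData addresses data through URIs, the Python sketch below composes a request URL from a service root, an entity set and the $filter, $select and $top system query options. The service root https://example.org/odata, the entity set Datasets and the property names are hypothetical placeholders, and build_odata_url is a helper written for this example, not part of any OData library.

```python
from urllib.parse import quote


def build_odata_url(service_root, entity_set, filter_expr=None, select=None, top=None):
    """Compose an OData request URL from a service root and system query options."""
    options = []
    if filter_expr:
        # Percent-encode spaces but keep quotes and parentheses readable.
        options.append("$filter=" + quote(filter_expr, safe="'()"))
    if select:
        options.append("$select=" + ",".join(select))
    if top is not None:
        options.append("$top=" + str(int(top)))
    url = service_root.rstrip("/") + "/" + entity_set
    return url + ("?" + "&".join(options) if options else "")


url = build_odata_url(
    "https://example.org/odata",  # hypothetical service root
    "Datasets",                   # hypothetical entity set
    filter_expr="Theme eq 'transport'",
    select=["Title", "Modified"],
    top=10,
)
print(url)
```

A client would typically issue an HTTP GET on the resulting URL and read the JSON response; the request itself is left out here because no real endpoint is assumed.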

The OASIS ebCore TC maintains the ebXML RegRep standard that defines the service interfaces, protocols and information model for an integrated registry and repository. The repository stores digital content while the registry stores metadata that describes the content in the repository. RegRep was used in the EU TOOP project, which was concluded in 2021.

RegRep can be used in conjunction with ebXML Messaging including AS4 using a recently developed binding for the Registry Services of the OASIS ebXML RegRep Version 4.0 OASIS Standard. This binding is compatible with the AS4 profile of ebXML Messaging as used, for example, in the European Commission’s eDelivery Building Block, and complements the existing protocol bindings specified in OASIS RegRep Version 4.0. This AS4 binding is also of relevance to the Once-Only Technical System for the Single Digital Gateway (see section 3.2.4, eGovernment).

OGC

The Open Geospatial Consortium (OGC) defines and maintains standards for location-based, spatio-temporal data and services. The work includes, for instance, schemas for describing spatio-temporal sensor, image, simulation, and statistics data (such as "datacubes"), and a modular suite of standards for Web services allowing ingestion, extraction, fusion, and (with the web coverage processing service (WCPS) component standard) analytics of massive spatio-temporal data such as satellite and climate archives. OGC also contributes to the INSPIRE project.

http://www.opengeospatial.org

oneM2M

The oneM2M Partnership Project has specified the oneM2M Base Ontology (oneM2M TS-0012, ETSI TS 118 112) to enable syntactic and semantic interoperability for IoT data. The oneM2M standard defines a middleware layer residing between a lower layer, comprising IoT devices and communications technologies, and an upper layer of IoT applications. It thus enables a wide range of interactions between applications and the underlying technologies needed to source data from connected devices and sensors, as well as the sharing of data from many sensors that are managed by different device owners and service providers. All oneM2M specifications are publicly accessible at Specifications (onem2m.org).

W3C

DCAT vocabulary (developed in the W3C Government Linked Data Working Group)

http://www.w3.org/TR/vocab-dcat/

After a successful Workshop on Smart Descriptions & Smarter Vocabularies (SDSVoc) (www.w3.org/2016/11/sdsvoc/), W3C created the Dataset Exchange Working Group (https://www.w3.org/2017/dxwg) to revise DCAT, provide a test suite for content negotiation by application profile, and develop additional relevant vocabularies in response to community demand.

Work on licences in ODRL continues and has reached a very mature state: https://www.w3.org/TR/odrl-model/ and https://www.w3.org/TR/vocab-odrl/

The Data on the Web Best Practices WG has finished its work successfully (https://www.w3.org/TR/dwbp), also issuing data quality and dataset usage vocabularies (https://www.w3.org/TR/vocab-dqv; https://www.w3.org/TR/vocab-duv).

(C.2) Other activities related to standardisation
ISA and ISA2 programme of the European Commission

The DCAT application profile (DCAT-AP) has been defined. DCAT-AP is a specification based on DCAT (an RDF vocabulary designed to facilitate interoperability between data catalogues published on the web) to enable interoperability between data portals, for example to allow metasearches in the European Data Portal, which harvests data from national open data portals.
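To make the idea of a catalogue record more concrete, the following Python sketch builds a minimal dataset description as JSON-LD using a handful of DCAT and Dublin Core terms (dcat:Dataset, dcat:Distribution, dcat:accessURL, dct:title, dct:description). The identifier, title and URLs are hypothetical, dcat_dataset is a helper written for this example, and the output is not validated against DCAT-AP, which defines further mandatory and recommended properties.

```python
import json


def dcat_dataset(identifier, title, description, access_url):
    """Return a JSON-LD description of one dataset with one distribution."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@id": identifier,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                # accessURL points at a resource, so it is given as an IRI.
                "dcat:accessURL": {"@id": access_url},
            }
        ],
    }


record = dcat_dataset(
    "https://data.example.org/dataset/bus-stops",  # hypothetical identifier
    "Bus stops",
    "Locations of public bus stops.",
    "https://data.example.org/files/bus-stops.csv",  # hypothetical file
)
print(json.dumps(record, indent=2))
```

A harvesting portal could collect records of this shape from several national catalogues and index them together, which is the kind of metasearch scenario described above.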

Extensions of the DCAT-AP to spatial information (GeoDCAT-AP: https://joinup.ec.europa.eu/collection/semantic-interoperability-community-semic/news/new-version-geodcat-ap-has-been-officially-released) and statistical information (StatDCAT-AP: https://joinup.ec.europa.eu/asset/stat_dcat_application_profile/home) have also been developed.

https://joinup.ec.europa.eu/asset/dcat_application_profile/description

https://joinup.ec.europa.eu/collection/semantic-interoperability-community-semic/solution/dcat-application-profile-data-portals-europe/release/211

Core Vocabularies can be used and extended in the following contexts:

  • Development of new systems: the Core Vocabularies can be used as a default starting point for designing the conceptual and logical data models in newly developed information systems.
  • Information exchange between systems: the Core Vocabularies can become the basis of a context-specific data model used to exchange data among existing information systems.
  • Data integration: the Core Vocabularies can be used to integrate data that comes from disparate data sources and create a data mash-up.
  • Open data publishing: the Core Vocabularies can be used as the foundation of a common export format for data in base registries like cadastres, business registers and public service portals.

The Core Public Service Vocabulary Application Profile provides harmonised ways and common data models to represent life events, business events and public services across borders and across sectors to facilitate access.

ADMS is a standardised vocabulary which aims at helping publishers of semantic assets to document what their assets are about (their name, status, theme, version, etc.) and where they can be found on the Web. ADMS descriptions can then be published on different websites while the asset itself remains on the website of its publisher.

More info can be found in the following links:

      https://joinup.ec.europa.eu/collection/semantic-interoperability-community-semic/core-vocabularies

      https://ec.europa.eu/isa2/solutions/core-public-service-vocabulary-application-profile-cpsv-ap_en

      https://joinup.ec.europa.eu/collection/semantic-interoperability-community-semic/adms

CEF

Under the framework of the Connecting Europe Facility programme, support for the interoperability of metadata and data at national and EU level is being developed through dedicated calls for proposals. The CEF group is also promoting training and webinars on using the “context broker”, in collaboration as appropriate with the NGSI-LD standards group ETSI ISG CIM.

AquaSmart

AquaSmart enables aquaculture companies to perform data mining at the local level and get actionable results.

The project contributes to standardisation of open data in aquaculture. Results are exploited through the Aquaknowhow business portal.

www.aquaknowhow.com

AutoMat

The main objective of the AutoMat project is to establish a novel and open ecosystem in the form of a cross-border Vehicle Big Data Marketplace that leverages currently unused information gathered from a large number of vehicles from various brands.

This project has contributed to standardisation of brand-independent vehicle data.

www.automat-project.eu

BodyPass

BodyPass aims to break down barriers between the health sector and the consumer goods sector and to eliminate the current data silos.

The main objective of BodyPass is to foster exchange, linking and re-use, as well as to integrate 3D data assets from the two sectors. For this, BodyPass adapts and creates tools that allow a secure exchange of information between data owners, companies and subjects (patients and customers).

The project aims at standardizing 3D data.

www.bodypass.eu

European Commission

A smart open data project by DG ENV led directly to the establishment of the Spatial Data on the Web Working Group, a collaboration between W3C and the OGC.

G8 Open Data Charter

In 2013, the EU endorsed the G8 Open Data Charter and, with other G8 members, committed to implementing a number of open data activities in the G8 members’ collective action plan (publication of core and high-quality datasets held at EU level, publication of data on the EU open data portal and the sharing of experiences of open data work).

Future Internet Public Private Partnership programme

Specifications developed under the Future Internet public-private-partnership programme (FP7):

FIWARE NGSI extends the OMA Specworks NGSI API for context information management that provides a lightweight and simple means to gather, publish, query and subscribe to context information. FIWARE NGSI can be used for real-time open data management. ETSI’s ISG for cross-cutting Context Information Management (CIM) has developed the NGSI-LD API (GS CIM 004 and GS CIM 009) which builds upon the work done by OMA Specworks and FIWARE. The latest FIWARE software implements the newest ETSI NGSI-LD specification.
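As an illustration of the kind of context information exchanged through the NGSI-LD API, the Python sketch below assembles an entity payload in which each attribute is either a Property (carrying a value) or a Relationship (pointing at another entity). The entity identifiers, the entity type and the attribute names are hypothetical, and make_entity is a helper written for this example. Only the ETSI core context is referenced, so in a real deployment an additional domain-specific @context would normally be supplied to define the attribute names.

```python
import json

# Core JSON-LD context published by ETSI for NGSI-LD.
NGSI_LD_CORE_CONTEXT = "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"


def make_entity(entity_id, entity_type, properties, relationships=None):
    """Assemble an NGSI-LD entity from plain Python values."""
    entity = {
        "id": entity_id,
        "type": entity_type,
        "@context": [NGSI_LD_CORE_CONTEXT],
    }
    for name, value in properties.items():
        entity[name] = {"type": "Property", "value": value}
    for name, target in (relationships or {}).items():
        entity[name] = {"type": "Relationship", "object": target}
    return entity


entity = make_entity(
    "urn:ngsi-ld:AirQualityObserved:station-001",  # hypothetical entity id
    "AirQualityObserved",                          # hypothetical entity type
    properties={"no2": 22, "temperature": 18.5},
    relationships={"refDevice": "urn:ngsi-ld:Device:sensor-42"},
)
print(json.dumps(entity, indent=2))
```

A payload of this shape would typically be sent to a context broker with an HTTP POST to its /ngsi-ld/v1/entities resource using the application/ld+json content type; the request is omitted here because no broker address is assumed.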

FIWARE CKAN: Open Data publication Generic Enabler. FIWARE CKAN is an open source solution for the publication, management and consumption of open data, usually, but not only, through static datasets. FIWARE CKAN allows its users to catalogue, upload and manage open datasets and data sources. It supports searching, browsing, visualising and accessing open data.
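The searching capability can be sketched in Python as follows, assuming the deployment exposes the standard CKAN Action API (package_search under /api/3/action/). The portal address https://data.example.org is hypothetical, the sample response is hand-written to mirror the usual CKAN response envelope and is not real data, and both helper functions are written for this example.

```python
from urllib.parse import urlencode


def package_search_url(portal_root, query, rows=10):
    """Build the URL for a CKAN Action API package_search call."""
    params = urlencode({"q": query, "rows": rows})
    return portal_root.rstrip("/") + "/api/3/action/package_search?" + params


def dataset_titles(response):
    """Extract dataset titles (falling back to names) from a search response."""
    if not response.get("success"):
        raise ValueError("CKAN reported a failed request")
    return [pkg.get("title", pkg.get("name")) for pkg in response["result"]["results"]]


search_url = package_search_url("https://data.example.org", "air quality", rows=5)

# Hand-written stand-in for the JSON a portal might return.
sample_response = {
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "air-quality-2022", "title": "Air quality measurements 2022"},
            {"name": "air-quality-stations"},
        ],
    },
}

print(search_url)
print(dataset_titles(sample_response))
```

Keeping URL construction separate from response handling lets the parsing logic be exercised without network access.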

Big Data Value cPPP TF6 SG6 on big data standardisation

In the big data value contractual public-private partnership, a dedicated subgroup (SG6) of Task Force 6 (Technical) deals with big data standardisation.

FAIR Principles and the GO FAIR Initiative

FAIR stands for Findability, Accessibility, Interoperability, and Reuse of digital assets. The principles emphasise machine-actionability (i.e., the capacity of computational systems to find, access, interoperate, and reuse data with no or minimal human intervention), in response to the increase in the volume, complexity, and creation speed of data. GO FAIR is a community that has been working towards implementations of the FAIR Guiding Principles.