e-Infrastructures for data and computing intensive science and the European Open Science Cloud (RP2023)

(A.) Policy and legislation

(A.1) Policy objectives

Research data and computing infrastructures fostering a paradigm shift in science (Open Science)

Data-driven science today pervades all research fields, blurring geographical and disciplinary boundaries. The technological and digital progress of the last few decades produced more effective scientific instruments, which generated a rapid increase in research data volumes and availability across a wide range of scientific disciplines. The European Commission funded several projects in FP6, FP7 and Horizon 2020 to establish and consolidate a European e-infrastructures environment and to build the European Open Science Cloud, a federated and open multi-disciplinary environment where research data can be found and re-used, with tools and services for research, innovation and educational purposes. Underpinning the efforts of the research communities, e-infrastructures have fostered innovation and scientific progress across disciplines and between the private and public sectors. A large number of data e-Infrastructures, combining the capabilities of scientific communities and technology providers, have been launched in domains such as astronomy, earth and ocean observation, climate, environment and biodiversity. Moreover, pan-European e-Infrastructure initiatives were launched across disciplinary domains, providing a participatory network of open access repositories at European scale. These initiatives fill the gap between the user-application layer and the generic e-Infrastructure layers for high-volume storage, data interoperability, high-performance computing and connectivity. This framework of e-science services enabled the progress of Open Science practices to improve the quality, efficiency and responsiveness of research.

Despite the greater possibilities of sharing and accessing research data and the Commission policies on open research data, promotion of “openness” is not sufficient to realise the full potential of communication and re-use of research data. The vast amounts of data generated by research are still dispersed across thousands of venues. A March 2016 article in Nature proposed guiding principles for scientific data management and stewardship by introducing the FAIR acronym, which stands for Findable, Accessible, Interoperable and Re-usable. Soon after the publication, the FAIR Principles became one of the cornerstones of the EU’s Open Science policy and have been rapidly adopted by publishers, funders, and other stakeholders from across the research community.

Building on the existing EU-funded e-Infrastructures, and to enable the development and uptake of Open Science in Europe, the EC proposed and is promoting the creation of a European Open Science Cloud (EOSC), as presented in the Communication "European Cloud Initiative". EOSC essentially involves the federation of existing research data infrastructures and the realisation of a Web of FAIR Data and Related Services for Science, making research data interoperable and machine-actionable. It fosters the definition, implementation and further development of advanced solutions for the effective provisioning and use of high-quality scientific data, with metadata descriptors, ease of access, interoperability and reusability, fully implementing the FAIR principles. The application of standards and recommendations is therefore of utmost importance in order to allow for interoperability, avoid fragmentation and improve the efficiency and effectiveness of research.

The European Commission, with the European data strategy, aims to make the EU a leader in a data-driven society. Among other actions, the Strategy intends to foster the rollout of common European data spaces in crucial sectors such as industrial manufacturing, green deal, mobility or health: EOSC has been recognised as the European digital space for research. The work that has been conducted within EOSC to enable interoperability across research domains and data discovery to support multi-disciplinary reuse is critical to supporting collaboration with the European data spaces. Research infrastructures within the ESFRI roadmap already play a key role in EOSC. Engaging further with the research communities will be key to developing an EOSC for and by the researchers. Strong links with research domains will naturally foster opportunities for collaboration with the data spaces.

To complement the access to the wealth of European research data, the European Commission is also ensuring, through the new Regulation for the European High Performance Computing Joint Undertaking and the Coordinated Plan on Artificial Intelligence (AI), the capacity to process large volumes of information with services closer to European researchers and innovators.

(A.2) EC perspective and progress report

Research/science funders have a common problem when tackling the area of research data infrastructure. The landscape is geographically fragmented and different disciplines have different practices. It is difficult to build critical mass and provide common services to different scientific disciplines and to take advantage of economies of scale. Some scientific communities are pushing the envelope and adopting new technologies while others are lagging behind. Scientists are, at the end of the day, the generators and users of research data in their experiments, simulations, visualization of complex data arrays, etc. There is a need to bring together capabilities from different scientific fields and also the competences of technology and service providers to use the potential of ICT.

Interoperable data infrastructures will allow researchers and practitioners from different disciplines to access and process the data they need in a timely manner. The implementation of the FAIR principles as standard practice for research data will enable collaborations across different domains of science.

Today, EU-funded e-Infrastructures and EOSC resources play a fundamental role in the life of European researchers.

In the initial phase of development of EOSC from 2016 to 2020, the EC made a financial investment of approximately €350 million to begin building the foundations of EOSC through project calls in Work Programmes in Horizon 2020. This investment was targeted to develop a new pan-European access mechanism to public e-infrastructures, to coordinate related national activities, to connect European research infrastructures (RIs) to EOSC, to set up and begin the implementation of the FAIR guiding principles, and to start a FAIR-compliant certification scheme for research data infrastructures. These projects have involved the community of stakeholders of EOSC and have been steadily developing the broader EOSC ecosystem.

The initial development phase under Horizon 2020 supported more than 35 projects, laying the foundations of EOSC and showcasing its diversity and complexity. The EOSCpilot project engaged extensively with stakeholders and proposed a governance framework and policies, as well as developing interoperability pilots across scientific domains. EOSC-hub brought together service providers to create a single contact point to discover, access and use a wide range of resources for data-driven research. Five ongoing science cluster projects will connect the European Strategy Forum on Research Infrastructures (ESFRI) projects and landmarks to EOSC in the domains of environmental sciences, life sciences, astronomy and particle physics, photon and neutron sciences, and social sciences and humanities. Five regional projects aim to coordinate the efforts of national and thematic initiatives in contributing to EOSC through groupings of European countries.

Under the first implementation phase of EOSC, projects have been actively working on recommendations for the adoption of practices, models and standards. Relevant examples are:

  • the work of the OpenAIRE and EuroCRIS initiatives to expand the CERIF model to also include research outputs. CERIF was initially conceived to document and exchange research information (funding programmes and projects, researchers and research institutions, etc.) and has since been adopted by many Member States and institutions
  • in the context of FAIR data, the project FAIRsFAIR is coordinating an analysis of the European Framework for audit and certification of digital repositories, which comprises three certification instruments with increasing degrees of complexity and depth
  • several EOSC-related projects (notably the 5 ESFRI Cluster projects) are contributing strongly to the promotion of practices and standards in disciplinary metadata for the description and re-use of research data
  • the RDA FAIR Data Maturity Model Working Group has developed a common set of core assessment criteria for FAIRness and a generic and expandable self-assessment model for measuring the maturity level of a dataset.
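
As an illustration of what a generic and expandable self-assessment model for a dataset could look like, the minimal sketch below scores a dataset against a handful of indicators grouped by FAIR area. The indicator identifiers, wording and priority labels, as well as the 0 to 4 maturity scale, are hypothetical placeholders chosen for this example; they are not the criteria published by the RDA Working Group.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Indicator:
    ident: str     # hypothetical identifier, not an official RDA code
    area: str      # FAIR area: "F", "A", "I" or "R"
    priority: str  # assumed labels: "essential", "important", "useful"
    text: str


# Hypothetical indicators for illustration only; the list can be extended freely.
INDICATORS: List[Indicator] = [
    Indicator("F-01", "F", "essential", "Dataset has a persistent identifier"),
    Indicator("F-02", "F", "important", "Metadata is indexed in a searchable catalogue"),
    Indicator("A-01", "A", "essential", "Data is retrievable via a standard protocol"),
    Indicator("I-01", "I", "important", "Metadata uses a community vocabulary"),
    Indicator("R-01", "R", "essential", "A usage licence is stated"),
]

MAX_LEVEL = 4  # assumed scale: 0 = not considered ... 4 = fully implemented


def assess(scores: Dict[str, int],
           indicators: List[Indicator] = INDICATORS) -> Dict[str, float]:
    """Return the mean maturity per FAIR area, normalised to the range 0..1.

    Indicators that are missing from `scores` count as level 0.
    """
    by_area = defaultdict(list)
    for ind in indicators:
        level = scores.get(ind.ident, 0)
        if not 0 <= level <= MAX_LEVEL:
            raise ValueError(f"{ind.ident}: level {level} outside 0..{MAX_LEVEL}")
        by_area[ind.area].append(level / MAX_LEVEL)
    return {area: sum(vals) / len(vals) for area, vals in by_area.items()}


def unmet_essentials(scores: Dict[str, int],
                     indicators: List[Indicator] = INDICATORS) -> List[str]:
    """List the essential indicators that are not yet fully implemented."""
    return [ind.ident for ind in indicators
            if ind.priority == "essential"
            and scores.get(ind.ident, 0) < MAX_LEVEL]


if __name__ == "__main__":
    example = {"F-01": 4, "F-02": 2, "A-01": 4, "I-01": 1, "R-01": 3}
    print(assess(example))
    print(unmet_essentials(example))
```

Keeping the indicators in a plain list is what makes the sketch expandable: a community could append its own disciplinary indicators without touching the scoring functions.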

Moreover, the EOSC Executive Board (an EC expert group called to prepare the ground for the second phase of the EOSC implementation) produced a set of documents and recommendations with relevance to establishing governance, principles, architecture and interoperability in the EOSC. Among these documents, the EOSC Interoperability Framework Report is the foundation of the work that the project EOSC Future is carrying out to establish a concrete EOSC Interoperability Framework (EOSC IF).

(A.3) References
  • COM(2012) 401 final: Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions — Towards better access to scientific information: Boosting the benefits of public investments in research.
  • C(2012) 4890 final: Commission Recommendation on access to and preservation of scientific information.

(B.) Requested actions

Action 1: Supporting standardisation within Horizon Europe INFRAEOSC Destination. Attention to standardisation will be included in the Destination INFRAEOSC, part of the Horizon Europe Work Programme for Research Infrastructures (Pillar 1 Excellent Science). Notably for topics related to building the federating core of EOSC, creating FAIR-enabling services and supporting the implementation of the FAIR principles, the European Commission will strengthen the objective of contributing to the adoption of practices and standards that are applicable to the EOSC and that could have a larger impact on other data initiatives such as the common European data spaces.

Action 2: Recognising RDA as a fundamental contributor to standards on data. The Research Data Alliance (RDA) is not primarily a standardisation body but is a mechanism to speed up the adoption of standards for research data and computing infrastructures. The RDA, in its form as a Multi-Stakeholder Platform, develops recommendations that have the potential to become ICT specifications. There is also an ongoing effort to promote industrial participation within the RDA processes.

Action 3: SDOs to work closely with EOSC and e-infrastructures service providers and RDA. Practices adopted by research digital infrastructures respond to needs that will most likely be valid for wider user communities and operators, and will determine new standards for technologies that are emerging through scientific use and will soon be widespread. Therefore, identifying standards needs and developing them in the area of research data (notably in the context of the European Open Science Cloud) will accelerate the uptake of data-intensive technologies.

(C.) Activities and additional information 

(C.1) Related standardisation activities

Research Data Alliance (RDA)

Supports the Commission’s strategy to achieve global scientific data interoperability in a way that real actors (users and producers of data, service providers, network and computing infrastructures, researchers and their organisations) are in the driving seat. It has memorandums of understanding (MoUs) with related standardisation activities/organisations: IETF, W3C, ICSU/CODATA. Synergies with other organisations/activities will need to be identified in the future.

OAI

The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content. The Open Archives Initiative has its roots in an effort to enhance access to e-print archives as a means of increasing the availability of scholarly communication.

ITU-T

Regarding the global e-Infrastructure, the ITU is using the digital object architecture (DOA), on which the recommendation ITU-T X.1255 “Framework for discovery of identity management information” is based.

SG11 is addressing the growing problem of counterfeited telecommunication/ICT products and devices, which is adversely affecting all stakeholders in the ICT field (vendors, governments, operators and consumers). Within this activity, SG11 developed a number of Recommendations which describe approaches to combating the circulation of counterfeit equipment. The Recommendation ITU-T Q.5050 “Framework for solution to combat counterfeit ICT Devices”, which is the first one in the Q.5050-Q.5069 series “Combating counterfeiting and stolen ICT devices”, describes a reference framework with high-level challenges and requirements that should be considered when deploying solutions to combat the circulation and use of counterfeit ICT devices. SG11 developed four Recommendations and six technical reports/supplements on this subject matter. There are six ongoing work items which will define use cases, guidelines and the interfaces for data exchange between CEIR and EIR.
All details are available on the dedicated web page at: https://itu.int/go/CS-ICT.

SG11 continues developing standards related to combating stolen ICT equipment. SG11 approved Recommendation ITU-T “Framework for Combating the use of Stolen Mobile ICT Devices”.

SG11 set up a new Question 17/11 “Combating counterfeit or tampered telecommunication/ICT software”, which initiated a first technical report dedicated to use cases on combating multimedia content misappropriation.

More info: https://itu.int/go/tsg11.

ITU-T SG20 approved Recommendation ITU-T Y.4808 “Digital entity architecture framework to combat counterfeiting in IoT”, which provides solutions to deter the spread of counterfeit IoT devices worldwide. ITU-T SG20 also approved Recommendation ITU-T Y.4459 “Digital entity architecture framework for IoT interoperability”, which introduces the digital entity architecture and its prospects for addressing interoperability and security among IoT applications.

More info: https://itu.int/ITU-T/go/sg20

SG13 approved new standards on trust for ICT infrastructures and services:

  • Recommendation ITU-T Y.3051 “The basic principles of trusted environment in ICT infrastructure” provides the definition, common requirements and the basic principles of creating trusted environment.
  • Recommendation ITU-T Y.3052 “Overview of trust provisioning for information and communication technology infrastructures and services” describes the key characteristics of trust. In addition, the trust relationship model and trust evaluation based on the conceptual model of trust provisioning are introduced.
  • Recommendation ITU-T Y.3053 “Framework of trustworthy networking with trust-centric network domains”
  • Recommendation ITU-T Y.3056 “Framework for bootstrapping of devices and applications for open access to trusted services in distributed ecosystems”
  • Recommendation ITU-T Y.2501 “Computing power network - Framework and architecture”
  • Recommendation ITU-T Y.2623 “Requirements and framework of industrial Internet networking based on future packet based network evolution”

SG13 continues working on the attributes that can represent trustworthiness and that can be applied to ICT infrastructures and services. Several work items are ongoing, covering an architecture for trust-enabled service provisioning and a trust index to evaluate and quantify trustworthiness for ICT infrastructures and services, among others. From the perspective of standardization, trust should be quantitatively and/or qualitatively calculated and measured, so that it can be used to evaluate the values of physical components, value chains among multiple stakeholders, and human behaviors including decision making.

FG NET2030 technical report “Network 2030 - Additional representative use cases and key network requirements for Network 2030” deals with the key network requirements for huge scientific data applications (astronomical telescopes) and accelerators (Large Hadron Collider).

More info: http://itu.int/ITU-T/go/sg13

(C.2) Other activities related to standardisation

Related topics in H2020 WP on e-Infrastructures and EOSC (proposals selected within these calls may contribute to standardisation):

EINFRA-1-2014: Managing, preserving and computing with big research data
EINFRA-3-2014: Towards global data e-Infrastructures — research data alliance
EINFRA-8-2014: Research and education networking — GÉANT
INFRASUPP-7-2014: e-Infrastructure policy development and international cooperation
EINFRA-22-2016: User driven e-infrastructure innovation
EINFRA-21-2017: Platform-driven e-infrastructure innovation
EINFRA-12-2017: Data and Distributed Computing e-Infrastructure for Open Science
INFRASUPP-02-2017: Policy and International cooperation measures for research infrastructures (RDA)
INFRAEOSC-05-2018-2019 - Support to the EOSC Governance
INFRAEOSC-04-2018 - Connecting ESFRI infrastructures through Cluster projects
INFRAEOSC-02-2019 - Prototyping new innovative services
INFRAEOSC-01-2018 - Access to commercial services through the EOSC hub
INFRAEOSC-03-2020 - Integration and consolidation of the existing pan-European access mechanism to public research infrastructures and commercial services through the EOSC Portal
INFRAEOSC-07-2020 - Increasing the service offer of the EOSC Portal 

Related topics in Horizon Europe on EOSC (proposals selected within these calls may contribute to standardisation):

HORIZON-INFRA-2021-EOSC-01-03 - Deploying EOSC-Core components for FAIR
HORIZON-INFRA-2021-EOSC-01-04 - Innovative and customizable services for EOSC
HORIZON-INFRA-2021-EOSC-01-05 - Enabling discovery and interoperability of federated research objects across scientific communities
HORIZON-INFRA-2021-EOSC-01-06 - FAIR and open data sharing in support of cancer research
HORIZON-INFRA-2022-EOSC-01-04 - Support for initiatives helping to generate global standards, specifications and recommendations for open sharing of FAIR research data, publications and software
HORIZON-INFRA-2022-EOSC-01-02 - Improving and coordinating technical infrastructure for institutional open access publishing across Europe
HORIZON-INFRA-2022-EOSC-01-03 - FAIR and open data sharing in support of healthy oceans, seas, coastal and inland waters

(C.3) Additional information

Interoperability across data and services will allow researchers to gather new insights and open up new territories for scientific discovery by combining data to reveal correlations that are impossible to explore today. Cross-disciplinary Open Science can be seen as the ultimate goal of the EOSC. Interoperability is also a key element in allowing EOSC to interact with thematic European Data Spaces and other data lakes. This requires the use of formal standards, protocols and APIs that make it possible to combine datasets from different disciplines and to compose a pipeline of services for processing and analysing data.

Moreover, institutions and research groups should not be locked into certain tools. Use of open standards and APIs will allow data to be transferred from one tool to another.
As an example, the emerging EOSC interoperability framework will specify a series of profiles to help connect multiple approaches to authentication and authorisation infrastructure (AAI). The aim is to have interoperable AAIs rather than a one-size-fits-all solution, since different research infrastructures and services use different profiles and users may have preferences over which account and sign-in method is used.

However, due to the fast technological development, interoperability is the most complex challenge within EOSC and any federated digital platform and it has to be considered as a continuous activity.

RDA will provide good support in turning the proposed framework for action for data infrastructures into practice. The global framework of RDA and the role of EOSC among the international science cloud initiatives will help to consolidate Europe's role as a global partner and a global leader in research data infrastructures.