EPIKH: Exchange Programme to advance e-Infrastructure Know-How (EPIKH)

Published on: 12/04/2013
Document

The “knowledge triangle” refers to the interaction between research & development, education and innovation, which are key drivers of a knowledge-based society. In the European Union, it also refers to an attempt to link these key concepts more closely, with research and innovation already highlighted by the development of the Lisbon Strategy. The EPIKH project aims to “connect” research & development and innovation with education, through the adoption and use of Grid training infrastructures and e-Infrastructures, in order to increase the number of users and scientific applications of these platforms.

More specifically, the strategic aims of the EPIKH project are to:

  • Reinforce the impact of e-Infrastructures in scientific research by defining and delivering a stimulating programme of educational events, including Grid Schools and High Performance Computing courses;
  • Broaden the engagement in e-Science activities and collaborations both geographically and across disciplines.

These goals translate into the following specific actions:

  • Spreading knowledge about the “Grid Paradigm” to all potential users, both system administrators and application developers, through an extensive training programme;
  • Easing the access of the trained people to the e-Infrastructures existing in the areas of action of the project;
  • Fostering the establishment of scientific collaborations among the countries/continents involved in the project.

Policy Context

Europe is heavily investing to create a continental e-Infrastructure based on Grid technology in order to turn the vision of a European Research Area (ERA) into reality. Nowadays, e-Science and e-Infrastructures are considered key enablers of the progress and sustainable development of a country and are concrete means to address the problems of the “digital divide” and the “brain drain” which are endemic in large parts of the world.

In the context of its 6th and 7th Framework Programmes, the EC has already co-funded several projects to stimulate and foster e-Science and Grids well outside its borders, in several parts of the world such as Asia, Latin America and the Mediterranean. However, the adoption of the “Grid paradigm” and the effective usage of e-Infrastructures require widespread, fine-grained knowledge dissemination and training to help scientists make use of distributed computing capabilities in their scientific applications.

Description of target users and groups

The EPIKH consortium unites 23 institutions from 18 countries across four continents: Africa, Asia, Europe and Latin America. EPIKH will mobilise about 115 people for a total of more than 650 researcher-months, not counting those who will be trained by and benefit from the project. These large figures confirm the strong interest of these four continents in setting up a programme to improve the dissemination of know-how about Grid and e-Infrastructures.

Description of the way to implement the initiative

The exchange programme has been implemented in alternating phases:    

  • first, a select team of young researchers visited EPIKH’s EU partners for around one month to be trained as trainers of grid technology, including site administration and application support (‘gridification’);
  • second, EPIKH organised and ran at least two educational events per year and per continent (Africa, Asia, and Latin America), training new users to access and apply a pilot e-Infrastructure on which applications can be deployed, developed, and then used as exemplar use cases in future events.

The EPIKH project improved the educative mission by developing an intensive and diversified training programme in which grids are not the ‘goal’ but rather the ‘means’ by which to develop e-Science applications, gather scientific communities from four continents, and access globally distributed production quality e-Infrastructures.

Technology solution

EPIKH has promoted worldwide the gLite/EMI middleware developed by flagship EU co-funded projects such as the EGEE series, EGI-InSPIRE and EMI. To lower the technical barriers that non-IT experts face in accessing and using e-Infrastructures, EPIKH has defined and implemented the architecture of a Science Gateway based on widely adopted standards, with the aim of making e-Infrastructure access transparent and ubiquitous.

The primary requirements that drove the design and implementation of the Science Gateway framework were:

  • Use of standards;
  • Simplicity;
  • Ease of use;
  • Re-usability.

Since the very beginning, the idea was not to build a “vertical” solution but rather to create a framework made of small pieces of software that, like LEGO® bricks, can be customized and re-arranged in many ways to serve a large variety of applications and end-users. The success of LEGO® bricks lies in the fact that the “basic element” is simple and standard and can be easily connected to other basic elements to create huge and very complicated constructions. For the development of the basic elements of the Science Gateway, the JSR 286 standard (also known as “portlet 2.0”) was adopted. In this case, “our” LEGO® bricks are standard portlets that can be easily arranged to create different, even complex, portals. The award-winning Liferay portal framework was chosen as the portlet container; it offers a rich, easy-to-use "web 2.0" interface using AJAX and other presentation layer technologies. Liferay is currently the most used framework to build Science Gateways in the Grid world.

Users belonging to different organizations may have different roles in the community the Science Gateway is developed for and different privileges on the applications and related data available in the gateway. They access the Liferay-based portal and, according to their role and privileges, they are allowed to run some applications embedded in the Science Gateway and exposed through its user interface.

One of the strengths of this Science Gateway is the decoupling of the authentication (AuthN) phase from the authorisation (AuthZ) one. In order to access the Science Gateway, a user must be both authenticated and authorized but we treat the two steps separately and with different technologies.

User authentication relies on Identity Providers (IdPs) that are members of one or more Identity Federations. We only support federations based on the SAML 2.0 standard specifications and on their implementations in Shibboleth and SimpleSAMLphp. We currently support several official Identity Federations, and some of our Science Gateways are already registered as Service Providers of the eduGAIN inter-federation service within the GÉANT project. We also support all the Identity Providers of the Grid IDentity Pool (GrIDP), a “catch-all” Identity Federation that we created expressly to gather all the IdPs that do not already belong to any official federation and all the users of the Science Gateway who are not (yet) registered with any IdP. This is particularly important and useful in contexts where it is necessary to authenticate so-called “citizen scientists” and let them access the e-Infrastructure for dissemination and self-learning purposes. Inside the GrIDP Federation, we have also created a special IdP, the “Social Networks’ Bridge Identity Provider”, that allows people to authenticate with the credentials they already have with the best-known and most widely used social networks.

Unlike authentication, user authorisation is carried out at the level of the Science Gateway: users whose request to register is approved by the managers of the portal are stored in an LDAP-based registry together with the roles they have and the privileges they are granted.
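
The separation of the two steps described above can be illustrated with a minimal sketch. It is not the gateway's actual code (which is built on SAML 2.0 federations and an LDAP registry); it only shows the idea that the identity check and the privilege check are independent and that both must succeed. All names here (Assertion, Registry, TRUSTED_IDPS, can_run, the example hosts) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class Assertion:
    """What a (hypothetical) federated IdP returns after a successful login."""
    user_id: str  # identifier released by the IdP
    idp: str      # the Identity Provider that vouched for the user


@dataclass
class Registry:
    """In-memory stand-in for the gateway's LDAP-based registry."""
    # user_id -> names of the applications the user may run
    privileges: Dict[str, Set[str]] = field(default_factory=dict)

    def approve(self, user_id: str, apps: Set[str]) -> None:
        # Models the portal managers accepting a registration request.
        self.privileges.setdefault(user_id, set()).update(apps)

    def is_authorised(self, user_id: str, app: str) -> bool:
        return app in self.privileges.get(user_id, set())


# Assumed set of IdPs belonging to a supported federation.
TRUSTED_IDPS = {"idp.example.org"}


def authenticate(assertion: Assertion) -> bool:
    """AuthN: accept identities only from IdPs of a trusted federation."""
    return assertion.idp in TRUSTED_IDPS


def can_run(assertion: Assertion, app: str, registry: Registry) -> bool:
    """A user must pass both steps; each step is checked on its own."""
    return authenticate(assertion) and registry.is_authorised(assertion.user_id, app)
```

In this sketch, a user known to a trusted IdP but never approved by the portal managers is authenticated yet not authorised, and an approved user arriving from an untrusted IdP is rejected before the registry matters.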

In order to execute applications from within the Science Gateway in a middleware-independent way, the Simple API for Grid Applications (SAGA) standard specifications, defined by the Open Grid Forum, and its JSAGA implementation, have been adopted.

A software layer called the Grid Engine has been developed. It contains a “job engine” and a “data engine” which, in turn, call the JSAGA API for job and data management. The Science Gateway interface also contains the functions to interact with the User Tracking DB.
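
The layering described above can be sketched as follows. This is a simplified illustration in Python, not the real Grid Engine and not the JSAGA API (both are Java); it only shows how a job engine can stay middleware-independent by delegating to interchangeable adaptors behind one interface. The class names, the choice of the URL scheme to select an adaptor, and the fields stored in the tracking list are assumptions made for this example.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class JobAdaptor(ABC):
    """Hypothetical middleware plug-in; one subclass per Grid middleware."""

    @abstractmethod
    def submit(self, executable: str, arguments: List[str]) -> str:
        """Submit a job and return a middleware-specific job identifier."""


class FakeAdaptor(JobAdaptor):
    """In-memory stand-in used here in place of a real middleware."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.count = 0

    def submit(self, executable: str, arguments: List[str]) -> str:
        self.count += 1
        return f"{self.prefix}-{self.count}"


class JobEngine:
    """Dispatches jobs through one interface and records who ran what."""

    def __init__(self, adaptors: Dict[str, JobAdaptor]) -> None:
        self.adaptors = adaptors
        # Stand-in for a user tracking database: (user, resource, job id).
        self.tracking: List[Tuple[str, str, str]] = []

    def run(self, user: str, resource_url: str,
            executable: str, arguments: List[str]) -> str:
        # Assumption of this sketch: the URL scheme selects the adaptor.
        scheme = resource_url.split("://", 1)[0]
        job_id = self.adaptors[scheme].submit(executable, arguments)
        self.tracking.append((user, resource_url, job_id))
        return job_id
```

With this shape, supporting another middleware means registering one more adaptor; the code that calls JobEngine.run does not change.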

Technology choice: Standards-based technology, Mainly (or only) open standards, Open source software

Main results, benefits and impacts

The relevance of the EPIKH joint research programme can be summarised as follows:

  • Groups active in strategic scientific domains were identified early on and put in contact with colleagues in Europe and other parts of the world, widening, at a global scale, the diffusion of scientific information, training and best practices;
  • Grid technology was used as a powerful “tool” to impart/improve education on e-Science;
  • The t-Infrastructure built during the schools acted as a seed for building Grid infrastructures in regions targeted by EPIKH where e-Infrastructure sites were not present at the beginning of the project;
  • At the end of the schools, many applications were ready to run on large e-Infrastructures and more users became aware of the benefits of this technology for the progress of science and society. In order to implement its work plan and reach its objectives, EPIKH mobilised more than 100 people for a total of several hundred researcher-months, not counting the people reached by the project. These large figures testify to the strong interest of the four continents involved in the project in setting up an exchange programme to improve the dissemination of know-how about Grid and e-Infrastructures.

During its four years, EPIKH has been presented at more than 20 “external” events, i.e. events organised by other organisations to which EPIKH representatives were invited to talk about the project and its outcomes. The full list of these events, including the slides presented, is available online.

EPIKH has also established several Memoranda of Understanding with other renowned EU projects and has acted as a training infrastructure for them, playing a key role in the landscape of grid training and education in the world. As a consequence, EPIKH was mentioned as “an example to follow for monitoring and encouraging local scientific applications to fight the digital divide” in the final recommendations made by the high-level stakeholders of the 6th Conference on “Sharing Knowledge across the Mediterranean”, which was held on 6-8 May 2011 in Malta.

EPIKH has also been mentioned as a key training infrastructure and service in the chapter “E-Infrastructures for International Cooperation” of the book “Computational and Data Grids: Principles, Applications, and Design” published by IGI Global in September 2011.

Return on investment

Return on investment: Not applicable / Not available

Track record of sharing

During its lifetime, EPIKH organised and ran more than 40 events, including Grid schools and workshops, and hundreds of secondments. All programmes of the schools and workshops are available on the EPIKH agenda server.

In total, EPIKH has delivered a large amount of training, more than 7 participants x years, and the overall feedback from students has been 5.1 on a scale from 1 to 6, which is one of the highest scores in the Grid world considering the extent of the project in time, geographic coverage and variety of audiences.

As of the end of 2012, more than 700 researchers from 219 organisations of 47 countries in the world are registered in the Science Gateways implemented using the framework developed and promoted by EPIKH.

Lessons learnt

Grid infrastructures are being built in several areas of the world but, despite the huge investments made by the European Commission and by other funding agencies, at both national and international level, the total number of users is of the order of O(10^4), far fewer than the O(10^7) users of the international research and education networks (e.g., GÉANT in Europe). The reasons for this have been investigated through studies promoted by the European Commission itself. They mostly lie in the complexity, for non-IT-expert users, of Grid security based on a Public Key Infrastructure; in the limited adoption of standards that would make different middleware interoperable; and in the lack of general frameworks to easily build customizable high-level user interfaces.

In the recent past, interesting developments have been carried out independently by the Grid community, with Science Gateways, and by the National Research and Education Networks, with Identity Federations. The former ease the access to and use of Grid infrastructures; the latter increase the number of users authorised to access network-based services.

A Science Gateway is a “community-developed set of tools, applications, and data that is integrated via a portal or a suite of applications, usually in a graphical user interface, that is further customized to meet the needs of a specific community” (US TeraGrid project).

An Identity Federation is made of “[…] the agreements, standards, and technologies that make identity and entitlements portable across autonomous domains” (Burton Group). Identity Federations aim to set up and support a common framework for different organisations to manage access to on-line resources. They are already established in many countries and currently gather a number of people of the order of O(10^7).

EPIKH has greatly contributed to the spread of e-Infrastructures in the world thanks to the development and promotion of a standards-based Science Gateway framework. The three most important lessons learnt are:

  • Develop/promote training tools and programs that can address the largest number of beneficiaries;
  • Adopt standards whenever and wherever possible since this represents an investment towards long term sustainability;
  • Make things simple and develop training materials to show that they are actually even simpler!

Scope: Cross-border, International, Pan-European