Open-DAI: Opening Data Architectures and Infrastructures (Open-DAI)

Published on: 04/04/2014
Document

Open-DAI is ‘Opening Data Architectures and Infrastructures’ for European Public Administrations (PAs). It is a project funded under the ICT Policy Support Programme as part of the Competitiveness and Innovation framework Programme (CIP) Call 2011, running from 2012 through to September 2014.

In a nutshell, the project aims to make public data available to digital public services on cloud computing infrastructures.

Eleven partners from four European countries — Italy, Spain, Sweden and Turkey — work together to enable PAs:

  • to provide open data to a wide audience of potential users through an open data hub. This lets us correlate data and potentially implement new digital public services.
  • to evolve information systems towards an open model with a service-oriented architecture (SOA) in order to overcome the current closed and monolithic structures.
  • to host services in a scalable cloud infrastructure.

Innovation, business opportunities and digital services are the three goals driving the Open-DAI endeavour.

The concept is that Open Data may be used as a starting point to create new applications and services for PAs, companies and citizens, or to provide a channel for feeding back information to PAs.

The main benefits pursued by Open-DAI follow from the considerations below.

The Open-DAI concept represents a new model both for implementing new PA services and for deploying them in a cloud environment.

On the one hand, data are accessed in legacy environments without interfering with existing legacy applications. This avoids further spending on consolidated solutions and needless duplication of data, while allowing on-the-fly transformation to provide new applications based on data services and to enrich the PAs' SOA approach.

On the other hand, all the work is done in an environment where users are not left to build their own data centre or software solution. Instead, they find a managed and integrated cloud environment, with state-of-the-art open source technologies used to implement new business solutions.

The new services will also be oriented towards use on mobile devices and smartphones (‘apps’), and will cover a wide range of sectors: from transport and info mobility to environmental quality, from localisation services to information for tourism and a new form of communication between citizens and the PAs.

It will be rolled out in different forms in the four countries involved in the project, depending on each country's specific geopolitical context.

The platform may eventually be used by private partners, which will be able to contribute to enhancing public services, offering services to the public and therefore opening up new market outlets.

VIDEO: http://www.youtube.com/watch?v=L44nS3gLcOA

FB: facebook.com/cipopendai

Twitter: @Open_DAI

LinkedIn: linkedin.com/groups/Open-DAI-5097195

Google Group forum: open-dai@googlegroups.com

Policy Context

The Open-DAI project faces the following main challenges:

  • the opening of the huge amount of data stored in PA databases to the wide audience of potential users, through an open data hub in order to correlate data and to implement new digital public services;
  • the evolution toward an open architecture model for the PA information systems, in order to overcome the monolithic and closed architecture models (silos);
  • the facilitation of software maintenance of existing silos, enabling PAs to pace the evolution of legacy systems with open data initiatives.

At present it is difficult to use (or re-use) the extensive wealth of information stored in PA databases, even though it is widely recognised that such information can help address a wide range of social needs and promote innovation and business opportunities. The current information systems of PAs are based on monolithic architecture models that include all the application software levels (silos).

This approach does not ensure any flexibility when new needs for data and information arise – in particular when the existing applications might be converted into basic services to mash up information, implementing other new applications.

Open-DAI's infrastructural and technological choices are consistently based on an open source approach.

What is more, it also builds on other EU-funded open source projects, thus developing cooperation and interaction even at a technological level.

The specific position of the Open-DAI partners should be considered a potential barrier to entry for others. On the one hand, the EU funding and related projects have increased the expertise of the project partners in the open data services domain. On the other hand, project participants such as CSI-Piemonte (Italy) and Sampas (Turkey) have long-term relationships with PAs, which gives them, in their respective countries, a competitive advantage over any generic software house.

In fact, CSI-Piemonte and Sampas are sellers of both services/consultancy and cloud infrastructure.

This aspect is even more relevant considering that Open-DAI is released as open source software and, as a consequence, advantages must be based on complementary aspects.

As far as the value proposition is concerned, Open-DAI provides a platform for data publication, targeted mainly at PAs' needs in this respect.

At the same time, it enables other activities: downstream, anyone can build services and apps by using the data services made available by PAs through Open-DAI. Moreover, by orchestrating information from different (internal) sources, PAs themselves can achieve better results in terms of data intelligence.

Sources of revenue are mainly related to:

  • start-up and integration of the platform;
  • supply of Open-DAI as a service, on both sides of the market (PAs and data reusers).

A premium pricing model, currently in its final assessment stage, involves only PAs.

Developers will be charged for using the platform only if their volume of requests exceeds a given threshold (i.e. a freemium model), and the charge will not exceed the marginal costs sustained by Open-DAI to provide the services. This approach is adopted to maximise the (expected) positive externalities of open data reuse.
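The freemium rule for developers can be illustrated with a small calculation. This is only a sketch under stated assumptions: the threshold, the marginal cost per request, and the choice to bill only the requests above the threshold are hypothetical, since the pricing model was still under assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreemiumPlan:
    """Hypothetical plan parameters; the real values were not published."""
    free_requests: int                 # requests per month served free of charge
    marginal_cost_per_request: float   # assumed marginal cost, in euro


def monthly_charge(plan: FreemiumPlan, requests: int) -> float:
    """Charge only the requests above the free threshold, at marginal cost."""
    if requests < 0:
        raise ValueError("requests must be non-negative")
    billable = max(0, requests - plan.free_requests)
    return billable * plan.marginal_cost_per_request


plan = FreemiumPlan(free_requests=10_000, marginal_cost_per_request=0.001)
light_user = monthly_charge(plan, 5_000)    # below the threshold: no charge
heavy_user = monthly_charge(plan, 12_000)   # 2,000 billable requests
```

Because the unit price is capped at the marginal cost, the charge never generates a margin on developers, which is consistent with the stated aim of maximising reuse.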

Description of target users and groups

The project target audience is defined following the main principles of the Open-DAI funding programme, the Competitiveness and Innovation ICT Policy Support Programme, which aims at stimulating smart, sustainable and inclusive growth by accelerating the wider uptake and best use of innovative digital technologies and content by:

  • civic hackers/citizens,
  • governments,
  • businesses.

These three stakeholder groups are considered to be the main target audience throughout Europe, even though obviously starting with the partners' own countries.

In addition, another two main sub-groups have been identified:

  • the public government stakeholders directly linked to the Open-DAI partners involved in the pilot actions;
  • the scientific community involved in research actions - for example in the projects funded under the Seventh Framework Programme, and dealing with cloud technologies or open data policies.

The public government community target includes PAs both internal and external to the consortium, mainly public government adopters and re-users of the Open-DAI platform:

  • regional governments,
  • municipalities,
  • national governments,
  • Lleida municipality (Spain) public community,
  • Barcelona municipality (Spain) public community,
  • Regione Piemonte (Italy) public community,
  • Ordu municipality (Turkey) public community,
  • Karlshamn municipality (Sweden) public community.

The main identified interest of this Community in Open-DAI is based on the adoption, exploitation and re-use of the Open-DAI Platform by these governmental sectors.

As both these aspects will first be tested by the four piloting partner municipalities (two in Spain, one in Turkey and one in Sweden) and by the regional Italian partner government (Piedmont), these five partners will be the main official spokesmen for the Open-DAI results towards the internal and external PA communities.

The business community target may be described as composed of all potential private stakeholders that might be interested in the future development of the services tested within Open-DAI, and in the use of the Open-DAI platform. This community mainly involves software houses, ICT multinational organisations, telecommunications companies, start-ups dealing with open data, and ICT service providers.

The main identified interest of this community in Open-DAI is based on exploiting the platform to support governments in using APIs and/or developing new business.

The citizen developers / civic hackers community embraces software developers, small and innovative start-ups dealing with open data reuse, and app developers (mobile device applications) interested in using Open-DAI components and datasets for purposes related to transparency and civic applications. The main identified interest of this community in Open-DAI is based on the building of new pilots through APIs. Open-DAI partners organise hackathons in their countries to attract representatives of this growing community, and to gather feedback and evaluations on the results of pilot development.

The scientific community consists of research institutes approaching open data topics, as well as open source and open standards communities. Several research endeavours can be carried out starting from the activities of Open-DAI. In particular, the academic partner PoliTO (Politecnico di Torino, Italy) is interested in developing threads in this vein. For example, one analysis can focus on the relationships between the 'Government as a platform' principles and the use of SOA and other architectural approaches, and the related impact on exploitation opportunities. Another research line that we deem interesting to address is a comparative review of current data publishing technologies and applications.

Description of the way to implement the initiative

The project is currently (April 2014) at the start of its third (and final) year. The last three work packages are still in progress, along with WP1 Project Management, led by the partner CSI-Piemonte (Torino, Italy).

Responsibility for each WP has been assigned to a different partner among the eleven that make up the project consortium, in order to create a strong basis for shared awareness of, and involvement in, the project goals.

The Work Package 1 activities ensure the smooth coordination of the overall project activities as well as of the project consortium; this WP aims first of all to enhance information sharing and the harmonisation and synchronisation of planning among the partners.

The project coordinator (CSI Piemonte) is the day-to-day contact point for partners and for the European Commission and EC project officers, providing support, consolidating data and making sure well-organised communication channels are maintained within the project environment.

A set of management structures, operating procedures and a few auxiliary tools (e.g. data repository setup, structured files to manage the partners directory, action-list procedure, periodic online meetings) have been defined to support efficient communication among the partners, according to the quality policies set for the project. Beyond a set of 'contributing' documents, able to track collective work, specific tools for community management have been adopted, including a wiki and a mailing list.

Monitoring project progress, as well as the timely achievement of identified results and outcomes, is among the responsibilities and tasks of this work package. The partner responsible for each WP is accountable for defining correct detailed plans for activities and tasks within the work package, and at the same time for calling for the support and involvement of other partners, who will take the lead of sub-tasks and operational responsibilities.

The overall project plan has been conceived so that each WP's objectives contribute both to achieving a partial result of the whole project plan itself and to supporting the activities to be performed within the related WPs.

The above outlined organisational policy, decided in shared agreement among all consortium partners, has been the main success factor of the project.

Early development of several pilots in the different countries - partners of the consortium - aims to assess the business benefits for both PAs and private organisations by developing new third-party added-value services, focused on the following topics: transport and mobility, localisation and geographic information, environment and pollution. Pilots have been developed within WP5, while their consolidation is part of the Work Package 6 tasks and activities; the whole scheme of pilots' development is based in five places across Europe.

Part of the dissemination strategy (Work Package 7) is focused on improving synergies between the communication strategy and the parallel technical activities carried out in WP8 'Impact evaluation and exploitation' and in WP6 'Services development, implementation and testing', which is partially dedicated to running hackathons for external users.

The communication plan has been conceived as one of the instruments to inform and attract stakeholders interested in exploiting the platform and other relevant Open-DAI results.

The dissemination strategy aims to:

  • represent a valid instrument to better attract the stakeholder groups identified within the exploitation strategy (and in parallel within the communication plan), in order to better support the exploration of the drafted exploitation scenarios (WP8);
  • support the pilot activities (WP6) with a parallel communication strategy (in particular in the case of 'hackathon' initiatives) that ensures the involvement of qualified computer programmers and, therefore, the potential enlargement of the Open-DAI user base (and of the variety of newly created services).

Activities performed within Work Package 8 (Impact Evaluation and Exploitation) have already been brought to interesting conclusions, while possible evolution and exploitation scenarios are still under evaluation by all partners, with the goal of drawing possible evolution lines for the project outcomes, even beyond its lifetime.

Technology solution

The Open-DAI project consortium aims to achieve its goals by testing the efficiency and added value of a service-oriented architecture (SOA) and cloud-based architecture on several PAs, and by:

  • implementing a data virtualisation infrastructure deployed into a high-availability infrastructure;
  • simplifying access to legacy vertical applications data, by providing a virtualised version of the databases in the cloud;
  • providing a new SOA data access layer, that could be combined in an appropriate manner in order to improve the products and services;
  • implementing the PA ‘open data’ data hub, exposing it by using classic web services as well as other standard protocols;
  • hosting the PA services into a scalable cloud infrastructure in order to meet the potentially evolving needs.

Open-DAI results from an integration of ‘out-of-the-box’ open source tools, made available in a cloud environment.

Open-DAI's competitive advantages lie in a stronger integration (and automation of such integration) with legacy databases (TEIID connectors). This is fundamental, in particular, for dynamic (i.e. frequently changing) data. Moreover, Open-DAI provides a broad set of services/formats, and fine-grained API management (through WSO2).

Access to legacy databases is ensured by a data virtualisation layer (JBoss TEIID), which also makes it possible to transform data (e.g. to expose them in different formats, or for anonymisation purposes) and to connect different data sources. Using the D2RQ platform, Open-DAI can also expose data as linked data.
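The idea of a virtualisation layer that transforms data on the fly can be illustrated with a minimal sketch. It does not use TEIID: an in-memory SQLite database stands in for a legacy database, and the table, columns and sample rows are hypothetical. A temporary view hides a personal column (anonymisation) without altering the legacy table, and the same rows are then exposed in two formats.

```python
import csv
import io
import json
import sqlite3

# Columns that the virtual view is allowed to expose (hypothetical schema).
OPEN_COLUMNS = ["id", "street", "severity"]


def build_legacy_db() -> sqlite3.Connection:
    """Create a stand-in 'legacy' database containing a personal data column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accidents ("
        "id INTEGER PRIMARY KEY, reporter_name TEXT, street TEXT, severity INTEGER)"
    )
    conn.executemany(
        "INSERT INTO accidents VALUES (?, ?, ?, ?)",
        [(1, "Anna Rossi", "Via Roma", 2), (2, "Luca Bianchi", "Corso Francia", 1)],
    )
    return conn


def read_open_view(conn: sqlite3.Connection) -> list:
    """Query through a TEMP view, so the legacy schema itself is not modified."""
    conn.execute(
        "CREATE TEMP VIEW IF NOT EXISTS open_accidents AS "
        "SELECT id, street, severity FROM accidents"
    )
    cursor = conn.execute(
        "SELECT id, street, severity FROM open_accidents ORDER BY id"
    )
    return [dict(zip(OPEN_COLUMNS, row)) for row in cursor]


def to_json(rows: list) -> str:
    return json.dumps(rows)


def to_csv(rows: list) -> str:
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=OPEN_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = read_open_view(build_legacy_db())
as_json = to_json(rows)
as_csv = to_csv(rows)
```

The legacy application keeps reading and writing the original table, while reusers only ever see the anonymised view.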

Data publishing is carried out by the open source web server Apache, with WSO2 as Enterprise Service Bus.

Therefore, the existing infrastructure (including servers, storage systems and/or relational DBs) is maintained, as the proposed approach introduces a software layer of data virtualisation between the application logics and the legacy DBs.

Third parties can call RESTful APIs to query or 'get' datasets; these calls are also managed through the WSO2 API Manager component.
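A third-party call to such an API could look like the sketch below. The base URL, the `/datasets/<name>` path, the query parameter and the token are all hypothetical, and the use of a bearer token in the `Authorization` header is an assumption about how a managed API is typically protected, not a description of the actual Open-DAI endpoints. The request is only built here, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_dataset_request(base_url: str, dataset: str, token: str, **params) -> Request:
    """Build (but do not send) a GET request for a dataset on a managed API."""
    # Hypothetical URL layout: <base>/datasets/<dataset name>?<query parameters>
    path = f"{base_url.rstrip('/')}/datasets/{quote(dataset, safe='')}"
    url = f"{path}?{urlencode(params)}" if params else path
    headers = {
        "Authorization": f"Bearer {token}",   # assumed token-based access control
        "Accept": "application/json",
    }
    return Request(url, headers=headers, method="GET")


request = build_dataset_request(
    "https://api.example.org/opendai",   # placeholder host, not a real endpoint
    "air-quality",
    "EXAMPLE_TOKEN",
    limit=10,
)
```

Passing the resulting object to `urllib.request.urlopen` would send it; an API manager sitting in front of the data services can then apply per-token quotas such as the request threshold of a freemium plan.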

An adequate (pre-existing) open source license is applied to the project outcomes and its components.

The Open-DAI source code is openly available to third parties through one of the most used collaborative repositories / platforms (e.g. GitHub, SourceForge). All code is hosted on GitHub.

Documentation (both auto- and manually generated) is produced and released. The documentation includes ‘Getting Started’ information explaining the Open-DAI background, purposes and architecture.

Technology choice: Open source software

Main results, benefits and impacts

Open-DAI is an open source platform allowing data virtualisation, transformation, and publication in a cloud environment. Datasets are exposed as services, as a result of an extraction from (actually, a real-time and possibly cached connection with) legacy databases.

Therefore, Open-DAI serves two complementary needs of large organisations:

  • Data exposure beyond its original perimeter: data are released as services, so that third parties can easily reuse them for any purpose. The latter may actually be both an objective within the broader digital strategy of a public body, and a constraint deriving from recently issued legislation.
  • Internal interoperability: using Open-DAI, public bodies can help in ‘breaking’ their information silos, and use the platform to orchestrate and integrate legacy DBs (meeting the requirements of the so-called enterprise information integration, without any direct modification of the legacy logical and physical infrastructure).
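The second need, integrating separate legacy databases without modifying them, can be sketched as follows. This is an illustration only, not the TEIID implementation: two in-memory SQLite databases attached to one connection stand in for two silos, and the schemas and sample values are hypothetical. A temporary view joins the two sources, so neither database's own schema gains any new object.

```python
import sqlite3


def build_federation() -> sqlite3.Connection:
    """Attach two stand-in 'silo' databases and define a cross-database view."""
    conn = sqlite3.connect(":memory:")  # plays the role of the integration layer
    conn.execute("ATTACH DATABASE ':memory:' AS transport")
    conn.execute("ATTACH DATABASE ':memory:' AS environment")

    # Hypothetical legacy schemas, one per silo.
    conn.execute(
        "CREATE TABLE transport.stops "
        "(stop_id INTEGER PRIMARY KEY, name TEXT, district TEXT)"
    )
    conn.execute("CREATE TABLE environment.air_quality (district TEXT, pm10 REAL)")
    conn.executemany(
        "INSERT INTO transport.stops VALUES (?, ?, ?)",
        [(1, "Porta Nuova", "Centro"), (2, "Lingotto", "Sud")],
    )
    conn.executemany(
        "INSERT INTO environment.air_quality VALUES (?, ?)",
        [("Centro", 38.5), ("Sud", 27.0)],
    )

    # A TEMP view may reference tables in any attached database.
    conn.execute(
        "CREATE TEMP VIEW stop_air_quality AS "
        "SELECT s.name AS stop_name, s.district AS district, a.pm10 AS pm10 "
        "FROM transport.stops AS s "
        "JOIN environment.air_quality AS a ON a.district = s.district"
    )
    return conn


def stops_with_air_quality(conn: sqlite3.Connection) -> list:
    """Return one integrated row per stop, ordered by stop name."""
    query = "SELECT stop_name, district, pm10 FROM stop_air_quality ORDER BY stop_name"
    return conn.execute(query).fetchall()


integrated = stops_with_air_quality(build_federation())
```

The integrated view can serve both needs at once: it can be queried internally for data intelligence, and it can be published as a data service for third parties.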

Benefits, Opportunities & Results

The Open-DAI partners do not expect the Open-DAI platform to be a ‘cash cow’. In fact, this was never in the Open-DAI plans and one could arguably be satisfied by the fact that some realistic cost and demand scenarios make Open-DAI economically sustainable even at the level of a single European country (e.g., Italy) and with a single software maintainer (i.e., CSI Piemonte). Any additional commitment, e.g., by SAMPAS to contribute to the future Open-DAI open source community (and possibly by some other partners) would turn the offer of Open-DAI as a platform into a profitable endeavour.

In any case, the incentive to offer Open-DAI to PAs, even if it would barely break even in financial terms, would be strong. In particular, the consortium judges the fact that an administration connects its legacy database to Open-DAI as an enabler of a whole set of additional economic activities.

Arguably, Open-DAI belongs to the market for software platforms enabling the publication of data as services. However, we could also consider a broader perimeter - i.e., software complementing DBs of large organisations - and a narrower segment - i.e., web portals for (open government) data publication.

On the one hand, beyond data publication, Open-DAI allows orchestration between legacy DBs in a cloud environment, therefore suggesting that Open-DAI can participate, in principle, in the market for software ensuring advanced management of large information systems, if any. However, in fact, we consider this area too broad and heterogeneous to represent a market by itself.

On the other hand, web portals for (open government) data publication still serve the purpose of data disclosure for third-party reuse, but with significantly less advanced features (e.g., no connection with legacy DBs; data published as static files) than Open-DAI and (what we believe to be) its direct competitors. In fact, Open-DAI could be seen as an advanced complement of 'traditional' open data portals (and a crucial one to make them part of the overall ICT infrastructure of a PA, e.g., creating a pipeline for the seamless publication of selected public sector information as open data).

In general, one could argue that almost all open data-oriented initiatives entail (indirectly) a positive effect on internal efficiency, at least by improving accessibility of information.

At the same time, advanced management of information systems increasingly encompasses the adoption of service-oriented architectures, which also imply sharing data as services, internally and with third parties.

We therefore claim that the market of reference for Open-DAI is, broadly speaking, the market for data publication, and, specifically, the high-end niche represented by the market for publication of data as services. In particular, government data, released as Open Data (services), are one of the important applications.

To some extent, the Open-DAI outcomes depict a scenario which overcomes the 'private' vs 'collective' dichotomy often mentioned when discussing incentives to undertake innovative projects. In fact, as in many other open source projects, this scenario encompasses both, since private incentives (by definition) for the maintainer and a collective action model (in principle) can be acknowledged at the same time. Open-DAI is foreseen to become an open source project, publicly available and maintained by one of the partners.

Under this scenario, benefits can be experienced at different levels, and in particular (i) from the viewpoint of the Open-DAI project; (ii) for third parties reusing it; (iii) for the platform maintainer. As a first benefit, Open-DAI continues to exist and to be available beyond its end as an EU-funded project, distinguishing itself from the frequent case of European projects whose legacy (for the public as a whole) is hardly perceivable once the funding period has expired.

As a second set of potential benefits, third parties will have the chance to offer and distribute Open-DAI, as a stand-alone product or embedded in a broader offering, even if none of the Open-DAI partners decides to undertake this activity. Customisations and further developments will therefore be possible independently of the Open-DAI consortium and the willingness of its single participants. Moreover, since Open-DAI is a documented open source platform already used by PAs during the project, the adoption costs for interested PAs (or other kinds of organisations) are expected to be reasonably low compared with market offerings.

Finally, the platform maintainer can achieve a potentially high overall return, especially in terms of economies of scale and scope within its organisation.

Track record of sharing

A first direct experience of sharing Open-DAI results is represented by the development and deployment, within the project activities, of pilots which apply the technical outcomes. Five PAs, all Open-DAI consortium partners, are already offering innovative services to their citizens, and these services have been used by their political representatives as a valuable communication tool.

1. Karlshamn Pilot

Karlshamn municipality in Sweden is opening up its point-of-interest data through the Open-DAI cloud infrastructure. As part of the pilot programme it is focussing on the placement and metadata of equipment in the municipality. Examples include lamp posts, waste bins and parking meters.

2. Barcelona Pilots

In Barcelona there are two different pilots inside the Open-DAI project, focused on opening and using environmental and mobility data. The goal is to increase the transparency of the city council by ensuring the availability of public data so that all of the stakeholders can use this information to create services and develop new applications. One of these applications is aimed at citizens and the other at the Barcelona city council agents.

3. Lleida Pilots

The Lleida pilots are focused on reporting incidents that could affect mobility, especially mobility for disabled people, and on publishing information that could be useful to both citizens and tourists.

4. Ordu Pilots

To give citizens and third-party developers access to daily information from Ordu Municipality, two pilot cases were deployed within the Open-DAI project. The pilot cases of Ordu aim to enhance services offered by the municipality and ease citizens' lives by opening up city dynamics and POI information.

5. Regione Piemonte and CSI Piemonte pilots

There are four pilots in Piedmont, featuring services related to air quality, transport, accidents and emergency calls.

A positive development affected the Piedmont pilot for the accident collection smartphone application (Accident): the project released the source code under a permissive BSD open source license, giving access to a private company that manages the Emergency ICT infrastructure; this provider was to integrate the app with the emergency central system through the API offered by the system itself.

In the end, the private company decided to evolve the smartphone application, integrating it into its own solution and proposing it as an added-value service to the Emergency System.

The Emergency System is independent from Regione Piemonte, and this is what makes this decision a most important achievement for the exploitation of the project results. In fact, it is a good opportunity to ensure the survival of the pilot beyond the project end, while identifying a maintainer of this part of Open-DAI.

To increase the accessibility and visibility of Open-DAI outcomes over the network, an initial decision has been agreed to connect and integrate with the federated portal of the HOMER EU Project (Harmonising Open data in the MEditerranean through better access and Reuse of public sector information), using the catalogue implemented by HOMER to describe and index the data and information available through the Open-DAI API set. As a further interesting by-product of this implementation, multi-language support will be provided automatically.

The adoption and use of the federated search engine of the HOMER project supports the goal of federating open data strategies of PAs in the Mediterranean area. Although only partially exploiting the actual key features of Open-DAI (e.g., real-time exposure of quickly changing data), this action is arguably an interesting example of integration with a front-end (actually, several front-ends, since several open data portals are involved), and of cooperation between EU-funded projects.

Lessons learnt

1. Well-harmonised organisation within the consortium comes first of all: it must be clearly understood and jointly shared among all partners that the overall project results are achievable only with the closest cooperation within the consortium. This means helpful support, understanding of others' difficulties, and an enhancement of information and experience sharing rather than a competitive environment.

2. When setting up the overall project plans, allow time and resources for a rolling process of knowledge growth to keep improving the project objectives, during the whole project life and in a step-by-step fashion. In our case, a significant source of information was represented by some outcomes of the project hackathons organised during Y2 by partners POLITO, Lleida Municipality and Netport. Those initiatives showed significant potential for Open-DAI, but also that thoroughly exposing and documenting its APIs, as well as further experimenting with reuse of real-time data exposure, is particularly important for our exploitation ambitions.

By applying this process model it is possible to benefit from feedback coming from project activities and their partial results while the project is still running.

3. Be as flexible as possible when planning and setting up the infrastructures, frameworks and operating environments for the deployment of project outcomes. The project lifetime may be considered as a sort of evolving lab, to continuously investigate and better understand the most adequate requirements (in terms of processing capacity, but also network design and performance, operations maintenance and monitoring tools) that will apply when the project outcomes are deployed in the final operations environment.

Scalability and flexibility in resource design and setup are key concepts, as is leaving room to adapt and even change the choice of resources and tools supporting the development and deployment stages.

Scope: International, Regional (sub-national)