APARSEN: Alliance Permanent Access to the Records of Science in Europe Network (APARSEN)

Published on: 25/04/2013
APARSEN is a Network of Excellence that brings together a diverse set of practitioner organisations and researchers in order to bring coherence, cohesion and continuity to research into barriers to the long-term accessibility and usability of digital information and data, exploiting our diversity by building a long-lived Virtual Centre of Excellence in Digital Preservation.

The objective of this project may be simply stated: to look across the excellent work in digital preservation that is carried out in Europe and to try to bring it together under a common vision. The success of the project will be seen in the subsequent coherence and general direction of travel of research in digital preservation, with an agreed way of evaluating it, and in the existence of an internationally recognised Virtual Centre of Excellence.

The research leading to these results has received funding from the European Community’s Seventh Framework Programme FP7/2007-2013 – ICT-2009.4.1: Digital Libraries and Digital Preservation – under grant agreement No 269977. The UK Science and Technology Facilities Council (STFC) coordinates the project.

Policy Context

APARSEN’s fundamental driving force is the recognition of the widespread importance of digital data and the need for its preservation for long-term access and reuse. Such data is highly diverse and originates in many domains: science, cultural heritage, government, performing arts. Neelie Kroes, Vice-President of the European Commission responsible for the Digital Agenda, has remarked that “data is the new gold”. The “Riding the Wave” report of the High Level Expert Group on Scientific Data, submitted to the European Commission in October 2010, examined how Europe can gain from the rising tide of scientific data. The report sets out scenarios and a list of actions to take advantage of the opportunities and to avoid the threats that put at risk our ability to capitalise on data resources in the long term.

The threats to access and reuse are widely recognised and include:

  1. Users may be unable to understand or use the data;
  2. Non-maintainability of essential hardware, software or support environment may make the information inaccessible;
  3. The chain of evidence may be lost and there may be lack of certainty of provenance or authenticity;
  4. Access and use restrictions may not be respected in the future;
  5. Loss of ability to identify the location of data;
  6. The current custodian of the data, whether an organisation or project, may cease to exist at some point in the future;
  7. The ones we trust to look after the digital holdings may let us down.

(List taken from the PARSE.Insight study)

A key policy document is the EC Recommendation on Access to and Preservation of Scientific Information (July 2012). It states: “Preservation of scientific research results is in the public interest. It has traditionally been under the responsibility of libraries, especially national legal deposit libraries. The volume of research results generated is growing tremendously. Mechanisms, infrastructures and software solutions should be in place to enable long-term preservation of research results in digital form. Sustainable funding for preservation is crucial as curation costs for digitised content are still relatively high. Given the importance of preservation for the future use of research results, the establishment or reinforcement of policies in this area should be recommended to Member States.” Furthermore, “Solid eInfrastructures underpinning the scientific information system will improve access to scientific information and the long-term preservation of it. This can boost collaborative research.”

As the above Recommendation makes clear, there is a strong link to the drive for openness in data and publications in scientific research. Benefitting from long-term preservation requires that the digital resources are to some extent open to access by present and future users. This drive for openness is partly motivated by a societal imperative that the results of publicly-funded research should be publicly available, and is enabled by new technologies.

At national level, there are similar policies. In the UK, for example, RCUK (the umbrella group of the research funding organisations) has a set of common principles on data policy, which includes the declaration that “Data with acknowledged long-term value should be preserved and remain accessible and usable for future research.” The British government, like many others, is pushing data into the public domain through its data.gov.uk website, part of the agenda of transparency.

These policies cannot become reality without an eInfrastructure that can support them. Assured preservation requires secure storage solutions, persistent identifiers for digital resources, an understanding of the costs of preservation, and the ability to trust repositories over the long term. The aim of APARSEN is therefore to contribute to building this eInfrastructure by defragmenting the current efforts in digital preservation and unifying them around a common vision, embodied in a sustainable Virtual Centre of Excellence in digital preservation.

Description of target users and groups

It is important to distinguish the ultimate beneficiaries of digital preservation - these will be end-users of repositories and archives - from the direct users of the technologies that APARSEN is developing and promoting. These latter users are those responsible for data holdings, ensuring their long-term integrity, accessibility and reusability. They might work within traditional memory institutions like libraries and archives, or operate repositories of scientific data, or have responsibilities for data preservation as part of their role in other organisations. These will be the people who will select and implement appropriate technologies, guided perhaps by the results or training courses offered by APARSEN. Their backgrounds will be varied, according to the domain of the data for which they are responsible; they might have librarian training, or come from a publishing background, or have started as scientists in the field in question - the profession of “data scientist” is still in its infancy.

The ultimate beneficiaries will be those who find they can access, understand and reuse digital data without barriers. It is impossible to speak of a single target group, for it depends on the nature of the data and the interests of potential users. In highly specialised technical fields, data is likely to be of interest only to members of that community; for data of more general interest, including for example data on diseases and outcomes and on social statistics linked to geographical areas, the potential users could encompass almost the whole population. In any case it is only the possibility of assured long-term access that allows these users’ needs to be fulfilled.

Description of the way to implement the initiative

APARSEN aims to look across the excellent work in digital preservation that is carried out in Europe and try to bring it together under a common vision. This de-fragmentation of approaches will avoid waste of effort and allow more rapid development of effective solutions. In addition, APARSEN will help the Alliance for Permanent Access (APA) evolve into an internationally recognised virtual centre of excellence bringing together an even wider set of organisations.

APARSEN identified four core topics (Trust, Sustainability, Usability and Access), which are tackled one after another throughout the lifetime of the project in four streams: Integration, Technical Research, Non-Technical Research and Sustainable Uptake.

More specifically, the Joint Programme of activities covers:

  • technical methods for preservation, access and most importantly re-use of data holdings over the whole lifecycle;
  • legal and economic issues including costs and governance issues as well as digital rights;
  • outreach within and outside the consortium to help to create a discipline of data curators with appropriate qualifications.

Technology solution

Digital preservation has been called “interoperability with the future”. This principle underlies all the work done within APARSEN. There is no single technological solution; rather, the choices will be conditioned by the findings of APARSEN in its endeavour to defragment the landscape of digital preservation. For example, the consortium includes some solution providers with their own technology solutions, and these will take account of the needs and constraints that emerge. APARSEN’s work on such areas as storage solutions and preservation services will inform the development of technology in future.

Standards have a key role to play in the network’s activities, as adherence to standards, or the development and promotion of standards, is essential to interoperability. APARSEN acknowledges the reality of multiple standards in some areas, for example the multiple existing systems for persistent identifiers. There is a dedicated work package on coordination of common standards, whose aim is to identify common standards, whether existing ones or new ones that are required, which will enhance the accessibility of information via the interoperability of the systems managed by the partners and the community at large.

Technology choice: Proprietary technology, Standards-based technology, Mainly (or only) open standards

Main results, benefits and impacts

As explained above under the section on users, the impacts of the technologies of concern to APARSEN will be felt by large and diverse communities in many areas of life. The general expectations for the impact can be related to different classes of stakeholder as follows.

Impact within the life of the project:

  • Researchers in digital preservation will be brought together to present, argue, defend, and if necessary modify their ideas. The results will be delivered through project reports on the developing integrated vision, and more importantly on the direction of their work – although diversity should not be smothered. The wide network of experience and competence will likely engage with international standardisation bodies and working groups, aiming to contribute an original point of view and to develop the activity worldwide.
  • Research data users associated with the consortium members will see an impact by being able to find, access and use a wider variety of digital objects than previously possible. Ambassadors will be identified and trained in order to begin to embed digital preservation into everyday workflows within their organisations.
  • Information curators will be able to feed their requirements into the research roadmap for digital preservation.
  • Decision makers will develop a better narrative of the digital preservation process so that it is easier for them to “see the wood for the trees”, in other words to understand the overall context without having to understand the technical details of digital preservation.
  • Suppliers will see opportunities for market creation through the identification of standards, common services and tools.

Impact after the project, through the Virtual Centre of Excellence:

  • Researchers in digital preservation will have a common understanding of the digital preservation landscape, a common set of testing procedures and test data, and a common repository of tools. There will still be competition in research ideas but there will be a common general direction of travel.
  • Research data users outside the consortium will see an impact by being able to find, access and use a wider variety of digital objects than previously possible.
  • Research data producers will increasingly find preservation being built into the workflows used to create data.
  • Information curators will find it increasingly affordable to preserve digitally encoded information. This will be reflected in their cost models and preservation plans.
  • Decision makers will have a consistent overview of digital preservation and a view on cost/benefits; this will enable them to take well founded decisions.
  • Suppliers will have created products for preservation and use of digital objects through the identification of standards, common services and tools.

Long-term impact:

  • Researchers in digital preservation will have a standing similar to that of researchers in other mainstream academic areas such as Physics or the Humanities, with large numbers of graduates in digital preservation supplying a large demand in commerce, culture and society.
  • Research data users will be able to access, understand and use the massive range of digital information available to them from across the globe and across time.
  • Information curators will have a choice of well-tested, robust, cost-effective ways of preserving their digital holdings, and they will be confident in their ability to preserve their holdings.
  • Decision makers will have a better understanding of the costs and benefits of their data preservation responsibilities. They will be able to take comfort from the external certification of the repositories for which they are responsible.
  • Suppliers will be able to supply society with an interoperable set of tools to survive and swim in the tidal wave of data which threatens to engulf and drown it.

Return on investment

It is not appropriate to think of the results of APARSEN in terms of direct return on investment. The achievement of the common vision of digital preservation will have benefits in focussing and defragmenting research efforts and practice. It is the intention to create a Virtual Centre of Excellence that will live beyond the end of the project’s lifetime. The benefits, services, training, consultancy etc. offered by this organisation will also be a return on the initial investment, for a wider community than the APARSEN participants.

Return on investment: Not applicable / Not available

Track record of sharing

The whole purpose of APARSEN is to enable a wide range of actors in digital preservation to benefit and learn from the activities and findings of the network. The very fact of participation in APARSEN is a mechanism for exchange and transfer of expertise between the range of stakeholders represented in APARSEN. These stakeholders are classified as:

  • universities and research institutes;
  • vendors;
  • national libraries;
  • big science;
  • research data archives;
  • membership organisations;
  • industrial design/engineering.

For outreach, APARSEN has a whole stream of activity dedicated to “Spreading Excellence”. This includes work packages on:

  • external workshops, symposia and events;
  • formal qualifications;
  • training courses;
  • external communications and awareness raising;
  • liaison with other stakeholders.

In concrete terms, the following outputs are being produced:

  • brochures summarising the results of integration of various phases of the project (a brochure on trust has already been produced, and one on sustainability is under preparation);
  • newsletters reporting activity within the network;
  • a series of webinars with external participants;
  • many academic papers resulting from the work done.

Lessons learnt

As the APARSEN project is only halfway through its lifetime, it is too early to identify definitive lessons learnt. However, on the understanding that further lessons will emerge over time, the following key realisations motivated the work of APARSEN on the topic of “trust”. Trust is fundamental to the working of society, in particular when it comes to unfamiliar digitally encoded information, and especially when that information has passed through several hands over a long period of time. Anyone relying on such information will want to ask:

  • Has it been preserved properly?
  • Is it of high quality?
  • Has it been changed in some way?
  • Does the pointer take me to the right object?

APARSEN has collected, evaluated and developed the key answers to these questions, and is continuing to do so for the topics of sustainability, usability and access.

Scope: International, National, Pan-European