Owner
Biodiversity Information Standards (TDWG)
Non-Profit Organisation
Contact information

Scope

Primary biodiversity data is a general term for information documenting the planet’s biodiversity, where each record represents the existence of a particular organism at a given location at a point in time. These data are scattered throughout numerous collections and databases worldwide, making it difficult to find all information available on, for instance, a certain species or a particular region. Several international networks and initiatives share a vision of free and open access to these resources and are working together to connect these heterogeneous data sources.


The Access to Biological Collections Data (ABCD) Schema is a comprehensive, highly structured standard for the access to and exchange of primary biodiversity data, designed to support data from a wide variety of databases. Parallel structures exist so that atomized data, free text, or both can be accommodated. It is compatible with several existing data standards. Versions 1.2 and 2.06 are currently in use in different biodiversity networks.

Needs

The variety of biodiversity data is huge. Apart from being stored in different database management systems, using hand-tailored or off-the-shelf collection management systems, the data are very heterogeneous by nature, differing in

  • the basis of record: Preserved or living specimens versus machine or human observations,
  • the taxonomic scope of the dataset: Data items for the botanical realm differ inherently from those for zoological or bacterial/viral data,
  • the complexity of data items: Specimens in natural history collections usually have much richer data than observational records.

Features

One of the key features that distinguish ABCD from flat biodiversity data schemas is its hierarchical structure. Instead of simply storing a sequence of data items, ABCD clusters related items into semantic groups that relate to real-world objects or concepts. These elements are arranged in a tree that reflects the structure of the relationships of their corresponding real-world objects. Inherent to this hierarchy is the idea of repeatable elements: according to the cardinality of their real-world equivalents, elements can be singular or repeatable. This allows ABCD to store multiple values for elements that can have several real-world instances (e.g. taxonomic identifications of specimens).
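The idea of singular versus repeatable elements can be illustrated with a minimal Python sketch. The class and field names (`Unit`, `Identification`, `preferred`) and the sample values are hypothetical and chosen for illustration only; they are not taken from the ABCD schema itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Identification:
    """One taxonomic identification; a unit may carry several (hypothetical model)."""
    scientific_name: str
    identifier: Optional[str] = None
    preferred: bool = False  # marks the currently accepted identification


@dataclass
class Unit:
    """A single collection item (specimen or observation); hypothetical model."""
    unit_id: str  # singular: exactly one per unit
    identifications: List[Identification] = field(default_factory=list)  # repeatable

    def preferred_identification(self) -> Optional[Identification]:
        """Return the flagged identification, else the first one, else None."""
        for ident in self.identifications:
            if ident.preferred:
                return ident
        return self.identifications[0] if self.identifications else None


# Made-up example record with two identifications of the same specimen.
unit = Unit(
    unit_id="HYPOTHETICAL-0001",
    identifications=[
        Identification("Bellis perennis L.", identifier="A. Example"),
        Identification("Bellis annua L.", identifier="B. Example", preferred=True),
    ],
)
print(unit.preferred_identification().scientific_name)
```

The nesting mirrors the tree structure described above: the unit is the parent, and each identification is a repeatable child that can be flagged as the preferred one.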


Another feature of ABCD addresses the diversity of structures found in data sources. Variable atomization in ABCD allows information to be stored according to the breakdown of the original source data. For example, in a very simple collection database, the locality of a specimen’s gathering site might be stored in a simple free-text field. A more sophisticated database will store the locality separated into several fields – namely a country name, the country’s ISO code, continent, names of several administrative units, the site’s geographic coordinates, altitude, a description of the locality and optionally a comment. Both cases are covered in ABCD by providing both a single free-text element for locality as well as separate, partially repeatable elements for all the locality data items just described.
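Variable atomization could be modelled roughly as follows. This is a sketch under assumptions: the `Locality` class, its field names, and the `summary` fallback rule are invented for illustration and do not reproduce actual ABCD element names or any prescribed behaviour.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Locality:
    """Gathering site that accepts free text, atomized fields, or both (hypothetical)."""
    locality_text: Optional[str] = None          # single free-text element
    country_name: Optional[str] = None           # atomized fields follow
    country_iso_code: Optional[str] = None
    admin_units: List[str] = field(default_factory=list)  # repeatable
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    altitude_m: Optional[float] = None

    def summary(self) -> str:
        """Build a label from atomized parts if present, else fall back to free text."""
        parts = list(self.admin_units)
        if self.country_name:
            parts.append(self.country_name)
        if parts:
            return ", ".join(parts)
        return self.locality_text or ""


# A simple source database that only has one free-text locality field.
simple = Locality(locality_text="Meadow near the old mill, north of the village")

# A more sophisticated source that stores the locality in separate fields.
detailed = Locality(
    country_name="Germany",
    country_iso_code="DE",
    admin_units=["Dahlem", "Berlin"],
    latitude=52.45,
    longitude=13.30,
    altitude_m=50.0,
)

print(simple.summary())
print(detailed.summary())
```

Both records fit the same structure, so a provider can publish data at whatever level of breakdown its source database offers.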


The ABCD schema consists of parts that are applicable to all types of source databases and of specialized parts for specific types. Among others, the general part includes metadata about the dataset as a whole; information on the gathering site, gathering events and gathering agent(s); the results of identification event(s) and the person(s) who made the identification; and references to digital representations of the specimen or observation. Specialized parts exist for specimen collections and observational databases, for culture and paleontological collections, botanical and zoological gardens, plant genetic resources, herbaria and mycological collections.


The specialized parts in ABCD and the elements allowing for variable atomization result in a rather complex schema. The current ABCD version consists of 677 elements holding the data of the individual collection item (excluding the structural container elements in the hierarchy), plus 72 elements with metadata about the entire dataset (contacts, IPR statements, etc.). In addition, there are 36 attributes to indicate the ‘preferred’ item where multiple elements can be present (e.g. the currently accepted taxonomic identification). Where appropriate, language attributes (180) can be used to specify the language used in text elements. However, this complexity does not come at the expense of document size: unused elements are removed from the ABCD document before it is sent out by the web service.
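The removal of unused elements might look like the following sketch, which uses Python's standard `xml.etree.ElementTree`. The `prune_empty` helper and the simplified, namespace-free sample document are assumptions made for illustration; the actual provider software may work differently.

```python
import xml.etree.ElementTree as ET


def prune_empty(element: ET.Element) -> None:
    """Recursively drop children that have no text, no attributes, and no children."""
    for child in list(element):  # copy the list, since we remove while iterating
        prune_empty(child)       # prune bottom-up so emptied containers vanish too
        has_text = bool(child.text and child.text.strip())
        if not has_text and not child.attrib and len(child) == 0:
            element.remove(child)


# Simplified, illustrative document; element names are not verified against ABCD.
doc = ET.fromstring(
    "<Unit>"
    "<UnitID>HYPOTHETICAL-0001</UnitID>"
    "<Gathering><LocalityText>Meadow</LocalityText><Altitude/></Gathering>"
    "<Notes></Notes>"
    "<Measurements><Measurement/></Measurements>"
    "</Unit>"
)
prune_empty(doc)
pruned_xml = ET.tostring(doc, encoding="unicode")
print(pruned_xml)
```

Pruning bottom-up means a container such as `Measurements` is also dropped once all of its children have been removed, so only populated branches of the tree are transmitted.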


For data that do not fit any of the standard elements, ABCD features an extension slot that can hold a custom-defined extension schema. If a network shares information not covered by ABCD, a custom extension schema can be defined for these additional items. Data providers publishing such items will then use ABCD documents augmented with this extension. These documents can still be digested by any network unaware of the extension, so special interest networks also increase data provision to the global, more general infrastructure. Currently, three ABCD extensions are in use: the DNA extension is used in the GGBN network; the extension for geosciences (EFG) is used in the Geosciences Collection Access Service; and the Australian Virtual Herbarium defined the HISPID 5 schema (Herbarium Information Standards and Protocols for Interchange of Data) in order to align ABCD with the standard used for Australian herbaria.
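How a consumer that is unaware of an extension can still digest an augmented document can be sketched as below. Everything here is hypothetical: the simplified element names, the `urn:example:hypothetical-dna-extension` namespace, and the helper functions are illustrative assumptions and do not reproduce the real ABCD or DNA extension schemas.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up document with a custom block inside an extension slot.
ABCD_DOC = """
<Unit>
  <UnitID>HYPOTHETICAL-0002</UnitID>
  <ScientificName>Bellis perennis L.</ScientificName>
  <UnitExtension>
    <DnaSample xmlns="urn:example:hypothetical-dna-extension">
      <Concentration unit="ng/ul">42.5</Concentration>
    </DnaSample>
  </UnitExtension>
</Unit>
""".strip()

CORE_FIELDS = ("UnitID", "ScientificName")
DNA_NS = {"dna": "urn:example:hypothetical-dna-extension"}


def read_core(unit: ET.Element) -> dict:
    """Generic consumer: reads only the core fields and ignores any extension."""
    return {name: unit.findtext(name) for name in CORE_FIELDS}


def read_dna_concentration(unit: ET.Element):
    """Extension-aware consumer: additionally reads the hypothetical DNA block."""
    text = unit.findtext(
        "UnitExtension/dna:DnaSample/dna:Concentration", namespaces=DNA_NS
    )
    return float(text) if text is not None else None


unit = ET.fromstring(ABCD_DOC)
print(read_core(unit))
print(read_dna_concentration(unit))
```

Because the extension lives in its own namespace inside a dedicated slot, the generic reader never has to know about it, while a special interest network can query it explicitly.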

Applications

ABCD and its extensions are currently used in nine biodiversity information networks.

Further Documentation

The most current version of the standard can be downloaded from: http://www.tdwg.org/standards/115/.


More details about ABCD can be found in this publication: http://dx.doi.org/10.1080/11263504.2012.740085.

Detailed information

Last update
Status
Completed
Release date
