The Interoperability Test Bed has made available a reusable, generic service to validate Table Schema definitions.
Tabular text data, such as CSV (Comma-Separated Values) files, is widely used in processes such as bulk data ingestion, data migration and reporting. Although the core requirements on a CSV file’s syntax are well-defined in RFC 4180, that specification does not address the content of individual fields. This gap is filled by the Table Schema specification, which defines expected fields in terms of their data types, formats and constraints, providing for CSV content what XML Schema offers for XML. Table Schema is itself part of the Frictionless Specifications, a family of related and complementary specifications that aim to formalise exchanges of tabular data and to facilitate the development of validation libraries, tooling support and data automation processes.
A Table Schema instance is expressed in JSON identifying data fields and their expected content.
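As an illustration, a small Table Schema instance might look like the one built below. This is only a sketch: the field names (`id`, `name`, `birth_date`) are hypothetical, while the properties used (`fields`, `type`, `format`, `constraints`, `primaryKey`, `missingValues`) are taken from the Table Schema specification.

```python
import json

# Illustrative Table Schema instance; the field names are hypothetical.
schema = {
    "fields": [
        {
            "name": "id",
            "type": "integer",
            "constraints": {"required": True, "unique": True},
        },
        {
            "name": "name",
            "type": "string",
            "constraints": {"required": True, "maxLength": 100},
        },
        {
            "name": "birth_date",
            "type": "date",
            "format": "default",  # ISO 8601 dates (YYYY-MM-DD)
        },
    ],
    "primaryKey": "id",
    "missingValues": [""],
}

# Serialise to the JSON text that would be stored or submitted for validation.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

The resulting JSON text is what one would provide as input to a Table Schema validator.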
Before a Table Schema instance can be used to validate CSV data one needs to ensure that the schema itself is well-defined. This is where the Interoperability Test Bed comes in, by providing a public service to validate Table Schema instances against Table Schema’s own requirements (defined via JSON Schema). This service expects a Table Schema instance to be provided as input and returns a detailed validation report. It is available:
- Through a web user interface for use by developers of Table Schema instances.
- Through a SOAP API for machine-to-machine integration and use in conformance test cases in the Test Bed.
The Table Schema validator is itself based on the Test Bed’s generic JSON validation service, which validates the schema’s JSON content against the Table Schema requirements. Table Schema, as a means of formalising CSV data, is also at the core of the Test Bed’s CSV validation service, which offers a configuration-driven approach to creating CSV validator instances. One such instance is the Test Bed’s generic CSV validator, which validates arbitrary CSV data based on a user-provided Table Schema instance and optional syntax settings.
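To give a rough idea of what validating CSV data against a Table Schema instance involves, the sketch below checks a CSV header and rows against a schema’s fields. It is not the Test Bed’s implementation: the function `validate_csv` is hypothetical, and it handles only the `integer` type and the `required` constraint, using Python’s standard library.

```python
import csv
import io


def validate_csv(csv_text, schema):
    """Return a list of error messages for csv_text checked against schema."""
    errors = []
    fields = schema["fields"]
    reader = csv.reader(io.StringIO(csv_text))

    # The header must list exactly the field names declared in the schema.
    header = next(reader, None)
    expected = [field["name"] for field in fields]
    if header != expected:
        errors.append(f"header {header} does not match expected {expected}")
        return errors

    # Data rows are numbered from 2, counting the header as row 1.
    for row_no, row in enumerate(reader, start=2):
        if len(row) != len(fields):
            errors.append(
                f"row {row_no}: expected {len(fields)} values, got {len(row)}"
            )
            continue
        for field, value in zip(fields, row):
            required = field.get("constraints", {}).get("required", False)
            if value == "":
                # Empty values are only an error for required fields.
                if required:
                    errors.append(f"row {row_no}: '{field['name']}' is required")
                continue
            if field.get("type") == "integer":
                try:
                    int(value)
                except ValueError:
                    errors.append(
                        f"row {row_no}: '{field['name']}' is not an integer"
                    )
    return errors


example_schema = {
    "fields": [
        {"name": "id", "type": "integer", "constraints": {"required": True}},
        {"name": "name", "type": "string", "constraints": {"required": True}},
    ]
}
print(validate_csv("id,name\n1,Ada\nx,\n", example_schema))
```

A complete validator would also cover the remaining Table Schema types, formats and constraints, as well as syntax settings such as the delimiter and quote character.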
More information on Table Schema and the other Frictionless Specifications can be found on the Frictionless documentation portal. Additional information and resources on the Test Bed itself can be found on its Joinup space, with its value proposition being a good starting point for newcomers. Finally, to stay up to date with all the latest Test Bed news: