This tutorial explains how to create an electronic document (e-Document) format as an XSD Schema using the Genericode to UBL NDR tool from Crane Softwrights. We have used the tool to create a sample document called ‘Business Activity Registration Request’ using the ISA Core Vocabularies and the UBL Naming and Design Rules.
The Genericode to UBL NDR tool is an open-source package provided by Crane Softwrights under the Modified BSD Licence. The package creates UBL 2.1 XSD Schemas and OASIS CVA (context/value association) files according to the UBL Naming and Design Rules (NDR). In 2012, the script was used to produce the original XSD Schemas of the ISA Core Vocabularies. The input for the package is a UBL NDR 2.1 spreadsheet expressed using the OASIS Genericode standard. To create this Genericode file, we have used an OpenOffice UBL NDR 2.1 spreadsheet template and the open-source OpenOffice export filter from Crane Softwrights, which serializes the contents of the spreadsheet as a set of Genericode rows.
In this tutorial, we use the Genericode to UBL NDR package to create a new XSD Schema for the Business Activity Registration e-Document using the ISA Core Vocabularies and UBL as the main libraries of reusable elements.
In order to create the schema, we have used the files listed in the table below.
Information Requirement Model
Following the e-Document engineering method, we have analysed the goals and requirements for the pilot to create a Business Activity Registration Request. We have collected a set of information requirement models that specify the semantics we have to exchange using the new e-Document.
The information requirement model is captured in a spreadsheet containing the goals, scope, requirements and information models. We use that model to populate the Pilot e-Document spreadsheet, from which the XSD Schemas are generated.
ISA Core Vocabularies
(caveat: this spreadsheet is not official)
The OpenOffice UBL NDR 2.1 spreadsheet with the ISA Core Vocabularies contains reusable components for our e-Document. In our pilot, we use the ISA Core Vocabularies as the common classes of the final document for maximum reuse. The current pilot does not require additional classes, so there will be one main document schema for the Business Activity Registration Request e-Document and two XSD Schemas for the ISA Core Vocabulary aggregate and basic components.
The set of XSD Schemas from the UBL 2.1 Standard is used as the layout and as a library of reusable components in the pilot.
OpenOffice UBL NDR 2.1 spreadsheet with the e-Document model and its aggregated components.
This spreadsheet has to contain two sheets, one with the e-Document model and another with the common components for the new e-Document.
The main document sheet contains data elements derived from the Information Requirement Model, and the common components sheet contains the ISA Core Vocabularies.
STEP 1 - Create the spreadsheet with syntax bindings to the Core Vocabularies and UBL
The first step is to create the spreadsheet from which the e-Document format will be generated. The file has to be populated from the information requirement model created in previous stages of the methodology (see e-Document Engineering Methods - Template Activity_Registration).
In this step, we map information requirements to existing ISA Core Vocabulary elements or UBL aggregates when possible. If a concept exists neither in the UBL library nor in the ISA Core Vocabularies, we can create it in the BusinessActivityRegistrationRequest common sheet. In our pilot, we do not need to create additional classes in the common sheet.
The Pilot e-Document file has to follow the UBL NDR metadata model. The elements captured in UBL are based on the ebXML Core Components Technical Specification (CCTS) Version 2.01. Each sheet has the following columns:
- Component Name – The UBL Component name is derived from the Dictionary Entry Name according to the UBL Naming and Design Rules. This will be the name of the XML Tag.
- Dictionary Entry Name – The unique official name of the Business Information Entity (BIE) in the data dictionary. Its structure is based on ISO/IEC 11179.
- Object Class – Represents the logical data grouping or aggregation (in a logical data model) to which a Property belongs. Object Classes have explicit boundaries and meaning, and their Properties and behaviour follow the same rules.
Each Object Class is an Aggregate BIE (ABIE). Object Classes are also referred to as Re-usable Types. In UBL, a document type is also an ABIE, which means that the Object Class is the same for all properties of the Business Activity Registration Request e-Document.
- Property Term Qualifier – Property Term Qualifiers specialize or modify the Property Term, for example when the BIE is used in another context. In our case, when reusing a class like the UBL Party class, we add the qualifier Requesting to specify the type of Party.
- Property Term – Property Term represents the distinguishing characteristic or Property of the Object Class and “shall occur naturally in the definition.” It is also known as an attribute. The combination of Object Class and its Property Term should give the basic semantic meaning of the item.
- Representation Term – The element of the name that describes the form in which the property is represented.
- Data Type – The data type distinguishes the lexical constraints on an item’s value, plus any supplemental pieces of distinguishing information. Unqualified data types in UBL are based on UN/CEFACT ebXML CCTS core component types.
- Associated Object Class – This is the object class at the other end of the association. It is an ABIE in this model. We associate properties to classes from other vocabularies such as the ISA Core Vocabularies or the UBL Common Library. For instance, we reuse the class CvBusiness from the ISA Core Vocabularies to identify “the legal entity requested for registration”, and the class Party from the UBL Common Library to identify the “Party requesting the registration”.
- Alternative Business Terms – Optional. One or more synonyms by which the Business Information Entity is commonly known and used in a specific Context. A Business Information Entity may have several Business Terms or synonyms. These may be used to map BIEs to a controlled vocabulary, to other vocabularies, or to labels for forms presentation.
- Component Type – Following the CCTS, there are three BIE types:
  - Basic BIE (BBIE),
  - Association BIE (ASBIE; “an association”), and
  - Aggregate BIE (ABIE; “an aggregate”).
- Definition – This is the unique semantic business meaning of the Business Information Entity. We use the definitions described in the previous phase of the project.
- Cardinality – The cardinality of the element, defined as indicated in the information requirements model.
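As an illustration of how the Component Name relates to the other columns, the sketch below derives a Dictionary Entry Name and an XML tag name from the Object Class, Property Term Qualifier, Property Term and Representation Term. It is a simplified reading of the UBL Naming and Design Rules, not the logic of the Crane package. The function names are hypothetical, and only three conventions are modelled: a Representation Term that repeats the end of the Property Term is not duplicated, a Text Representation Term is omitted, and Identifier is abbreviated to ID.

```python
def dictionary_entry_name(object_class, property_term, representation_term, qualifier=""):
    """Build 'Object Class. Qualifier_ Property Term. Representation Term'."""
    prop = f"{qualifier}_ {property_term}" if qualifier else property_term
    return f"{object_class}. {prop}. {representation_term}"


def component_name(property_term, representation_term="", qualifier=""):
    """Derive an UpperCamelCase tag name (simplified, assumed rules)."""
    words = f"{qualifier} {property_term}".split()
    rep = representation_term.split()
    # Drop the representation term when it is 'Text' or when it merely
    # repeats the last word(s) of the property term.
    if rep and (rep == ["Text"] or words[-len(rep):] == rep):
        rep = []
    words += rep
    # 'Identifier' is abbreviated to 'ID' in tag names.
    words = ["ID" if word == "Identifier" else word for word in words]
    # Capitalise only the first letter so acronyms such as 'UBL' survive.
    return "".join(word[:1].upper() + word[1:] for word in words)


print(dictionary_entry_name("Business Activity Registration Request",
                            "Party", "Party", qualifier="Requesting"))
print(component_name("Party", "Party", qualifier="Requesting"))
print(component_name("Customization Identifier", "Identifier"))
```

For the Requesting Party association this yields the tag name RequestingParty, and for the Customization Identifier it yields CustomizationID. The full rules cover more cases, so the names produced by the checker and generator stylesheets remain authoritative.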
The e-Document sheet has to be filled in following the requirements from the Information Requirement Model:
- Defining the simple information requirements as properties of the new e-Document,
- Identifying reusable components from the ISA Core Vocabularies and UBL common aggregate library and creating new aggregates when needed.
In the process of populating the spreadsheet, some concepts from the information requirement model can be grouped and mapped to the ISA Core Vocabularies:
- For the Business Activity Registration Request information requirement model we have reused the Company Activity class from the ISA Core Vocabularies.
- For the Requesting Party, we have reused the UBL Party class.
- For the Business Legal Form information requirement we have reused the CvBusiness ISA Core Vocabulary class.
We have also reused the UBL Version and document metadata basic information entities such as the Customization Identifier and the Profile Identifier, commonly used in UBL Schemas.
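Before exporting, it can help to check that each row fills in the columns its Component Type needs. The sketch below is one possible pre-check, written under stated assumptions: rows are represented as plain dictionaries keyed by the column names listed above, the required columns per type are our own reading of the column descriptions, and the cardinality notation (1, 0..1, 0..n, 1..n) and the sample cardinality values are assumed rather than taken from the pilot spreadsheet.

```python
# Assumed cardinality notation; adjust to the notation used in the spreadsheet.
ALLOWED_CARDINALITIES = {"1", "0..1", "0..n", "1..n"}

# Columns we assume must be filled in for each Component Type.
REQUIRED_BY_TYPE = {
    "ABIE": ["Object Class"],
    "BBIE": ["Object Class", "Property Term", "Representation Term",
             "Data Type", "Cardinality"],
    "ASBIE": ["Object Class", "Property Term", "Associated Object Class",
              "Cardinality"],
}


def check_row(row):
    """Return a list of human-readable problems found in one spreadsheet row."""
    kind = row.get("Component Type", "")
    if kind not in REQUIRED_BY_TYPE:
        return [f"unknown Component Type: {kind!r}"]
    problems = []
    for column in REQUIRED_BY_TYPE[kind]:
        if not row.get(column, "").strip():
            problems.append(f"{kind} row is missing {column}")
    cardinality = row.get("Cardinality", "").strip()
    if cardinality and cardinality not in ALLOWED_CARDINALITIES:
        problems.append(f"unexpected Cardinality: {cardinality!r}")
    return problems


# Hypothetical row for the reused UBL Party class.
requesting_party = {
    "Component Type": "ASBIE",
    "Object Class": "Business Activity Registration Request",
    "Property Term Qualifier": "Requesting",
    "Property Term": "Party",
    "Associated Object Class": "Party",
    "Cardinality": "1",
}
# Hypothetical BBIE row with the Data Type left empty on purpose.
incomplete_identifier = {
    "Component Type": "BBIE",
    "Object Class": "Business Activity Registration Request",
    "Property Term": "Customization Identifier",
    "Representation Term": "Identifier",
    "Cardinality": "0..1",
}

print(check_row(requesting_party))
print(check_row(incomplete_identifier))
```

A check of this kind does not replace the Crane-ublndrChecker.xsl stylesheet; it only catches empty cells early, while the spreadsheet is still being edited.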
STEP 2 - Set up the export filter in OpenOffice
The OpenOffice 3 file has to be exported to a Genericode file.
To do this, install the open-source OpenOffice export filter, which serializes the contents of the spreadsheet as a set of Genericode rows.
The Readme file provided with the package describes the installation in full. The steps are summarised below:
2.1. Uninstall any previously installed version of the filter
It is recommended to uninstall any old version of the filter before installing a new one. See the Readme of the export filter to learn how to uninstall it.
2.2. Install the filter
Start OpenOffice 3 and open a new document or spreadsheet. Click the menu item "Tools / XML Filter Settings..." to get to the filter dialogue.
To install the filter, press the button "Open Package..." without regard for any existing filter that may happen to be selected.
Navigate to the directory in which you unzipped the distribution file and select Crane-gcExportSubset.jar. This adds the "UBL NDR to Genericode SimpleCodeList by Crane Softwrights Ltd" filter to your installation, and OpenOffice reports successful operation.
The filter is now installed in OpenOffice.
STEP 3 - Export a Genericode Subset file from OpenOffice
Once the spreadsheet model is correct, we use the OpenOffice export filter from Crane Softwrights to produce a Genericode file from the OpenOffice document. Use the “File / Export…” function to open the export dialogue.
It is recommended to use the “.xml” extension for the exported file.
In our pilot, we have created the BusinessActivityRegistrationRequest-rows.xml file.
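To inspect what the export produced, the Genericode rows can be read back with a few lines of Python. The sketch below assumes the usual Genericode layout, in which each Row holds Value elements that carry a ColumnRef attribute and a SimpleValue child. The column identifiers in the sample (ComponentName, ComponentType) are hypothetical; the real identifiers are whatever the export filter writes. Element names are matched by local name, so the sketch does not depend on how namespaces are declared.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def rows_from_root(root):
    """Return one dict per Genericode Row, mapping ColumnRef to its value."""
    rows = []
    for element in root.iter():
        if local_name(element.tag) != "Row":
            continue
        row = {}
        for value in element:
            if local_name(value.tag) != "Value":
                continue
            simple = next(
                (child for child in value if local_name(child.tag) == "SimpleValue"),
                None,
            )
            text = simple.text if simple is not None and simple.text else ""
            row[value.get("ColumnRef")] = text.strip()
        rows.append(row)
    return rows


# Minimal hand-written sample; column identifiers are assumptions.
SAMPLE = """
<gc:CodeList xmlns:gc="http://docs.oasis-open.org/codelist/ns/genericode/1.0/">
  <SimpleCodeList>
    <Row>
      <Value ColumnRef="ComponentName"><SimpleValue>RequestingParty</SimpleValue></Value>
      <Value ColumnRef="ComponentType"><SimpleValue>ASBIE</SimpleValue></Value>
    </Row>
  </SimpleCodeList>
</gc:CodeList>
"""

print(rows_from_root(ET.fromstring(SAMPLE)))
```

For the exported file itself, the same function can be applied to ET.parse("BusinessActivityRegistrationRequest-rows.xml").getroot().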
STEP 4 - Set up the configuration file
Now we have to set up the configuration file used to run the XSD production script.
Two files are involved: a launch file and a configuration file. We have created our own versions of both to run the production script.
- createbarr.bat / createbarr.sh – File to launch the generation process.
- config-barr.xml – File with the setup data to generate the XSD schemas.
4.1. Launching file
The createbarr launch file contains the following commands (shown here in their shell form):
echo ISA Programme additional documents...
java -jar ../saxon9he.jar -s:mod/BusinessActivityRegistrationRequest-Entities.gc -xsl:../Crane-ublndrChecker.xsl -o:junk.out configuration-uri=../config-barr.xml common-config-uri=../config-ubl-2.1.xml common-gc-uri=UBL-Entities-2.1-PRD2-fix.gc
if [ "$?" != "0" ]; then exit ; fi
echo ...building business activity registration request document...
java -jar ../saxon9he.jar -s:mod/BusinessActivityRegistrationRequest-Entities.gc -xsl:../Crane-gc2ublndr.xsl -o:junk.out qdt-as-cva=yes configuration-uri=../config-barr.xml common-config-uri=../config-ubl-2.1.xml common-gc-uri=UBL-Entities-2.1-PRD2-fix.gc aabie-prefix=barr
It uses the Saxon XSLT processor (saxon9he.jar), run with Java, to build the XSD Schemas from the BusinessActivityRegistrationRequest Genericode file and the config-barr.xml configuration file.
Crane-ublndrChecker.xsl is the XSLT stylesheet that checks that the UBL Naming and Design Rules are properly applied in the Genericode file.
The config-ubl-2.1.xml and the UBL-Entities-2.1-PRD2-fix.gc files are used to locate the UBL common vocabulary.
Crane-gc2ublndr.xsl is the XSLT stylesheet that converts the Genericode file to an XSD Schema following the UBL Naming and Design Rules.
The launch file first checks the Genericode file provided as input and then creates the schema according to the configuration, using the “barr” namespace prefix.
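The check-then-generate sequence of the launch file can also be driven from Python, which behaves the same on Windows and Unix. The sketch below is an assumed equivalent, not part of the Crane package: it reuses the Saxon arguments exactly as they appear in the launch file, and it assumes the script is started from the same working directory, since all paths are relative. The names saxon_command and build are hypothetical.

```python
import subprocess

GENERICODE = "mod/BusinessActivityRegistrationRequest-Entities.gc"

# Stylesheet parameters shared by the check and the generation run.
SHARED_PARAMS = [
    "configuration-uri=../config-barr.xml",
    "common-config-uri=../config-ubl-2.1.xml",
    "common-gc-uri=UBL-Entities-2.1-PRD2-fix.gc",
]


def saxon_command(stylesheet, params):
    """Assemble the java command line for one Saxon transformation."""
    return ["java", "-jar", "../saxon9he.jar", f"-s:{GENERICODE}",
            f"-xsl:{stylesheet}", "-o:junk.out", *params]


CHECK_COMMAND = saxon_command("../Crane-ublndrChecker.xsl", SHARED_PARAMS)
GENERATE_COMMAND = saxon_command(
    "../Crane-gc2ublndr.xsl",
    ["qdt-as-cva=yes", *SHARED_PARAMS, "aabie-prefix=barr"],
)


def build(run=subprocess.run):
    """Run the NDR check; generate the schemas only if the check succeeds."""
    check = run(CHECK_COMMAND)
    if check.returncode != 0:
        return check.returncode  # stop here, as the launch file does
    return run(GENERATE_COMMAND).returncode


if __name__ == "__main__":
    print(" ".join(CHECK_COMMAND))
    print(" ".join(GENERATE_COMMAND))
```

Calling build() executes both steps and returns the exit code of the step that ran last, so a failed check is reported to the caller instead of being silently ignored.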
4.2. Configuration file
The configuration file is called config-barr.xml:
<?xml version="1.0" encoding="UTF-8"?>
Test copyright statement.
<dir name="xsd" runtime-name="xsdrt">
<file type="AABIE" name="BusinessActivityRegistrationRequest.xsd"
prefix="barr" sabie-prefix="cva" sbbie-prefix="cvc"
Library: Business Activity Registration Request document
Generated on: %z
<file type="SABIE" name="CoreVocabularyAggregateComponents.xsd"
Library: Core Vocabulary Common Aggregate Components
Generated on: %z
<file type="SBBIE" name="CoreVocabularyBasicComponents.xsd"
Library: Core Vocabulary Common Basic Components
Generated on: %z
The configuration file has the following sections:
- A copyright section where the copyright statement to be added in the XSD files can be defined.
- A documentation section with a structure following the CCTS UN/CEFACT documentation structure to create documented schemas.
- A directory section that names the directories in which the generated files are written.
- A file section, repeated for each file that has to be generated. Each file instruction has the name of the file, its type, its namespace and the namespace prefix used in the Schema.
STEP 5 - Run the Genericode-to-UBL-NDR script
The last step consists of generating the XSD Schemas themselves.
To do so, run the launch file (createbarr.bat or createbarr.sh).
When there are no errors in the checking phase, the script generates the following XSD Schemas:
- xsd/mydoc/BusinessActivityRegistrationRequest.xsd – Main document XSD Schema with comments following the CCTS.
- xsd/mydoc/CoreVocabularyAggregateComponents.xsd – Core Vocabulary XSD of aggregated components with comments following the CCTS.
- xsd/mydoc/CoreVocabularyBasicComponents.xsd – Core Vocabulary XSD of basic components with comments following the CCTS.
- xsdrt/mydoc/BusinessActivityRegistrationRequest.xsd – Main document XSD Schema without comments.
- xsdrt/mydoc/CoreVocabularyAggregateComponents.xsd – Core Vocabulary XSD of aggregated components without comments.
- xsdrt/mydoc/CoreVocabularyBasicComponents.xsd – Core Vocabulary XSD of basic components without comments.
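A quick way to confirm that a run produced all six files is to test for their presence. The sketch below is a minimal example under the assumption that the paths listed above are relative to the directory in which the launch file was run; the function names are hypothetical.

```python
from pathlib import Path

SCHEMA_NAMES = [
    "BusinessActivityRegistrationRequest.xsd",
    "CoreVocabularyAggregateComponents.xsd",
    "CoreVocabularyBasicComponents.xsd",
]


def expected_outputs(base_dir="."):
    """List the documented (xsd) and runtime (xsdrt) schema paths."""
    base = Path(base_dir)
    return [base / top / "mydoc" / name
            for top in ("xsd", "xsdrt")
            for name in SCHEMA_NAMES]


def missing_outputs(base_dir="."):
    """Return the expected schema files that do not exist yet."""
    return [path for path in expected_outputs(base_dir) if not path.is_file()]


if __name__ == "__main__":
    for path in missing_outputs():
        print(f"missing: {path}")
```

An empty result from missing_outputs() only shows that the files exist; whether they are valid schemas still has to be verified with an XSD-aware tool.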
The XSD Schemas follow the UBL naming and design rules.
Nature of documentation: Technical report