
Interview: A common model for personal data

Joinup Admin
Published on: 15/12/2009 News Archived

The SEMIC.EU repository contains several national data models for persons. The Clearing Process sets the scene for a European interoperability asset for all.

Josef El-Rayes of the Clearing Process team explains the idea behind this approach and why it is paradigmatic for the principle of harmonisation.

SEMIC.EU: What is it that makes the issue of data describing a person so interesting?

Josef El-Rayes: The information identifying a natural or physical person is one of the basic data sets in many areas of eGovernment and administration. It is not by chance that there are more implementations of data on persons than of anything else. Persons are citizens, after all. With regard to interoperability, we must find agreement on the most basic matters. There needs to be a common vocabulary to describe a person and their parameters (as far as they are relevant for administrative procedures).

With SEMIC.EU we're in the fortunate position to have several relevant and interesting implementations of the same issue and, strikingly, to have them available as assets that have reached a "mature" state. In the logic of SEMIC.EU this means that we identify a need for collaborative harmonisation. We are aiming at the development of a common European asset.

 

But does that mean that single countries must abandon their own individual models?

No. It is not legitimate to operate with coercion. SEMIC.EU doesn't have a mandate for standardisation and promotes the idea of harmonisation. To be clear, this must not be confused with competition in the sense of benchmarking. The latter would eventually eliminate all but one solution. Quite to the contrary, we follow an approach of creating something new which allows subsuming existing solutions.

During the last meeting of SEMIC.EU's Advisory Group it became apparent that no country is willing to abandon its own schema. Standardisation is clearly not on the agenda in this matter.

For persons, it is easier to find conformance in the form of a "European asset" that does not replace national assets but complements them. The current situation of several mature assets for the same data is a perfect example for SEMIC.EU's Clearing Process. Everybody is invited to participate; we encourage open discussion, especially since this is not an issue for experts only.

We expect this approach to foster the acceptance of a common solution and to increase the probability of its actual use in administrative contexts.

 

How similar or dissimilar are the present models?

Generally speaking, there are only slight differences. They concern aspects like the semantics of names: not all models include, for instance, the concept of a "middle name" or "father name". But there are no fundamental problems.

"Person" is a good example of a data model whose use is not a matter of the quality of the respective implementations. The new model must meet as many of each country's specific requirements as possible.

In other domains it is legitimate to speak of better or worse in terms of quality, especially where different technologies are concerned. This is clearly not the case with persons.

There are cases of far greater complexity than person models, criminal records being one of them. They can contain very different semantic data. Imagine that one country features the concept of "administrative offence" as opposed to "criminal offence" while another doesn't provide for this distinction. Similarly, DNA profiles are recorded for all cases in the United Kingdom, whereas Germany keeps these records for felonies only.

 

But aren't there still differences in persons between countries, e.g. with regard to gender?

This is a question of perspective: Differences in persons are solely based on considerations in data modelling whereas in other cases the information to be modelled is in itself not precisely congruent. The concept of a person is the same in Britain and in Germany.

British registries, however, use four codes for gender as opposed to three in Germany. Biologically, a German does not change, even if he or she moves to the United Kingdom. The description of their gender might, however, vary.

Addresses are a different matter: they contain different elements and can even vary with regard to their function. What exactly do they specify: delivery to a building, to an apartment, etc.? In other words: the data model refers to a similar concept but not to the exact same one.

 

What will the eventual outcome of the harmonisation process be?

We must work towards finding the "meta person" which reflects many peculiarities of European person models and thereby enables data exchange without loss of information. We must strive not to produce too much overhead in this process. Simply adding more data fields is not a viable solution.

In a nutshell, a "meta person" must be substantial enough to cover most of the information represented in European implementations of persons. At the same time, it must be as small as possible in order to allow for efficient data exchange. We are looking for the least common multiple.

A "core person" that contains the most important attributes is also conceivable. It would have to be exchangeable between all countries and have the capacity to be extended in particular cases. Except for the core attributes, all attributes would then be modelled separately and could complement the information needed in a concrete case.
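The "core person" idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name `CorePerson`, the choice of core attributes, and the `extensions` mechanism are hypothetical and not taken from any SEMIC.EU asset.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CorePerson:
    """Hypothetical core: attributes assumed to be shared by all national models."""
    given_name: str
    family_name: str
    date_of_birth: Optional[str] = None  # ISO 8601 date string, e.g. "1970-01-31"
    # Non-core attributes are modelled separately and attached only when needed.
    extensions: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def extend(self, namespace: str, attributes: Dict[str, str]) -> None:
        """Attach or update a block of non-core attributes under a namespace."""
        self.extensions.setdefault(namespace, {}).update(attributes)

    def core_view(self) -> Dict[str, Optional[str]]:
        """Return only the core attributes, i.e. the part every country exchanges."""
        return {
            "given_name": self.given_name,
            "family_name": self.family_name,
            "date_of_birth": self.date_of_birth,
        }


# Usage: the core stays small, the middle name travels as an optional extension.
person = CorePerson(given_name="Maria", family_name="Schmidt", date_of_birth="1970-01-31")
person.extend("names", {"middle_name": "Anna"})
```

A receiver that only understands the core can rely on `core_view()` and ignore the extensions, while a receiver that knows the `names` namespace loses no information.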

The goal of the harmonisation process is a "conform" asset (consisting of the European schema and the respective mappings). All conform assets are recommended by SEMIC.EU for reuse.

 

Can you give any examples of the measures taken to create a common European model for persons?

One measure might be to introduce a generic additional name instead of a specific field for a monastic name or an artist's name. The specification of the mappings is done via "semantic statements". Ontology-based mapping is usually the strategy of choice.
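As a rough sketch of the generic additional name, the following maps specific national name fields onto one generic field that keeps a qualifier. It uses a plain lookup table, not ontology-based mapping and not SEMIC.EU's "semantic statements". The field names, the table `ADDITIONAL_NAME_KINDS`, and the function `to_generic` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical national field names, each mapped to a qualifier on the generic field.
ADDITIONAL_NAME_KINDS: Dict[str, str] = {
    "monastic_name": "monastic",
    "artist_name": "artist",
    "middle_name": "middle",
    "father_name": "father",
}


def to_generic(national_record: Dict[str, str]) -> Dict[str, object]:
    """Move specific name fields into a list of (kind, value) additional names."""
    generic: Dict[str, object] = {}
    additional: List[Tuple[str, str]] = []
    for key, value in national_record.items():
        kind = ADDITIONAL_NAME_KINDS.get(key)
        if kind is None:
            generic[key] = value  # not a special name field: copy unchanged
        else:
            additional.append((kind, value))  # keep the kind so nothing is lost
    generic["additional_names"] = additional
    return generic


record = {"given_name": "Anna", "family_name": "Meier", "artist_name": "Ana M."}
result = to_generic(record)
```

Keeping the qualifier next to the value means the common schema needs only one extra field, yet a national system can still map the entry back to its own specific field.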

 

What are the next steps?

For the development of a useful common asset, statements and requirements from stakeholders in all domains of public administration are needed. Special cases are of greatest importance. Statements and cases should ideally be represented in the public forum to make the process as transparent as possible.

The mere engineering of the asset is rather straightforward. What is more demanding is meeting the specific requirements in all contexts that person models are used in.

SEMIC.EU will develop a draft for a European schema and corresponding mappings. The proposed draft will be discussed with asset owners and other interested parties.

 

For more information on the data models for personal data, see the Forum thread: Data model 'person': competing approaches