
Sander van Dooren has been a solutions architect at DG DIGIT. When he joined the Commission in 2015, he expected to stay for barely three or four months. In the end, he stayed for seven years, during which his work as a talented solution manager played a major part. Now, he considers the time right to begin a new professional chapter at Digitaal Vlaanderen. But he is not saying goodbye without giving a final interview, in which he reflects on his work for Joinup, the Semantic Interoperability Community (SEMIC) and the musical nature of interoperability.
Q: After you worked on the performance of several Commission websites at the very beginning, you started to take part in Joinup, our platform for e-government professionals. How did you get involved in this?
Sander van Dooren: I became part of a team that had to migrate the old version of Joinup to a new website, a complete rebuild of the software. This was not an easy task. One of the big problems was that it harvested catalogues of solutions from Member States that constantly needed to be updated. Many Member States had their own little ‘Joinup’ website, but only within the scope of their own national territory. And their catalogues had to be translated from one format to another all the time to get the information integrated into Joinup. It was completely unmanageable. A tremendous amount of time was spent trying to find out where in all of these translations things went missing. For some bugs, attempts at a fix went on for years.
Q: How did you get around this complexity?
Sander van Dooren: By trying not to reinvent the wheel. We decided to take the data model that the Member States already used to exchange information with each other and use it as Joinup's internal data model. It was a rather tricky move, because it meant this new data model had to be similar to the existing one.
Furthermore, the technology used for the data exchange is based on the concept of linked data. Big companies like Amazon and Zalando are using linked data, but at the time there weren't a lot of existing solutions. Besides, Joinup already had a content management system that we wanted to keep. As a solution, we used a database called a triple store, which can store linked data natively. So instead of storing tables in a relational database, we stored linked data.
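To make the idea of a triple store a little more concrete, here is a minimal sketch in Python. It is a toy illustration only, not the software Joinup uses: the `TripleStore` class, the `example.org` identifier and the sample title are hypothetical. The sketch assumes that linked data can be pictured as a set of subject, predicate, object statements, and that a query is a pattern in which some positions are left open.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One linked-data statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


class TripleStore:
    """A toy in-memory triple store (illustration only)."""

    def __init__(self) -> None:
        self._triples = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        # Statements are stored as-is; there is no table schema to migrate.
        self._triples.add(Triple(subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return all triples fitting the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if subject in (None, t.subject)
            and predicate in (None, t.predicate)
            and obj in (None, t.obj)
        )


# Dublin Core property URIs; the country URI follows the EU country code list.
DCT_TITLE = "http://purl.org/dc/terms/title"
DCT_SPATIAL = "http://purl.org/dc/terms/spatial"
BELGIUM = "http://publications.europa.eu/resource/authority/country/BEL"
SOLUTION = "http://example.org/solution/1"  # hypothetical identifier

store = TripleStore()
store.add(SOLUTION, DCT_TITLE, "Example e-government solution")
store.add(SOLUTION, DCT_SPATIAL, BELGIUM)

titles = [t.obj for t in store.match(predicate=DCT_TITLE)]
from_belgium = [t.subject for t in store.match(predicate=DCT_SPATIAL, obj=BELGIUM)]
print(titles)
print(from_belgium)
```

Because every statement uses shared identifiers for its properties and values, data harvested from another catalogue that uses the same model can be added without a translation step in between.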
Q: To summarise, you tackled Joinup's translation problem in data exchange by combining linked data with a content management system. How common was this idea?
Sander van Dooren: It was a novel idea that we managed to sort of conceptualise. But it became very successful. Right now, the technology we built for Joinup is used on many Commission websites, especially to link to reference data, such as code lists of countries, languages, etc.
Q: During your time with us, we noticed how passionate you are about working on semantics and talking about it to other colleagues. What is it that you love so much about it?
Sander van Dooren: Well, if you want error-free exchanges of data between IT systems, it is really important that the systems attach the same meaning to the data. That is quite a challenge, because in practice people regularly have a different understanding of what a word or concept means.
For example, how do we understand what a public service is? And which things can be categorised into a public service? Is garbage collection organised by a private company still a public service? You have to agree on those issues. So you have to define things, you have to classify knowledge of the world into certain categories and determine where a concept begins or ends. This is what we call defining ontologies. And it is what makes this conversation about semantic models so super interesting for me. Suddenly, talking about the garbage collection as a public service becomes philosophy (laughs).
Q: Assigning meaning to almost everything must be an incredible amount of work. No wonder semantics can be so complex.
Sander van Dooren: Indeed! Imagine trying to have a quick conversation while first writing down a shared definition of every word we're going to use in it. That would be quite impossible (laughs). But then again, it's a bit what the whole IT industry thrives on. When a legacy application is rebuilt, the data transformation probably accounts for more than half of the cost of the project. Because you really have to understand how something was meant and then translate it into your new language.
Q: To what extent can interoperability overcome this problem?
Sander van Dooren: To achieve interoperability, you must at least be able to interpret how the other application interprets the data. And of course, the easiest way is to agree on the same definition. But you cannot always do that. If there are already years of work done on an existing application, you'll have to start with how concepts are understood within that system.
In the end, the most important thing is that you manage to achieve a common goal. And this capability to achieve the goal together, despite the differences, is what interoperability is all about.
Q: What would be your general vision of an Interoperable Europe? What should it look like in the future?
Sander van Dooren: That's definitely a difficult question, but I think the approach SEMIC is taking in this matter is quite a good one. It defines only the necessary core parts and doesn't overstandardise. You don't want to oblige every Member State to implement things in the same way. Europe is not the ‘United States of Europe’ or a nation by itself. There is the important principle of subsidiarity, and each Member State has its own regional context. Overstandardising would therefore increase the political risk of feeding the perception that Europe decides everything.
That's why I think the concept of core vocabularies is really interesting. It allows us to solve the problems that need to be solved, like clearing away the administrative burden when people move across borders and need to request papers in another Member State.
Q: So you would prefer a more interoperable Europe instead of a more standardised Europe?
Sander van Dooren: Yes, I think so. I can read the ID card of a Spanish citizen in Belgium. But ID cards do not necessarily have to function in an identical way.
For me, interoperability is not like a classical orchestra in which the Commission is the conductor and dictates when and how each thing should happen. Interoperability is jazz. It is a choreography where everyone coordinates by looking at each other, while the Commission is sort of the facilitator of the choreography in many cross-border projects. This kind of coordination allows us to work together in an efficient way, without needing to have exactly the same background.
"With semantics, talking about the garbage collection as a public service becomes philosophy"
Q: Looking back at your time with us, what would be an example of an improvement in semantic interoperability in the European public sector that was introduced thanks to our unit?
Sander van Dooren: The most popular specification that came from our unit is definitely DCAT-AP. Almost every government nowadays has an open data portal where you can find a spreadsheet of open data with the water quality of a river or the location of all the bus stops. DCAT-AP is a specification that helps to exchange this open data in an automatic way by describing the meaning of the data itself.
Q: Can you give an example of how DCAT-AP works?
Sander van Dooren: Imagine you have this ordinary spreadsheet with all locations of bus stops in your region. DCAT-AP gives you a way to describe a catalogue record of this dataset, so that people can find this data later on. For instance, you describe things like the title, business domain (mobility), geographical scope (e.g. Spain), publication date, etc. of this data. This way, you can build a data portal, much like a library system where you can look up the available books. These catalogues can then be connected together and you end up with a federation of catalogues. For instance, you can have a federation where data is bubbling up through the levels of administration from the city level, right up to regional, national and European level. So if you go to the European data portal, you’ll find the bus stops in Belgium and the bus stops in Bulgaria through the same portal.
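As a rough illustration of the catalogue record and the federation described above, the sketch below builds two simplified records in Python and merges two catalogues into one. It is a sketch under stated assumptions, not a validated DCAT-AP document: the helper functions `dataset_record`, `federate` and `search`, the `example.org` identifiers and the dataset titles are all hypothetical. The property names (`dct:title`, `dcat:theme`, `dct:spatial`, `dct:issued`) follow the DCAT and Dublin Core vocabularies, and the theme and country values follow the EU code lists.

```python
import json

# Code-list values from the EU data-theme and country vocabularies.
TRANSPORT = "http://publications.europa.eu/resource/authority/data-theme/TRAN"
BELGIUM = "http://publications.europa.eu/resource/authority/country/BEL"
BULGARIA = "http://publications.europa.eu/resource/authority/country/BGR"


def dataset_record(identifier, title, theme, spatial, issued):
    """Describe a dataset with a few DCAT-AP style properties (simplified)."""
    return {
        "@id": identifier,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dcat:theme": {"@id": theme},      # business domain
        "dct:spatial": {"@id": spatial},   # geographical scope
        "dct:issued": issued,              # publication date
    }


def federate(*catalogues):
    """Merge catalogues into one, keeping a single record per identifier."""
    merged = {}
    for catalogue in catalogues:
        for record in catalogue:
            merged[record["@id"]] = record
    return list(merged.values())


def search(catalogue, spatial):
    """Return the titles of all datasets with the given geographical scope."""
    return [r["dct:title"] for r in catalogue
            if r["dct:spatial"]["@id"] == spatial]


# Two hypothetical national catalogues, each holding one bus-stop dataset.
belgian_catalogue = [dataset_record(
    "http://example.org/be/bus-stops", "Bus stops (example, Belgium)",
    TRANSPORT, BELGIUM, "2022-01-15")]
bulgarian_catalogue = [dataset_record(
    "http://example.org/bg/bus-stops", "Bus stops (example, Bulgaria)",
    TRANSPORT, BULGARIA, "2022-02-01")]

# The federated portal sees both records through the same interface.
european_catalogue = federate(belgian_catalogue, bulgarian_catalogue)
print(json.dumps(european_catalogue[0], indent=2))
print(search(european_catalogue, BULGARIA))
```

Because both records use the same properties and the same code lists, the merged catalogue can be searched by country or theme without knowing which portal a record came from.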
Q: Now that we’re talking about mobility, the new professional chapter you’re opening has a lot to do with this domain too. What will you do?
Sander van Dooren: I'll be joining the unit of Digitaal Vlaanderen, where I’ll be working on the ‘smart data space’. The ‘smart’ refers to the intention to move beyond static datasets and publish real-time data of ‘things’ in the wild.
One of the first sets we'll be publishing ‘real time’ is the one of ‘hindrances in public space’, which contains things like road works. Publishing data this way will allow both companies and the government to innovate.
The nature of the job is to bring the specifications that live within SEMIC into real life. That is why I am looking forward to it. For the ‘smart data space’, I’ll even be using ‘Linked Data Event Streams’, a specification that was published last year by SEMIC. The first pioneering projects will be on mobility data and then on water quality. Then many other domains will hopefully follow.