AEGIS & the Big-Data.AI Summit

EU Project AEGIS at the Big-Data.AI Summit

Published on: 01/07/2019

The Big-Data.AI Summit, organized by BITKOM, is Europe’s leading summit for artificial intelligence and big data. More than 8,000 visitors attended the Big-Data.AI Summit 2019 on April 10th and 11th at the Station Berlin.

Christian Kaiser from VIF had the opportunity to present results of the AEGIS project on the first conference day to about 100 attendees, mostly from industry. He provided insights into the development of two concrete data-driven services for intelligent mobility on the AEGIS platform. The first service detects road damage by analysing vehicle sensor data; the second provides safety-relevant feedback to drivers. After his presentation, Christian held a series of discussions with representatives of the automotive industry who were interested in joint activities.

The abstract of Christian’s talk, as accepted by the programme committee of the Big-Data.AI Summit, is reproduced below:

The automotive domain is shifting from purely offering goods and associated services towards also exploiting data- and analytics-driven services. As a result, new players from the big data ecosystem are entering the market to co-develop data-driven solutions with established players from the automotive industry.

In our talk, we will provide insights into the development of two concrete data-driven services: detecting road damage and assessing individual driving styles. Both services were created using an analytics platform developed in the AEGIS Big Data project, funded by the European Commission under the Horizon 2020 framework programme. The AEGIS platform combines numerous components of the big data ecosystem, including Hadoop, Spark, Hops, Jupyter, TensorFlow, Hyperledger, and Elasticsearch. Besides offering integrated components for the development of data-driven applications, one of the project’s main goals is to provide easy and flexible integration of third-party data sources.

The main data source of the presented data-driven services is, unsurprisingly, vehicle measurement data, all of it recorded during normal vehicle operation on public roads. We collected the measurements using a self-developed, custom logging device built using only cheap and widely available commodity hardware. Of course, these measurements need pre-processing before they can be put to use. Our contribution will highlight the various difficulties we encountered and how we overcame them. For example, the actual position of the logging device in the vehicle must be detected in order to align the measurements collected by the accelerometer and gyroscope with the vehicle’s coordinate system.
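The alignment step mentioned above can be illustrated with a minimal sketch. It is not the project's actual procedure: it assumes that the logger's tilt can be estimated from an accelerometer reading taken while the vehicle stands still on level ground, and that the vehicle's z-axis points upward. The function names and the sample reading are hypothetical. This corrects pitch and roll only; the heading of the device would need additional information, for example from straight-line braking.

```python
import math

def rotation_to_vehicle_frame(rest_accel):
    """Build a 3x3 rotation matrix that maps the accelerometer reading taken
    at rest (device frame) onto the vehicle's vertical axis (0, 0, 1)."""
    norm = math.sqrt(sum(c * c for c in rest_accel))
    ax, ay, az = (c / norm for c in rest_accel)
    if az < -1.0 + 1e-9:
        # Device mounted upside down: a 180 degree turn about the x-axis.
        return [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
    # Rotation axis v = a x z = (ay, -ax, 0); the cosine of the angle is az.
    vx, vy = ay, -ax
    k = 1.0 / (1.0 + az)
    # Rodrigues' formula written out: R = I + [v]x + k * [v]x^2
    return [
        [1.0 - k * vy * vy, k * vx * vy, vy],
        [k * vx * vy, 1.0 - k * vx * vx, -vx],
        [-vy, vx, 1.0 - k * (vx * vx + vy * vy)],
    ]

def rotate(matrix, vector):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Hypothetical at-rest reading in m/s^2 from a logger tilted by about 30 degrees.
rest = (0.0, 4.905, 8.496)
R = rotation_to_vehicle_frame(rest)
aligned = rotate(R, rest)  # x and y components vanish, z carries the magnitude
```

Once such a matrix is known, the same `rotate` call can be applied to every accelerometer and gyroscope sample of a trip.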

The first data-driven service we will present deals with the detection of road damage. As the impact of road condition on driving safety is not negligible, this service is targeted at road maintenance companies to help them better manage their maintenance work. For this service, all available measurement data is analyzed in its entirety. Within the data, we detect “rumble events”, which indicate potential road damage. These events are then spatially aggregated and normalized by the number of measurements that cover the respective area. The result shows, for each point on the map, what percentage of measurements detected a “rumble event” at that position. Thus, it can be interpreted as the spatial probability density of road damage.
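The aggregation and normalization described above might look like the following sketch. It assumes a simple latitude/longitude grid and that each measurement already carries a flag saying whether a rumble event was detected; the cell size, the function names, and the sample coordinates are hypothetical and not taken from the AEGIS implementation.

```python
import math
from collections import Counter

def cell_of(lat, lon, cell_deg=0.001):
    """Map a position to a grid cell index (roughly 100 m at this cell size)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def rumble_share(measurements, cell_deg=0.001):
    """measurements: iterable of (lat, lon, is_rumble) tuples.
    Returns, per grid cell, the share of measurements flagged as rumble events."""
    total = Counter()
    hits = Counter()
    for lat, lon, is_rumble in measurements:
        cell = cell_of(lat, lon, cell_deg)
        total[cell] += 1
        if is_rumble:
            hits[cell] += 1
    # Normalize event counts by the number of measurements covering each cell.
    return {cell: hits[cell] / n for cell, n in total.items()}

# Hypothetical sample: three passes over one cell, one pass over another.
sample = [
    (47.0705, 15.4395, True),
    (47.0706, 15.4396, False),
    (47.0704, 15.4394, False),
    (47.0755, 15.4395, False),
]
shares = rumble_share(sample)
```

Normalizing by coverage keeps heavily travelled roads from appearing more damaged merely because more vehicles drove over them.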

The second data-driven service provides safety-relevant feedback to drivers. The goal is to provide interested drivers with insights into their individual driving style and to let them benchmark themselves against other drivers, which is meant to stimulate safer driving behavior. Again, we start by calculating events, but this time the events are not related to road conditions; instead, they describe safety-relevant situations such as harsh braking or hard cornering. In a second step, the calculated event information is enriched with weather data. This process is not as straightforward as it may seem at first sight, as weather data is usually not available for every position and time. Typically, the weather is measured only at certain locations (weather stations) every half hour or so; thus, a complex merge strategy is required. Next, we compute a risk score for each trip and driver based on events and weather. The risk score indicates the driver’s relative risk and provides prompt feedback as well as the possibility to compare with other drivers.
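To make the weather merge and the scoring more concrete, here is a deliberately simple sketch. It assumes that each event is matched to the nearest weather station and then to that station's reading closest in time, and that the risk score is a weighted event count per kilometre. The weights, weather factors, field names, and sample data are all hypothetical; the merge strategy and score actually used in the project are not specified in this abstract and are likely more elaborate.

```python
import math
from bisect import bisect_left

# Hypothetical weights, chosen for illustration only.
EVENT_WEIGHT = {"harsh_braking": 1.0, "hard_cornering": 0.8}
WEATHER_FACTOR = {"dry": 1.0, "rain": 1.5, "snow": 2.0}

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two positions in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def nearest_weather(event, stations):
    """Pick the closest station, then its reading closest in time.
    Each station holds a non-empty, time-sorted list of (t, condition)."""
    station = min(
        stations,
        key=lambda s: haversine_km(event["lat"], event["lon"], s["lat"], s["lon"]),
    )
    times = [t for t, _ in station["readings"]]
    i = bisect_left(times, event["t"])
    # Only the readings just before and just after the event can be closest.
    candidates = station["readings"][max(i - 1, 0): i + 1]
    return min(candidates, key=lambda r: abs(r[0] - event["t"]))[1]

def trip_risk(events, stations, distance_km):
    """Weighted events per kilometre, scaled by the weather at each event."""
    total = sum(
        EVENT_WEIGHT[e["kind"]] * WEATHER_FACTOR[nearest_weather(e, stations)]
        for e in events
    )
    return total / distance_km

# Hypothetical stations with half-hourly readings (t in seconds).
stations = [
    {"lat": 47.07, "lon": 15.44, "readings": [(0, "dry"), (1800, "rain")]},
    {"lat": 48.21, "lon": 16.37, "readings": [(0, "snow")]},
]
events = [
    {"lat": 47.08, "lon": 15.45, "t": 1000, "kind": "harsh_braking"},
    {"lat": 47.08, "lon": 15.45, "t": 100, "kind": "hard_cornering"},
]
score = trip_risk(events, stations, distance_km=10.0)
```

A relative score could then be derived by ranking each driver's per-trip values against those of other drivers.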