We use the latest sensor technologies to measure and reconstruct 3D scenes precisely. At the core of all our projects is a profound understanding of these technologies, combined with our field-proven techniques for calibration, registration, real-time stereo matching and tracking.


We employ a wide range of techniques, such as pruning, hashing, quantization and distillation, to increase run-time performance while maintaining accuracy and robustness. All our in-cabin monitoring algorithms are optimized to run in real time on low-power embedded architectures with minimal memory use and energy consumption.
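As a rough illustration of two of the techniques named above, the following sketch applies magnitude-based pruning and symmetric 8-bit quantization to a plain list of weights. It is a minimal example under stated assumptions, not our production pipeline: the function names (`prune_by_magnitude`, `quantize_int8`, `dequantize`) and the choice of a symmetric linear scheme are hypothetical and chosen only for this illustration.

```python
def prune_by_magnitude(weights, fraction):
    """Zero out the given fraction of weights with the smallest magnitude."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    n_prune = int(len(weights) * fraction)
    # Indices sorted by ascending magnitude; the first n_prune are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric linear quantization to integers in [-127, 127].

    Returns the integer codes and the scale needed to map them back.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.5, -0.01, 0.25, 0.02, -1.0, 0.0]
    sparse = prune_by_magnitude(weights, fraction=0.5)
    codes, scale = quantize_int8(sparse)
    print("pruned:   ", sparse)
    print("int8:     ", codes)
    print("restored: ", dequantize(codes, scale))
```

Pruning removes weights that contribute little, and quantization stores the remainder in fewer bits; the reconstruction error of each weight is bounded by half the quantization step.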


We apply various methods to generate large data sets for training, evaluating and validating our human analysis algorithms. We use annotated real data (e.g. annotated in a semi-supervised manner) as well as synthetic data generated with computer graphics.


Within the domain of computer vision, interest point detection usually refers to the detection of points that are mathematically well defined and provide rich structure within their local neighborhood. These properties generally ensure that the interest points are stable against variations (e.g. viewpoint changes and/or structural changes in the observed scene) and can therefore serve as meaningful input for higher-level applications.
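To make the notion of a "mathematically well defined" interest point concrete, the sketch below computes the classical Harris corner response on a small synthetic image. This is a textbook detector shown purely for illustration, not the learned framework described on this page; the function names, the 3x3 window and the constant `k = 0.05` are assumptions made for this example.

```python
def harris_response(img, k=0.05):
    """Harris corner response for a 2D list of gray values.

    Positive values indicate corner-like structure, negative values
    indicate edges, and values near zero indicate flat regions.
    """
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    # Central-difference gradients; border pixels keep a zero gradient.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0

    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Structure tensor summed over a 3x3 neighborhood.
            sxx = syy = sxy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            resp[y][x] = det - k * trace * trace
    return resp


def strongest_point(resp):
    """Return (row, column) of the largest response."""
    h, w = len(resp), len(resp[0])
    return max(((y, x) for y in range(h) for x in range(w)),
               key=lambda p: resp[p[0]][p[1]])


if __name__ == "__main__":
    # 12x12 test image: a bright square in the lower-right quadrant.
    image = [[1.0 if (y >= 6 and x >= 6) else 0.0 for x in range(12)]
             for y in range(12)]
    response = harris_response(image)
    print("strongest interest point at", strongest_point(response))
```

The response depends only on the gradients in a small neighborhood, which is why such hand-crafted definitions are limited to simple structures such as corners.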

We developed a highly optimized machine learning framework for interest point detection in which the precise mathematical formulation of the interest point, based on its local neighborhood, can be of any complexity. The point is not described using the local neighborhood within a single image. Instead, the scene configuration itself is varied extensively, and the common properties of the point’s position are learned over all these configurations. For each training image, feature pyramids are built over several input image resolutions so that semantic information around interest points is learned more effectively while spatial resolution is preserved. Both kinds of information are needed for precise description and localization of interest points. In a final step, the prediction model is optimized for running on edge processing devices. To this end, information that contributes little or nothing to the final prediction result is removed from the model. The remaining useful information is clustered and distilled (i.e. simpler models mimic much more complex models) in order to gain speed at the inference stage. These measures enable fast inference while still providing highly accurate results.
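The distillation step mentioned above, in which a simpler model mimics a more complex one, can be sketched with a common formulation: the student is trained to match the teacher's temperature-softened output distribution. The loss below is a minimal illustration under that assumption; the function names and the temperature value are hypothetical, and the page does not state which distillation objective the framework actually uses.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the softened teacher to the softened student.

    A higher temperature flattens both distributions, so the student
    also learns from the teacher's relative ranking of unlikely outputs.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    print("untrained student:", distillation_loss([0.0, 0.0, 0.0], teacher))
    print("close student:    ", distillation_loss([1.8, 0.6, -0.9], teacher))
```

The loss is zero when the student reproduces the teacher's outputs exactly and grows as the two distributions diverge, which gives the smaller model a training signal richer than hard labels alone.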

The framework allows very fast detection of almost any kind of interest point across a wide variety of image-based input data. It can be applied to detection as well as to pose estimation and tracking of various objects in the wild. It therefore serves as the basis for several applications where high robustness and real-time performance are required (e.g. within safety-critical systems).