Cheng - Research Projects

Mobility

  • TraKuLa: Success Factors for Chinese-German Research Cooperation
    Cultural barriers often complicate collaboration in Chinese-German projects. How can the partners learn from one another and make their joint work more successful? What new insights emerge from transcultural collaboration? How is the path toward innovation developing in Germany and China? These questions are addressed by the new interdisciplinary research network TraKuLa (Transkultureller Lern- und Kompetenzansatz) at Leibniz Universität Hannover, funded by the Lower Saxony Ministry of Science and Culture.
    Lead: Monika Sester
    Team: Hao Cheng, Yu Feng
    Year: 2019
    Funding: MWK
    Duration: 2019-2022
  • Deep Learning of Behavior in Street Space, Especially in Shared Spaces
    The project investigates the behavior of different road users in unregulated spaces, i.e., spaces open to all road users. Existing approaches assume a given motion model that describes both the individual behavior and the interactive behavior of different road users.
    Team: Cheng, Sester
    Year: 2018
    Funding: DFG Research Training Group SocialCars
    Duration: 2014-2023

Master's Theses (completed)

  • Design, Implementation and Evaluation of a New Machine Learning Approach for Behavior Prediction Based on LSTM and KDE
    Today's vehicles are often already equipped with driver assistance systems that relieve the driver to a certain degree and contribute to greater comfort. In the context of automated driving, however, some limitations can be observed, particularly in cases where the automated vehicle must anticipate uncomfortable or risky situations. For this reason, numerous works have proposed extending automated driving systems with a prediction mechanism that can predict the driving behavior of surrounding vehicles. For example, probabilistic position prediction approaches have been developed and proposed. With the help of these predictions, automated vehicles can plan their future trajectories so that they are more comfortable and safer for the occupants. In this thesis, the approach described above for predicting surrounding vehicles is extended and investigated further. First, a prediction module based on Long Short-Term Memory (LSTM), which was presented in an earlier work, is integrated. This prediction module estimates the time until a surrounding vehicle changes lanes. This information is then used to extend the baseline prediction approach (referred to as Baseline MOE), yielding the Extended MOE. The experiments show that integrating the estimated times is beneficial. To further evaluate the extended model, different feature combination strategies are explored. In total, five different variants are investigated. The Extended MOE shows slight advantages over the other models, with a median error of 0.19 m at a prediction horizon of 5 seconds. In the last part of this thesis, a non-parametric approach, Kernel Density Estimation (KDE), is used to approximate the probability density. This serves as a comparison with the parametric approach used in the baseline model, the Variational Bayesian Gaussian Mixture Model (VBGMM). This new KDE-based approach is referred to as the Local KDE approach. The idea is to estimate a local, i.e., specific, non-parametric probability density for each prediction. The experimental results show that the KDE-based approach delivers promising results, but with the drawback that the inference time increases considerably. This remains an obstacle for real-time applications.
    Lead: Hao Cheng, Philipp Otto
    Team: Ahmed Khliaa
    Year: 2021
  • A Generative Model with a Mixture Density Network for Trajectory Prediction
    Using real-world data to predict road users' trajectories in shared space is critical for many intelligent systems. The related methods are also applicable in some industrial scenarios, e.g., mobile robotics in warehouses and factories. However, a road user's movement is affected by the behaviors of its neighboring agents in different environments, so its dynamics are complex and uncertain. It is therefore challenging to predict the future trajectory of each agent effectively and accurately. This thesis builds on the generative model Dynamic Context Encoder Network (DCENet) and replaces its decoder with a mixture density network. Rather than outputting the trajectories directly, the mixture density network outputs a mixture distribution of the potential future trajectories by using the network parameters. The framework's performance is evaluated on a challenging trajectory prediction benchmark called Trajnet. Compared to DCENet, the novel framework with a mixture density network makes predictions by sampling from the distribution with the maximum mixture weight, thus increasing the chance of choosing the most likely future trajectory.
    Lead: Hao Cheng, Monika Sester, Annika Raatz
    Team: Feng He
    Year: 2021
  • Planning Highway Vehicle Trajectories Using a Generative Approach with Interaction Modeling
    Highways are full of dynamic situations, such as high-speed traffic and unexpected lane changes. A driver needs to pay close attention to the neighboring vehicles when driving on the highway, which may cause fatigue on long drives. Accurately predicting the future trajectories of surrounding vehicles can help drivers improve their safety and comfort. This thesis provides an interactive model based on a Conditional Variational Autoencoder that predicts the future trajectory, potential lane changes, and braking situations of surrounding vehicles. The proposed model works as follows. In the training phase, the future trajectory of the central ego vehicle is given to the model. This future trajectory is then combined with a grid-based interaction module that considers the relative motion of the neighboring vehicles. Finally, the combined information is used to predict the probability distribution of the neighboring vehicles' driving maneuvers and trajectories. In the inference phase, the trained model cannot access the future trajectory of the ego vehicle. Only a trajectory predicted from the ego vehicle's past trajectory is provided to the model. The model's performance is evaluated on two highway datasets, HighD and NGSIM. The errors measured by root mean square error (RMSE) are 1.52 m on HighD and 5.55 m on NGSIM, slightly worse than the state-of-the-art model PiP.
    Lead: Hao Cheng, Bodo Rosenhahn, Monika Sester
    Team: Mengran Fan
    Year: 2021
  • Trajectory Forecasting with Semantic Scene Information
    Trajectory prediction, an integral part of the autonomous driving system, has been researched for decades. In complex traffic situations, a multi-path forecast provides many candidate paths, whereas a single-path forecast provides only one. For the multi-path prediction task, three kinds of information are generally required: 1) ego-motion information, 2) dynamic interaction information, and 3) scene information. Dynamic information represents the interaction between a target agent and other agents. Scene information represents the environment information that constrains the movement of road users. However, scenario-specific information can also easily make a model too adaptive to a particular scenario and jeopardize the model's ability to generalize to new scenes. The objective of the thesis is to validate whether a proposed scene branch with semantic maps, a kind of scene information, performs well for multiple-trajectory prediction on the inD benchmark. The scene branch is added to DCENet, which is based on a Conditional Variational Autoencoder and has superior performance on the inD benchmark. This thesis uses DCENet as the baseline model. Instead of specific scene information, semantic maps are taken as input to extend the DCENet model, in order to avoid the overfitting problem caused by certain scene information. The semantic maps consist of eight channels that indicate different areas in the scenes. Moreover, a visual-attention module is incorporated into the scene branch to extract scene contexts. The visual-attention module contains single-source and multi-source attention modules used to extract local and global scene contexts, respectively. In addition, two architecture variations of the scene branch are proposed. In both variations, the semantic maps are first processed by a convolutional neural network (CNN)-based backbone. The first architecture merges the feature maps from the CNN with the visual-attention module and a Long Short-Term Memory (LSTM) module. The second architecture processes the feature maps from the CNN in two stages. The first stage adopts a depthwise separable convolutional layer to merge the feature maps linearly. The second stage passes the combined feature maps into the visual-attention module and an LSTM module. The results on the inD benchmark show that the second architecture variation with two time steps of LSTM performs best among all tested configurations of the two architecture variations. However, the experimental results from both architecture variations of the scene branch are not as good as those of the baseline model DCENet. The inferior results might be due to the imbalance of data diversity across the inD benchmark and the lack of detail in the semantic maps, even though data balancing and augmentation were applied.
    Lead: Hao Cheng, Bodo Rosenhahn, Monika Sester
    Team: Fengxiang Hu
    Year: 2021
  • Traffic Control Recognition Using Speed-Profiles and Image Data
    Intersection regulator rules are among the most important elements influencing traffic flow and route choice. As road networks become increasingly complex, commuters often need up-to-date road traffic conditions from digital maps. Because having surveying and mapping personnel inspect and record the latest intersection regulations to keep digital maps updated is highly time-consuming and costly, a large body of research seeks an automated way to detect intersection regulators. Recently, studies have used GPS tracks and intersection maps for regulator detection and have achieved good results. This thesis designs a generative model based on a Conditional Variational Autoencoder (CVAE) to predict intersection regulations. To fully extract the temporal features of GPS signals, we use a self-attention mechanism in our network. We assume that satellite images contain information such as intersection geometry and environmental scene context, which is beneficial for regulator detection at the intersection. We therefore seek to combine GPS tracks with high-resolution satellite images extracted from Google Maps using the coordinates of the given intersection. We designed three feature combination methods for our network to effectively integrate image features with GPS signals. We conducted experiments on the four networks we designed and compared them to a state-of-the-art method based on a CVAE model. The results show that, on average, our network with a self-attention mechanism scores 3% higher than the CVAE model as measured by the F1 score. This shows that our network can extract the temporal features of GPS signals. The results of the other three models, which combine GPS tracks with high-resolution satellite images, are not satisfactory. To find the problem, we tested multiple sets of dropout rates to verify the effectiveness of the image feature extraction network. The results show that the image feature extraction network structure we selected cannot effectively extract the features needed for predicting intersection regulations. New network structures for extracting image features need to be explored in the future.
    Lead: Hao Cheng, Markus Fidler, Monika Sester
    Team: Haoran Lei
    Year: 2021
  • Predicting Interaction Between Vehicles and Vulnerable Road Users at a Right-Turn Intersection
    In real-world traffic situations, interactions between vehicles and vulnerable road users (VRUs) occur frequently, and injuries caused by vehicles to VRUs account for a large proportion of traffic accidents. Therefore, research on the interaction between vehicles and VRUs is important for traffic safety. This thesis aims to automatically classify the level of risk in the interactions between a target vehicle and the involved VRUs. A Conditional Variational Autoencoder (CVAE)-based deep generative model uses the vehicles' position, velocity, and orientation to predict their future trajectories. Then, potential interactions between the vehicles and VRUs based on the predicted trajectories are classified into five risk levels: collision, serious conflict, slight conflict, potential conflict, and undisturbed passage. The traffic data, called KoW, were collected by a stationary camera at a busy intersection in Germany. This thesis uses a camera calibration method to transform the view angle of the video data. It obtains the real-world positions of the road users using the deep learning model YOLOv4 for object detection and DeepSORT for tracking. The intersection area is divided into a preparation area and a conflict area based on the road markings. The trajectory of the approaching vehicle in the preparation area is used as the input of the prediction model for predicting its trajectory in the conflict area. The model achieved an average accuracy of 90% for the risk classification task on the KoW dataset. Another dataset called AIM is used to verify the generality of the model, on which it reached an average classification accuracy of 97%.
    Lead: Hao Cheng, Bodo Rosenhahn, Monika Sester
    Team: Zhentao Jian
    Year: 2021
  • Interaction Classification Between Vehicles and Vulnerable Road Users at a Right-Turn Intersection
    Automatic interaction classification between vehicles and vulnerable road users at intersections is critical for traffic safety and autonomous driving. To this end, this thesis proposes a Conditional Variational Auto-Encoder (CVAE) classifier that uses motion information captured by applying dense optical flow and object information extracted by a state-of-the-art object detector. Built on the CVAE system, this thesis uses convolutional and recurrent neural networks to learn spatiotemporal features from the motion and object information. To train the classifier, a large real-world dataset is manually labeled from traffic video recordings collected at a busy intersection. In addition, this thesis applies a self-attention mechanism to enable the model to learn the weights between frame-level probabilities, which enhances the performance of the classifier. Furthermore, a sequence-to-sequence model is taken as the baseline model. Compared with the baseline model, the CVAE model using the padding method with the attention mechanism achieves the highest classification accuracy and the fewest false negative detections in the empirical results.
    Lead: Bodo Rosenhahn, Monika Sester, Hao Cheng
    Team: Li Feng
    Year: 2020
  • Multi-Path Prediction of Mixed Traffic Trajectories in Shared Spaces
    In shared spaces, road signs, signals, and markings are removed to allow mixed traffic to interact directly with each other. The traffic engineer Reid defined a shared space as a street that encourages pedestrian movement and reduces the dominance of vehicles without explicit traffic rules. All users have to follow informal social protocols and negotiate to use the road resources and avoid potential collisions. The lack of regulations makes interactions between multimodal road users more complex than in conventional designs. With the availability of large-scale datasets and the development of deep learning techniques in sequence modeling and prediction, deep learning approaches are widely used for trajectory prediction.
    Lead: Hao Cheng, Prof. Sester and Prof. Fidler
    Team: Xinlong Han
    Year: 2019
  • Scene Context-Aware Trajectory Prediction in Shared Space
    In shared spaces, road signs, signals, and markings are removed to allow mixed traffic to interact directly with each other. At a micro level, understanding how road users behave and how their behavior can be foreseen after a short observation time is crucial to intent detection, autonomous driving, and traffic management in shared spaces.
    Lead: Hao Cheng, Prof. Sester and Prof. Fidler
    Team: Rui Liu
    Year: 2019
  • Residual Learning for Mixed Traffic Prediction in Shared Space
    In recent years, with the increased availability of computational power and large-scale datasets, data-driven approaches, especially Deep Learning approaches, have been widely used for trajectory modeling. Nevertheless, predicting mixed traffic trajectories in shared space is not trivial.
    Lead: Hao Cheng, Prof. Sester and Prof. Fidler
    Team: Yuhao Zhang
    Year: 2019
  • A Study of State-of-the-Art DL Methods for Mixed Traffic Trajectory Prediction
    In recent years, with the increased availability of computational power and large-scale datasets, data-driven approaches, especially Deep Learning (DL) approaches, have been widely used for trajectory modeling. The performance of pedestrian trajectory prediction in crowded spaces has improved year by year with state-of-the-art methods such as Social-LSTM (Alahi et al., 2016), CVAE (Lee et al., 2017), and Social-GAN (Gupta et al., 2018). The goal of this master thesis is to apply such state-of-the-art DL approaches to trajectory prediction with mixed traffic agents in a more challenging environment, shared space, and to compare their performance.
    Lead: Hao Cheng, Prof. Sester and Prof. Fidler
    Team: Xin Xu
    Year: 2019
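
Several of the theses listed above report prediction quality as a distance error in metres; the highway trajectory thesis, for instance, gives root mean square error (RMSE) values on HighD and NGSIM. As a rough illustration of what such a figure measures, the sketch below computes the RMSE between one predicted and one ground-truth trajectory. It is not taken from any of the theses: the function name `rmse`, the representation of a trajectory as a list of (x, y) points, and the toy coordinates are all assumptions made for this example, and the theses may aggregate errors differently (for example per prediction horizon or over many vehicles).

```python
import math


def rmse(predicted, ground_truth):
    """Root mean square error between two trajectories.

    Each trajectory is a list of (x, y) positions in metres, one per
    time step (an assumed format chosen for this illustration).
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Squared Euclidean displacement at each time step.
    squared_errors = [
        (px - gx) ** 2 + (py - gy) ** 2
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    # Mean over time steps, then square root.
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy data: the prediction is offset by (3 m, 4 m) at every step,
# so each step is 5 m away from the ground truth.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(3.0, 4.0), (4.0, 4.0), (5.0, 4.0)]
error = rmse(pred, truth)
print(f"RMSE: {error:.2f} m")
```

Because the error is squared before averaging, a few large deviations weigh more heavily in the RMSE than in a median error such as the one reported for the Extended MOE model.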