Institute of Computer Science - Graduation Theses Registry


Multihead Attention Enhanced Memory Augmented Neural Network for Multimodal Trajectory Prediction

Name

Farhan Syakir

Abstract

Autonomous driving has attracted increasing interest over the last two decades. One problem in autonomous driving that researchers are actively working on is agent trajectory prediction: predicting the future trajectories of the agents surrounding an autonomous vehicle, such as other cars, cyclists, pedestrians, and other road users. Deep learning has shown promising results on this problem. Among the various deep learning approaches applied to it are Memory Augmented Neural Networks (MANNs) and multi-head attention layers. Memory augmented neural networks have been proposed in the literature for multimodal trajectory prediction (in a model called Memory Augmented Networks for Multiple Trajectory Prediction, or MANTRA), but they do not use multi-head attention layers. Multi-head attention layers have also been investigated in the literature, but in different contexts within this research topic.
In this work we propose two models, both of which add multi-head attention layers to the memory augmented neural network model. We name them Multihead Attention Enhanced MANN 1 (MAEMANN-1) and MAEMANN-2. Like MANTRA, MAEMANN uses an AutoEncoder, a Memory Controller, and an iterative refinement module (IRM). While the AutoEncoder and Memory Controller are responsible for the memory, the IRM combines the output from the memory with input about the surrounding agents in the environment. MAEMANN-1 uses a multi-head self-attention layer in the memory network to improve future trajectory prediction by attending to multiple neighboring memories, while MAEMANN-2 uses multi-head attention in the IRM to improve its perception of surrounding agents. Our experimental results show that both MAEMANN models outperform MANTRA when tested on the KITTI dataset, where we predict 4 seconds of future trajectory given 2 seconds of past trajectory. In multimodal prediction with 5 modes, MAEMANN-1 improves the Final Displacement Error (FDE) and Average Displacement Error (ADE) at t = 4 seconds by 10.58% and 9.24%, respectively. For MAEMANN-2, the corresponding improvements in FDE and ADE are 14.39% and 13.47%.
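
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how ADE and FDE are commonly computed. It is not taken from the thesis. The function names, the toy trajectories, and the best-of-K convention (reporting the lowest error among the predicted modes) are assumptions made for illustration; the abstract does not state which multimodal convention was used.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred and truth are equal-length lists of (x, y) positions.
    ADE is the mean Euclidean distance over all time steps;
    FDE is the Euclidean distance at the final time step.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and equally long")
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    return sum(dists) / len(dists), dists[-1]


def best_of_k(preds, truth):
    """Assumed multimodal convention: lowest ADE and lowest FDE over K modes."""
    errors = [displacement_errors(p, truth) for p in preds]
    return min(e[0] for e in errors), min(e[1] for e in errors)


def relative_improvement(baseline, model):
    """Percentage by which `model` error is lower than `baseline` error."""
    return 100.0 * (baseline - model) / baseline


# Toy data (hypothetical, not from the thesis): two predicted modes.
truth = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
mode_a = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 3.0)]  # exact until the last step
mode_b = [(1.0, 1.0), (2.0, 1.0), (3.0, 1.0), (4.0, 1.0)]  # constant 1 m offset

ade, fde = best_of_k([mode_a, mode_b], truth)
print(ade, fde)  # 0.75 1.0
```

In this toy case the lowest ADE comes from the first mode and the lowest FDE from the second, which shows why the two metrics can rank predictions differently. A percentage improvement such as those quoted in the abstract would then correspond to `relative_improvement(baseline_error, model_error)`, assuming the figures are relative reductions against the MANTRA baseline.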

Graduation Thesis language

English

Graduation Thesis type

Master - Computer Science

Supervisor(s)

Naveed Muhammad, Yar Muhammad

Defence year

2021
