End-to-End Autonomous Driving

Perceiving the environment and its changes over time involves two fundamental yet heterogeneous types of information: semantics and motion. Previous end-to-end autonomous driving works represent both types of information in a single feature vector. However, coupling motion-related tasks with semantics in this shared representation impairs detection and tracking performance due to negative transfer.

To address this, we propose DMAD, a novel method that separates semantic and motion learning with parallel queries to mitigate negative transfer, while merging similar tasks to facilitate positive transfer. Experiments on nuScenes confirm that our approach significantly improves performance across perception, prediction, and planning.
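The core idea of disentangling semantics and motion into parallel queries can be illustrated with a minimal sketch. The code below is hypothetical and not the paper's implementation: it shows two independent query sets (one for semantics, one for motion) reading from the same shared scene features via simple dot-product cross-attention, instead of a single query set encoding both. All names (`cross_attend`, `semantic_queries`, the dimensions) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, features):
    """Simple dot-product cross-attention: queries read from shared features."""
    scores = queries @ features.T / np.sqrt(features.shape[1])
    return softmax(scores) @ features

n_feat, n_query, dim = 64, 8, 32
scene_features = rng.standard_normal((n_feat, dim))  # shared scene representation

# Parallel, disentangled query sets: one per information type.
semantic_queries = rng.standard_normal((n_query, dim))
motion_queries = rng.standard_normal((n_query, dim))

# Each branch attends to the same features but learns its own readout,
# so motion supervision does not overwrite semantic features (and vice versa).
semantic_out = cross_attend(semantic_queries, scene_features)  # -> detection/tracking heads
motion_out = cross_attend(motion_queries, scene_features)      # -> prediction/planning heads

print(semantic_out.shape, motion_out.shape)  # (8, 32) (8, 32)
```

In an actual transformer decoder each branch would have its own learned queries and attention layers; the point of the sketch is only that the two task families no longer share one feature vector.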