Autonomous Vehicle @ Oxa | Former Team Principal @ aUToronto Self-Driving | MASc @ UofT | Vector AI Scholarship Recipient
🎉 Excited to share that the work we have been pursuing for the past two years has been accepted to #ECCV2024, the European Conference on Computer Vision. Our work, #JDT3D (joint 3D object detection and tracking), aims to address the gaps in current #Transformer-based end-to-end detection and tracking architectures for #autonomousdriving.

When I started my master's, I strongly believed that object detection shouldn't be done on single frames alone, but should leverage past frames as much as possible while performing object tracking jointly and implicitly. In this work, we took inspiration from the #Transformer (the now-famous architecture from natural language processing), adapted it to 3D #LiDAR-based computer vision, and proposed several key optimization and training techniques (track sampling augmentation, confidence-based query propagation) to bridge the performance gap with traditional two-step data association approaches.

This work wouldn't have been possible without the diligence and perseverance of Brian Cheong, who is still working on further improvements in this area, and the unwavering support and supervision of Professor Steven Lake Waslander. As a commitment to the research community, we are preparing to open-source our work, so stay tuned by following Toronto Robotics and Artificial Intelligence Laboratory.

Check out the full paper linked below.

Paper arXiv link: https://lnkd.in/eVEBWU7S

#autonomousdriving #computervision #lidar #objectdetection #multiobjecttracking #robotics #artificialintelligence
🚀 Exciting Breakthrough in LiDAR-Based Joint Detection and Tracking! 🚀

We're thrilled to share our latest research from the University of Toronto, accepted to #ECCV2024: "JDT3D: Addressing the Gaps in LiDAR-Based Tracking-by-Attention" by Brian Cheong, Jiachen (Jason) Zhou, and Steven Lake Waslander.

In computer vision, approaches trained in an end-to-end manner have been shown to perform better than traditional pipeline-based methods. However, within LiDAR-based object tracking, tracking-by-detection continues to achieve state-of-the-art performance without learning both tasks jointly. In our work, we explore the potential reasons for this gap and propose techniques to leverage the advantages of joint detection and tracking.

🌟 Key Highlights:
- Innovative Approach: We propose JDT3D, a novel LiDAR-based joint detection and tracking model that leverages transformer-based decoders to propagate object queries over time, implicitly performing object tracking without an association step at inference.
- Enhanced Techniques: We introduce track sampling augmentation and confidence-based query propagation to bridge the performance gap between tracking-by-detection (TBD) and joint detection and tracking (JDT) methods.
- Real-World Impact: Our model is trained and evaluated on the nuScenes dataset, showcasing significant improvements in tracking accuracy and robustness.

Check out the full paper linked below and join us at #ECCV2024!

Paper: https://lnkd.in/gbT4EStA
Code: https://lnkd.in/gU385pcW

#autonomousdriving #computervision #tracking #objecttracking #robotics #3DVision #transformers #deeplearning
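For readers curious what "confidence-based query propagation" means in practice, here is a minimal, illustrative sketch (not the paper's actual implementation; the function name, threshold value, and embedding size are assumptions for the example). The idea is that object queries whose detection confidence clears a threshold are carried forward as track queries for the next frame, so identity is maintained implicitly instead of through a separate data-association step:

```python
import numpy as np

def propagate_queries(queries, scores, keep_thresh=0.4):
    """Confidence-based query propagation (illustrative sketch only).

    queries: (num_queries, embed_dim) array of object-query embeddings.
    scores:  (num_queries,) detection confidences from the current frame.

    Queries above the threshold survive as "track queries" for the next
    frame; the rest are discarded so fresh detection queries can replace
    them. No explicit association step is needed: a surviving query keeps
    representing the same object across frames.
    """
    keep = scores > keep_thresh
    return queries[keep], scores[keep]

# Toy example: 4 object queries, 256-dim embeddings (sizes are hypothetical)
queries = np.random.randn(4, 256)
scores = np.array([0.9, 0.2, 0.6, 0.1])
track_queries, track_scores = propagate_queries(queries, scores)
print(track_queries.shape)  # (2, 256): only the two confident queries survive
```

In a full transformer decoder, these surviving queries would be concatenated with newly initialized detection queries before decoding the next LiDAR frame.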