🎉 Exciting news! 🎉 Congratulations to Evan Cook on graduating with a Master of Applied Science in Aerospace Engineering, specializing in Robotics, from the University of Toronto Institute for Aerospace Studies! Here at the Toronto Robotics and Artificial Intelligence Laboratory (TRAIL), Evan made significant contributions to the field of autonomous driving, focusing on out-of-distribution detection for open-world machine learning. His work has laid the groundwork for safer and more reliable self-driving vehicles. Evan, your outstanding accomplishments and enthusiasm for innovation leave a lasting mark on TRAIL. We are incredibly proud of you and wish you all the best as you embark on the next chapter of your journey at Zoox. #graduation #autonomousdriving #robotics #computervision
Toronto Robotics and Artificial Intelligence Laboratory
Research Services
Robotic perception and planning for better autonomous systems.
About us
Founded by Prof. Waslander in 2018, the TRAILab focuses on research in perception for robotics, including object detection and tracking, segmentation, and localization and mapping.
- Website: https://www.trailab.utias.utoronto.ca/
- Industry: Research Services
- Company size: 11-50 employees
- Headquarters: Toronto
- Type: Educational
Locations
- Primary: Toronto, CA
Updates
-
6D pose estimation of textureless shiny objects has become an essential problem in many robotic applications. Many pose estimators require high-quality depth data, often measured by structured light cameras. However, when objects have shiny surfaces (e.g., metal parts), these cameras fail to sense complete depths from a single viewpoint due to specular reflection, resulting in a significant drop in the final pose accuracy. We are thrilled to share that our latest research has been accepted to #IROS2024! 🎉 In our paper, "Active Pose Refinement for Textureless Shiny Objects using the Structured Light Camera," co-authored by Yang J., Jian Yao, and Steven Lake Waslander, we present a complete active vision framework for 6D object pose refinement and next-best view prediction to mitigate the aforementioned issue. 🌟 Key Highlights 🌟: - Innovative 6D Pose Refinement: Our approach is tailored for structured light imaging (SLI) cameras and includes estimating pixel depth uncertainties and integrating these estimates into our SDF-based pose refinement module. - Surface Reflection Model: We predict depth uncertainties for unseen viewpoints using a reflection model that recovers object reflection parameters with a differentiable renderer. - Active Vision System: By integrating our reflection model and pose refinement approach, we can predict the next-best view (NBV) for pose estimation through online rendering. Check out the full paper linked below and join us at #IROS2024! Paper: https://lnkd.in/gMTHZVcY A big thank you to Epson Canada for supporting this work! #poseestimation #robotics #sli #structuredlightcamera
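To give a feel for the first highlight, here is a toy sketch of weighting SDF residuals by estimated depth uncertainty. Everything here is our illustrative stand-in, not the paper's actual module: a sphere plays the role of the object's signed distance field, and a one-axis grid search stands in for the refinement step.

```python
import numpy as np

def sphere_sdf(points, center, radius=1.0):
    # Signed distance from each 3D point to a sphere surface (toy stand-in
    # for the object's signed distance field).
    return np.linalg.norm(points - center, axis=1) - radius

def weighted_pose_cost(points, variances, center):
    # Inverse-variance weighting: pixels with uncertain (e.g. specular)
    # depth contribute less to the refinement objective.
    residuals = sphere_sdf(points, center)
    return np.sum(residuals ** 2 / variances)

# Toy data: noisy depth measurements of a unit sphere centered at the origin.
rng = np.random.default_rng(0)
dirs = rng.normal(size=(200, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
variances = rng.uniform(0.01, 0.5, size=200)
radii = 1.0 + rng.normal(scale=np.sqrt(variances))
points = dirs * radii[:, None]

# Crude one-axis "refinement": pick the translation that minimizes the cost.
candidates = np.linspace(-0.5, 0.5, 51)
costs = [weighted_pose_cost(points, variances, np.array([c, 0.0, 0.0]))
         for c in candidates]
best = candidates[int(np.argmin(costs))]
print(f"refined x-offset: {best:.2f}")
```

The weighting means that points measured on shiny regions, where depth variance is high, barely move the estimate, which is the intuition behind folding uncertainty into the refinement objective.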
-
🚀 Exciting Breakthrough in LiDAR-Based Joint Detection and Tracking! 🚀 We're thrilled to share our latest research from the University of Toronto that has been accepted to #ECCV2024: "JDT3D: Addressing the Gaps in LiDAR-Based Tracking-by-Attention" by Brian Cheong, Jiachen (Jason) Zhou and Steven Lake Waslander. In computer vision, approaches trained in an end-to-end manner have been shown to perform better than traditional pipeline-based methods. However, within LiDAR-based object tracking, tracking-by-detection continues to achieve state-of-the-art performance without learning both tasks jointly. In our work, we explore the potential reasons for this gap and propose techniques to leverage the advantages of joint detection and tracking. 🌟 Key Highlights: - Innovative Approach: We propose JDT3D, a novel LiDAR-based joint detection and tracking model that leverages transformer-based decoders to propagate object queries over time, implicitly performing object tracking without an association step at inference. - Enhanced Techniques: We introduce track sampling augmentation and confidence-based query propagation to bridge the performance gap between tracking-by-detection (TBD) and joint detection and tracking (JDT) methods. - Real-World Impact: Our model is trained and evaluated on the nuScenes dataset, showcasing significant improvements in tracking accuracy and robustness. Check out the full paper linked below and join us at #ECCV2024! Paper: https://lnkd.in/gbT4EStA Code: https://lnkd.in/gU385pcW #autonomousdriving #computervision #tracking #objecttracking #robotics #3DVision #transformers #deeplearning
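As a rough illustration of the confidence-based query propagation idea (the function name, data layout, and threshold below are ours, not the paper's): object queries whose predicted confidence clears a threshold keep their track identity into the next frame, so no explicit association step is needed at inference.

```python
# Hypothetical sketch: each decoder query carries a track id and a
# predicted confidence; only confident queries are propagated forward.

def propagate_queries(queries, threshold=0.4):
    """queries: list of (track_id, confidence) pairs from the decoder."""
    return [(tid, conf) for tid, conf in queries if conf >= threshold]

frame_queries = [(1, 0.9), (2, 0.35), (3, 0.6)]
survivors = propagate_queries(frame_queries)
print(survivors)  # -> [(1, 0.9), (3, 0.6)]
```

Track 2 falls below the threshold and is dropped, freeing its slot for a newly detected object in the next frame.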
-
🎉 Big News! 🎉 Huge congratulations to Dr. Jun Yang (Yang J.) for successfully defending his doctoral thesis at the University of Toronto Institute for Aerospace Studies! During his time at the Toronto Robotics and Artificial Intelligence Laboratory (TRAIL), Jun made some amazing contributions to the field of object pose estimation for robotics. He authored groundbreaking papers on active perception for estimating 6D object poses, which were published in top-tier international venues such as #ICRA and #IROS. Jun, your incredible achievements and passion for innovation have left a lasting impact on TRAIL. We're super proud of you and can't wait to see what you accomplish next at Epson Canada! #graduation #robotics #computervision #poseestimation
-
Want to sit back and drive home by simply talking to the car? Wondering how to leverage Large Language Models for safe and smart autonomous driving? Check out our #CVPR2024 paper “LMDrive: Closed-Loop End-to-End Driving with Large Language Models” by Hao Shao, Yuxuan Hu, Letian Wang, Steven Lake Waslander, Yu Liu, Hongsheng Li. This is the first work bringing LLMs into closed-loop end-to-end autonomous driving! (with code released!) Abstract: Despite significant recent progress in the field of autonomous driving, modern methods still struggle and can incur serious accidents when encountering long-tail unforeseen events and challenging urban scenarios. On the one hand, large language models (LLM) have shown impressive reasoning capabilities that approach "Artificial General Intelligence". On the other hand, previous autonomous driving methods tend to rely on limited-format inputs (e.g. sensor data and navigation waypoints), restricting the vehicle's ability to understand language information and interact with humans. To this end, this paper introduces LMDrive, a novel language-guided, end-to-end, closed-loop autonomous driving framework. LMDrive uniquely processes and integrates multi-modal sensor data with natural language instructions, enabling interaction with humans and navigation software in realistic instructional settings. To facilitate further research in language-based closed-loop autonomous driving, we also publicly release the corresponding dataset which includes approximately 64K instruction-following data clips, and the LangAuto benchmark that tests the system's ability to handle complex instructions and challenging driving scenarios. Extensive closed-loop experiments are conducted to demonstrate LMDrive's effectiveness. To the best of our knowledge, we're the very first work to leverage LLMs for closed-loop end-to-end autonomous driving.
Paper: https://lnkd.in/gDgfYcaa Project Website: https://lnkd.in/gWq2SUiH Code: https://lnkd.in/gmYqmJWr #cvpr2024 #autonomousdriving #autonomousvehicles #selfdrivingcars #reinforcementlearning #deeplearning #largelanguagemodel #LLM #foundationmodel #llava
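"Closed-loop" here means the model's control output feeds back into the simulator it observes, rather than being scored against a fixed log. A minimal sketch of that loop structure, with stub classes standing in for the real model, simulator, and benchmark (all names below are hypothetical; the actual implementation is in the linked repository):

```python
# Illustrative closed-loop driving structure; every class and function
# here is a stub, not the real LMDrive code.

class StubSimulator:
    def __init__(self, goal=3):
        self.step_count, self.goal = 0, goal
    def observe(self):
        # A real simulator would return camera images and LiDAR sweeps.
        return {"camera": None, "lidar": None, "step": self.step_count}
    def apply(self, action):
        self.step_count += 1
    def done(self):
        return self.step_count >= self.goal

def stub_driving_model(observation, instruction):
    # A real model would fuse multi-modal sensor data with the language
    # instruction; here we just echo a fixed control command.
    return {"steer": 0.0, "throttle": 0.5, "instruction": instruction}

sim = StubSimulator()
while not sim.done():  # closed loop: each action changes the next observation
    action = stub_driving_model(sim.observe(), "turn left at the next light")
    sim.apply(action)
print("completed", sim.step_count, "control steps")
```

The key point is the feedback arrow: the model's mistakes compound through the simulator state, which is what makes closed-loop evaluation harder (and more realistic) than open-loop prediction.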
-
Remember in connect-the-dots, where the more you look, the more you score? The same principle applies to motion prediction in autonomous driving, too! Check out our #CVPR2024 paper “SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction” by Yang Zhou, Hao Shao, Letian Wang, Steven Lake Waslander, Hongsheng Li, Yu Liu. With this work, we outperform all published ensemble-free works on the Argoverse 2 leaderboard (single agent track) at the time of submission. Our key insight is that motion prediction models confront various driving scenarios, each with different difficulties, so the refinement potential across scenarios is not uniform. In this work, we introduce SmartRefine, a novel approach to refining motion predictions with minimal additional computation by leveraging scenario-specific properties and adaptive refinement iterations. Abstract: Predicting the future motion of surrounding agents is essential for autonomous vehicles (AVs) to operate safely in dynamic, human-robot-mixed environments. Context information, such as road maps and surrounding agents' states, provides crucial geometric and semantic information for motion behavior prediction. To this end, recent works explore two-stage prediction frameworks where coarse trajectories are first proposed, and then used to select critical context information for trajectory refinement. However, they either incur a large amount of computation or bring limited improvement, if not both. In this paper, we introduce a novel scenario-adaptive refinement strategy, named SmartRefine, to refine prediction with minimal additional computation. Specifically, SmartRefine can comprehensively adapt refinement configurations based on each scenario's properties, and smartly chooses the number of refinement iterations by introducing a quality score to measure the prediction quality and remaining refinement potential of each scenario.
SmartRefine is designed as a generic and flexible approach that can be seamlessly integrated into most state-of-the-art motion prediction models. Experiments on Argoverse (1 & 2) show that our method consistently improves the prediction accuracy of multiple state-of-the-art prediction models. Specifically, by adding SmartRefine to QCNet, we outperform all published ensemble-free works on the Argoverse 2 leaderboard (single agent track) at submission. Comprehensive studies are also conducted to ablate design choices and explore the mechanism behind multi-iteration refinement. Paper: https://lnkd.in/g4SPxRDE Code: https://lnkd.in/g3YysfSH #CVPR2024 #autonomousdriving #autonomousvehicles #selfdrivingcars #reinforcementlearning #deeplearning #motionprediction
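The "quality score decides how many refinement iterations to spend" idea can be sketched as a simple loop. This toy is ours, not the paper's algorithm: here the score is just the remaining error of a scalar estimate, and refinement stops early when the predicted gain drops below a floor or the iteration budget runs out.

```python
def adaptive_refine(initial, refine_step, quality, max_iters=10, min_gain=1e-3):
    """Keep applying refine_step while the quality score suggests enough
    remaining improvement (function names and scoring are illustrative)."""
    est = initial
    for it in range(max_iters):
        new_est = refine_step(est)
        if quality(est) - quality(new_est) < min_gain:  # little left to gain
            return new_est, it + 1
        est = new_est
    return est, max_iters

# Toy scenario: each refinement pass halves the remaining prediction error.
target = 10.0
refine = lambda x: x + 0.5 * (target - x)
error = lambda x: abs(target - x)
est, iters = adaptive_refine(0.0, refine, error)
print(f"estimate {est:.3f} after {iters} iterations")
```

An "easy" scenario whose error collapses quickly would exit after one or two passes, while a hard one consumes the full budget, which is the scenario-adaptive allocation the post describes.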
-
We are proud to announce that our paper "Multiple View Geometry Transformers for 3D Human Pose Estimation" by Ziwei Liao*, Jialiang Zhu* (朱嘉梁), Chunyu Wang (王春雨), Han Hu and Steven Lake Waslander, from the University of Toronto and Microsoft Research Asia, has been accepted to #CVPR2024! 🎉 In this paper, we aim to improve the 3D reasoning ability of Transformers in multi-view 3D human pose estimation. Recent works have focused on end-to-end learning-based transformer designs, which struggle to resolve geometric information accurately, particularly during occlusion. We propose a novel hybrid model, MVGFormer, which has a series of geometric and appearance modules organized in an iterative manner. Our method outperforms the state-of-the-art in both in-domain and out-of-domain settings. We will be attending #CVPR2024 in person. See you in Seattle, USA! More details are available here: Paper: https://lnkd.in/gnZBXGZE Code: https://lnkd.in/g4aStA5s (Available soon) #CVPR24 #3DHumanPose #Multiview #3DVision #Transformers #Deeplearning
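The classic geometric operation that hybrid multi-view models interleave with learned appearance modules is triangulation: recovering a 3D point from its projections in several calibrated cameras. A self-contained linear (DLT) triangulation sketch, with toy camera matrices of our own choosing:

```python
import numpy as np

def triangulate(proj_mats, points_2d):
    # Linear (DLT) triangulation: each view contributes two rows of a
    # homogeneous system whose null vector is the 3D point.
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]
    return X[:3] / X[3]

# Two toy cameras: one at the origin, one at (5, 0, 0) looking back at it.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
R = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]])
t = -R @ np.array([5.0, 0.0, 0.0])
P2 = np.hstack([R, t[:, None]])

X_true = np.array([0.2, 0.1, 4.0])
project = lambda P, X: (P @ np.append(X, 1.0))[:2] / (P @ np.append(X, 1.0))[2]
X_est = triangulate([P1, P2], [project(P1, X_true), project(P2, X_true)])
print(np.round(X_est, 3))
```

With noise-free projections the linear solve recovers the point exactly; in an occluded view the 2D estimate degrades, which is why learned appearance cues and iteration help in practice.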
-
Last but not least, our third #ICRA2024 paper (3/3): Modern robots navigating dynamic environments demand precise real-time detection and tracking of nearby objects. For 3D multi-object tracking, recent approaches process a single measurement frame recursively with greedy association and are prone to errors in ambiguous association decisions. In our paper, "SWTrack: Multiple Hypothesis Sliding Window 3D Multi-Object Tracking" by Sandro Papais, Robert(Junguang) Ren, and Steven Lake Waslander, we introduce Sliding Window Tracker (SWTrack), which yields more accurate association and state estimation by batch processing many frames of sensor data while being capable of running online in real time. More details are available here: Paper: https://lnkd.in/gWNe9azv Video: https://lnkd.in/gJxcuuVt This concludes our series of #ICRA2024 papers. See you in Yokohama, Japan! #ICRA24 #slidingwindows #tracking #objectdetection #autonomousdriving
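The core contrast with frame-by-frame greedy tracking is that a sliding-window tracker keeps the last few frames and re-solves association over the whole batch, so an early ambiguous decision can still be revised. A toy 1-D sketch of that structure (our own simplification: nearest-neighbour chaining stands in for the multiple-hypothesis search the paper actually performs):

```python
from collections import deque

class SlidingWindowTracker:
    """Toy sketch: keep the last `window` frames of 1-D detections and
    re-solve association over the whole batch on every update."""
    def __init__(self, window=3):
        self.frames = deque(maxlen=window)

    def update(self, detections):
        self.frames.append(detections)
        # Re-associate from scratch over the window; a real system would
        # search over multiple association hypotheses instead of greedy
        # nearest-neighbour chaining.
        tracks = {}
        for frame in self.frames:
            for det in frame:
                tid = min(tracks, key=lambda k: abs(tracks[k][-1] - det),
                          default=None)
                if tid is not None and abs(tracks[tid][-1] - det) < 1.0:
                    tracks[tid].append(det)
                else:
                    tracks[len(tracks)] = [det]
        return tracks

swt = SlidingWindowTracker()
for frame in ([0.0], [0.1], [5.0, 0.2]):
    tracks = swt.update(frame)
print(tracks)
```

Because association is recomputed over the batch each step, a detection that looked ambiguous when it first arrived can be re-assigned once later frames disambiguate it, at the cost of redoing work inside the window.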
-
Announcing our second #ICRA2024 paper (2/3): Numerous multi-object tracking methods blindly trust incoming object detections with no sense of their associated uncertainty. This lack of uncertainty awareness poses a problem in safety-critical tasks such as autonomous driving. To that end, we introduce UncertaintyTrack, a collection of extensions that can be applied to existing trackers to account for localization uncertainty estimates from probabilistic object detectors. Take a look at our #ICRA2024 paper: "UncertaintyTrack: Exploiting Detection and Localization Uncertainty in Multi-Object Tracking" by Chang Won (John) Lee and Steven Lake Waslander from the University of Toronto. Paper: https://lnkd.in/gzEW_8ZJ Video: https://lnkd.in/gG7aaktj #ICRA24 #uncertainty #robotics #autonomousdriving #objectdetection #tracking
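One standard way a tracker can consume a detector's localization uncertainty is Mahalanobis gating: the detection's covariance stretches the distance used to decide whether it can plausibly match a track. A small 2-D sketch (the function, gate value, and covariances are our illustration, not the paper's specific extensions):

```python
import numpy as np

def mahalanobis_gate(track_pos, det_pos, det_cov, gate=3.0):
    # Use the detector's localization covariance to decide whether a
    # detection can plausibly be associated with a track.
    diff = np.asarray(det_pos, float) - np.asarray(track_pos, float)
    d2 = diff @ np.linalg.inv(det_cov) @ diff
    return bool(np.sqrt(d2) <= gate)

confident = np.diag([0.05, 0.05])  # tight covariance: small errors expected
uncertain = np.diag([2.0, 2.0])    # loose covariance: large errors plausible

offset_det = (1.0, 0.0)
print(mahalanobis_gate((0, 0), offset_det, confident))  # -> False (rejected)
print(mahalanobis_gate((0, 0), offset_det, uncertain))  # -> True (accepted)
```

The same 1 m offset is implausible for a confident detection but entirely consistent with an uncertain one, which is exactly the information a tracker that "blindly trusts" point detections throws away.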
-
We are proud to announce that 3 papers from our lab have been accepted to #ICRA2024 !🎉 First on our list, we have: 3D scene understanding is a fundamental problem for robotics. We propose a mapping system that builds a 3D map for the shape, pose, and their corresponding uncertainties of multiple objects in a scene. It is formulated as a probabilistic optimization framework that leverages a learnt generative model as shape priors. Take a look at our new #ICRA2024 paper “Uncertainty-aware 3D Object-Level Mapping with Deep Shape Priors” by Ziwei Liao*, Yang J.*, Jingxing (Joe) Qian*, Angela Schoellig and Steven Lake Waslander, from the University of Toronto and the Technical University of Munich. We will be attending #ICRA2024 in person in May, 2024. See you in Yokohama, Japan! More details are available: Paper: https://lnkd.in/gmPTuHee Video: https://lnkd.in/gBD83Wfh Code: https://lnkd.in/gHZV3_tZ (Available soon) #ICRA24 #uncertainty #robotics #3dvision #mapping #reconstruction #generativemodels
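A probabilistic optimization with a learnt shape prior typically minimizes a MAP-style objective: a measurement term plus a Gaussian prior on the latent shape code. A 1-D toy of that structure, with a hypothetical linear "decoder" standing in for the generative shape model:

```python
import numpy as np

def map_cost(latent, observations, decode, obs_var=0.1, prior_var=1.0):
    # MAP-style objective: squared measurement residuals weighted by the
    # observation variance, plus a Gaussian prior on the latent code.
    pred = decode(latent)
    data_term = np.sum((observations - pred) ** 2) / obs_var
    prior_term = np.sum(latent ** 2) / prior_var
    return data_term + prior_term

# Hypothetical "generative model": object radius decoded from a 1-D latent.
decode = lambda z: 1.0 + 0.5 * z
obs = np.array([1.4, 1.5, 1.6])  # noisy radius measurements

codes = np.linspace(-2, 2, 401)
costs = [map_cost(np.array([c]), obs, decode) for c in codes]
best = codes[int(np.argmin(costs))]
print(f"MAP latent code: {best:.2f}")
```

The prior pulls the estimate slightly toward the mean shape (here, toward zero), and the balance between the two terms is what lets the same framework report a calibrated uncertainty over shape rather than a single point estimate.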