Integrating Objects into Monocular SLAM: Line Based Category Specific Models

Authors:

Rishabh Khawad,

K. Madhava Krishna,

Brojeshwar Bhowmick

ICVGIP '18: Proceedings of the 11th Indian Conference on Computer Vision, Graphics and Image Processing

Article No.: 80, Pages 1 - 9

https://doi.org/10.1145/3293353.3293434

Published: 03 May 2020

Abstract

We propose a novel line-based parameterization for category-specific CAD models. The parameterization associates a 3D category-specific CAD model with the object under consideration using a dictionary-based RANSAC method that takes object viewpoints as a prior, together with edges detected in the corresponding intensity image of the scene. The association problem is posed as a classical geometry problem rather than a dataset-driven one, which saves the time and labour otherwise spent annotating datasets to train a keypoint network [25, 26] for each object category. Besides eliminating dataset preparation, the approach speeds up the overall pipeline: the image is processed only once for all objects, so a network need not be invoked for every object in every image. A 3D-2D edge association module followed by a resection algorithm for lines recovers the object poses. The formulation optimizes for both the shape and the pose of the object, which helps recover the object's 3D structure more accurately. Finally, a factor graph formulation combines the object poses with camera odometry to formulate a SLAM problem.
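The last step of the abstract, combining object poses with camera odometry in a factor graph, can be illustrated with a small sketch. This is not the paper's formulation: it is a one-dimensional toy in which camera poses and the object are scalars and every factor is linear with Gaussian noise, so the optimum is found by solving the normal equations directly. The function names (`solve_linear_system`, `optimize`), the noise values, and the prior on the first pose are all hypothetical choices made for illustration. A real system would work with full 6-DoF poses and would typically rely on a library such as GTSAM [7, 8].

```python
# Toy 1D factor graph: camera positions x_0..x_{n-1} plus one object position o.
# ASSUMPTION: the 1D setting, names, and noise values are illustrative only.


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def optimize(num_poses, odometry, object_obs, sigma_odo=0.1, sigma_obj=0.05):
    """Weighted linear least squares over the state [x_0..x_{n-1}, o].

    odometry:   list of (i, z) meaning "x_{i+1} - x_i was measured as z"
    object_obs: list of (i, z) meaning "o - x_i was measured as z"
    Returns (list of pose estimates, object estimate).
    """
    dim = num_poses + 1
    obj = num_poses  # index of the object variable in the state vector
    h = [[0.0] * dim for _ in range(dim)]  # information matrix (J^T W J)
    g = [0.0] * dim  # information vector (J^T W z)

    def add_factor(coeffs, z, sigma):
        # One factor with residual sum(c_i * state_i) - z and std. dev. sigma.
        w = 1.0 / (sigma * sigma)
        for i, ci in coeffs:
            g[i] += w * ci * z
            for j, cj in coeffs:
                h[i][j] += w * ci * cj

    add_factor([(0, 1.0)], 0.0, 1e-3)  # tight prior fixing x_0 near 0
    for i, z in odometry:
        add_factor([(i + 1, 1.0), (i, -1.0)], z, sigma_odo)
    for i, z in object_obs:
        add_factor([(obj, 1.0), (i, -1.0)], z, sigma_obj)

    sol = solve_linear_system(h, g)
    return sol[:num_poses], sol[obj]


# Usage: odometry over-estimates each unit step as 1.2, while the object
# observations are consistent with poses 0, 1, 2 and an object at 5.
poses, obj_pos = optimize(
    num_poses=3,
    odometry=[(0, 1.2), (1, 1.2)],
    object_obs=[(0, 5.0), (1, 4.0), (2, 3.0)],
)
print("poses:", poses, "object:", obj_pos)
```

In this toy the object observations act like loop-closure constraints: because the same object is seen from every pose, its factors pull the drifting odometry chain back toward a consistent trajectory.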

References

[1]

Sameer Agarwal, Keir Mierle, and others. 2015. Ceres Solver. (2015).

[2]

Koray Çelik and Arun K Somani. 2013. Monocular vision SLAM for indoor aerial vehicles. Journal of electrical and computer engineering 2013 (2013), 4--1573.

[3]

Angel X Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, and others. 2015. ShapeNet: An information-rich 3D model repository. arXiv preprint arXiv:1512.03012 (2015).

[4]

Younggun Cho and Ayoung Kim. 2017. Visibility enhancement for underwater visual SLAM based on underwater light scattering model. In Robotics and Automation (ICRA), 2017 IEEE International Conference on. IEEE, 710--717.

[5]

Siddharth Choudhary, Luca Carlone, Carlos Nieto, John Rogers, Zhen Liu, Henrik I Christensen, and Frank Dellaert. 2016. Multi Robot Object-Based SLAM. In International Symposium on Experimental Robotics. Springer, 729--741.

[6]

Marco Crocco, Cosimo Rubino, and Alessio Del Bue. 2016. Structure from motion with objects. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4141--4149.

[7]

F Dellaert and others. 2012. GTSAM. URL: https://borg.cc.gatech.edu (2012).

[8]

Frank Dellaert and others. 2012. GTSAM. URL: https://borg.cc.gatech.edu (2012).

[9]

Jakob Engel, Thomas Schöps, and Daniel Cremers. 2014. LSD-SLAM: Large-scale direct monocular SLAM. In European Conference on Computer Vision. Springer, 834--849.

[10]

Mark Everingham, Luc Van Gool, Christopher KI Williams, John Winn, and Andrew Zisserman. 2010. The PASCAL Visual Object Classes (VOC) challenge. International Journal of Computer Vision 88, 2 (2010), 303--338.

[11]

Pedro F Felzenszwalb, Ross B Girshick, David McAllester, and Deva Ramanan. 2010. Object detection with discriminatively trained part-based models. IEEE transactions on pattern analysis and machine intelligence 32, 9 (2010), 1627--1645.

[12]

Christian Forster, Zichao Zhang, Michael Gassner, Manuel Werlberger, and Davide Scaramuzza. 2017. SVO: Semidirect visual odometry for monocular and multicamera systems. IEEE Transactions on Robotics 33, 2 (2017), 249--265.

[13]

Dorian Gálvez-López, Marta Salas, Juan D Tardós, and JMM Montiel. 2016. Real-time monocular object SLAM. Robotics and Autonomous Systems 75 (2016), 435--449.

[14]

Ian Jolliffe. 2011. Principal component analysis. In International encyclopedia of statistical science. Springer, 1094--1096.

[15]

Michael Kaess, Hordur Johannsson, Richard Roberts, Viorela Ila, John J Leonard, and Frank Dellaert. 2012. iSAM2: Incremental smoothing and mapping using the Bayes tree. The International Journal of Robotics Research 31, 2 (2012), 216--235.

[16]

Rainer Kümmerle, Giorgio Grisetti, Hauke Strasdat, Kurt Konolige, and Wolfram Burgard. 2011. g2o: A general framework for graph optimization. In Robotics and Automation (ICRA), 2011 IEEE International Conference on. IEEE, 3607--3613.

[17]

Abhijit Kundu, K Madhava Krishna, and CV Jawahar. 2011. Realtime multibody visual SLAM with a smoothly moving monocular camera. In Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2080--2087.

[18]

Henning Lategahn, Andreas Geiger, and Bernd Kitt. 2011. Visual SLAM for autonomous ground vehicles. In Robotics and Automation (ICRA), 2011 IEEE International Conference on. IEEE, 1732--1737.

[19]

Cindy Leung, Shoudong Huang, and Gamini Dissanayake. 2006. Active SLAM using model predictive control and attractor based exploration. In Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on. IEEE, 5026--5031.

[20]

Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C Berg. 2016. SSD: Single shot multibox detector. In European Conference on Computer Vision. Springer, 21--37.

[21]

Beipeng Mu, Shih-Yuan Liu, Liam Paull, John Leonard, and Jonathan P How. 2016. SLAM with objects using a nonparametric pose graph. In Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on. IEEE, 4602--4609.

[22]

Raúl Mur-Artal, J. M. M. Montiel, and Juan D. Tardós. 2015. ORB-SLAM: a Versatile and Accurate Monocular SLAM System. IEEE Transactions on Robotics 31, 5 (2015), 1147--1163.

[23]

Raul Mur-Artal, Jose Maria Martinez Montiel, and Juan D Tardos. 2015. ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Transactions on Robotics 31, 5 (2015), 1147--1163.

[24]

J Krishna Murthy, GV Sai Krishna, Falak Chhaya, and K Madhava Krishna. 2017. Reconstructing vehicles from a single image: Shape priors for road scene understanding. In Robotics and Automation (ICRA), 2017 IEEE International Conference on. IEEE, 724--731.

[25]

Alejandro Newell, Kaiyu Yang, and Jia Deng. 2016. Stacked hourglass networks for human pose estimation. In European Conference on Computer Vision. Springer, 483--499.

[26]

Parv Parkhiya, Rishabh Khawad, J Krishna Murthy, Brojeshwar Bhowmick, and K Madhava Krishna. 2018. Constructing Category-Specific Models for Monocular Object-SLAM. arXiv preprint arXiv:1802.09292 (2018).

[27]

Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. 2016. You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition. 779--788.

[28]

Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems. 91--99.

[29]

Renato F Salas-Moreno, Richard A Newcombe, Hauke Strasdat, Paul HJ Kelly, and Andrew J Davison. 2013. SLAM++: Simultaneous localisation and mapping at the level of objects. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1352--1359.

[30]

Hao Su, Charles R Qi, Yangyan Li, and Leonidas J Guibas. 2015. Render for CNN: Viewpoint estimation in images using CNNs trained with rendered 3D model views. In Proceedings of the IEEE International Conference on Computer Vision. 2686--2694.

[31]

Niko Sünderhauf, Feras Dayoub, Sean McMahon, Markus Eich, Ben Upcroft, and Michael Milford. 2015. SLAM-Quo Vadis? In support of object oriented and semantic SLAM. (2015).

[32]

Shubham Tulsiani, Abhishek Kar, Joao Carreira, and Jitendra Malik. 2017. Learning category-specific deformable 3D models for object reconstruction. IEEE Transactions on Pattern Analysis and Machine Intelligence 39, 4 (2017), 719--731.

[33]

Rafael Grompone Von Gioi, Jérémie Jakubowicz, Jean-Michel Morel, and Gregory Randall. 2012. LSD: a line segment detector. Image Processing On Line 2 (2012), 35--55.

[34]

Shichao Yang and Sebastian Scherer. 2018. CubeSLAM: Monocular 3D Object Detection and SLAM without Prior Models. arXiv preprint arXiv:1806.00557 (2018).

[35]

Rui Zhu, Chaoyang Wang, Chen-Hsuan Lin, Ziyan Wang, and Simon Lucey. 2018. Object-centric photometric bundle adjustment with deep shape prior. In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 894--902.

Cited By

Asl Sabbaghian Hokmabadi I, Ai M, El-Sheimy N. (2023). Shaped-Based Tightly Coupled IMU/Camera Object-Level SLAM. Sensors 23:18 (7958). Online publication date: 18-Sep-2023.
https://doi.org/10.3390/s23187958
Wu Y, Zhang Y, Zhu D, Deng Z, Sun W, Chen X, Zhang J. (2023). An Object SLAM Framework for Association, Mapping, and High-Level Tasks. IEEE Transactions on Robotics 39:4 (2912-2932). Online publication date: Aug-2023.
https://doi.org/10.1109/TRO.2023.3273180
Yang X, Fan X, Wang J, Yin X, Qiu S. (2020). Edge-based cover recognition and tracking method for an AR-aided aircraft inspection system. The International Journal of Advanced Manufacturing Technology 111:11-12 (3505-3518). Online publication date: 11-Nov-2020.
https://doi.org/10.1007/s00170-020-06301-x
Pokale A, Das D, Aggarwal A, Bhowmick B, Krishna K. (2019). A principled formulation of integrating objects in Monocular SLAM. Proceedings of the 2019 4th International Conference on Advances in Robotics (1-6). Online publication date: 2-Jul-2019.
https://dl.acm.org/doi/10.1145/3352593.3352664

Index Terms

Integrating Objects into Monocular SLAM: Line Based Category Specific Models
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Shape inference
      2. Computer vision tasks
        Vision for robotics

Published In

ICVGIP '18: Proceedings of the 11th Indian Conference on Computer Vision, Graphics and Image Processing

December 2018

659 pages

ISBN: 9781450366151

DOI: 10.1145/3293353

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 May 2020

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Tata Consultancy Services

Conference

ICVGIP 2018

ICVGIP 2018: 11th Indian Conference on Computer Vision, Graphics and Image Processing

December 18 - 22, 2018

Hyderabad, India

Acceptance Rates

Overall Acceptance Rate 95 of 286 submissions, 33%

Bibliometrics & Citations

Total Citations: 4

Total Downloads: 76

Downloads (Last 12 months): 8

Downloads (Last 6 weeks): 3

Reflects downloads up to 18 Jan 2025
