DOI: 10.5555/3061053.3061089

Semi-supervised multimodal deep learning for RGB-D object recognition

Published: 09 July 2016

Abstract

This paper studies RGB-D object recognition. Inspired by the success of deep convolutional neural networks (DCNNs) in AI, researchers have applied them to improve RGB-D object recognition. However, training a DCNN typically requires a large-scale annotated dataset. Manually labeling such a large RGB-D dataset is expensive and time-consuming, which has slowed the adoption of DCNNs in this research area. To address this problem, we propose a semi-supervised multimodal deep learning framework that trains DCNNs effectively from very limited labeled data and massive unlabeled data. The core of our framework is a novel diversity preserving co-training algorithm, which guides the DCNN to learn from unlabeled RGB-D data by exploiting the complementary cues that RGB and depth data provide for object representation. Experiments on the benchmark RGB-D dataset demonstrate that, with only 5% labeled training data, our approach achieves object recognition performance competitive with state-of-the-art results reported by fully supervised methods.
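As a rough illustration of the co-training idea mentioned in the abstract, the sketch below shows generic two-view co-training, where an RGB model and a depth model each pseudo-label unlabeled samples for the other. It is not the paper's diversity preserving algorithm, whose details are not given on this page. A nearest-centroid classifier stands in for a DCNN, the per-class selection is only one assumed way to keep pseudo-labels varied, and the late fusion by averaging is likewise an assumption. All names (`CentroidClassifier`, `fit_views`, `co_train`, `predict_fused`, `rounds`, `per_class`) are hypothetical.

```python
import numpy as np


class CentroidClassifier:
    """Tiny stand-in for a per-modality network (hypothetical, not the paper's DCNN)."""

    def fit(self, X, y, n_classes):
        # One mean vector per class; assumes every class has a labeled sample.
        self.centroids = np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])
        return self

    def predict_proba(self, X):
        # Softmax over negative squared distances to the class centroids.
        d2 = ((X[:, None, :] - self.centroids[None, :, :]) ** 2).sum(axis=2)
        logits = -d2 + d2.min(axis=1, keepdims=True)  # largest logit is 0
        p = np.exp(logits)
        return p / p.sum(axis=1, keepdims=True)


def fit_views(pools, n_classes):
    """Train one classifier per view on that view's current labeled pool."""
    return {name: CentroidClassifier().fit(X, y, n_classes)
            for name, (X, y) in pools.items()}


def co_train(rgb_l, depth_l, y_l, rgb_u, depth_u, n_classes, rounds=3, per_class=2):
    """Generic co-training: each view pseudo-labels samples for the other view."""
    pools = {"rgb": (rgb_l, y_l), "depth": (depth_l, y_l)}
    unlabeled = {"rgb": rgb_u, "depth": depth_u}
    available = np.ones(len(rgb_u), dtype=bool)  # samples not yet pseudo-labeled
    models = fit_views(pools, n_classes)
    for _ in range(rounds):
        for src, dst in (("rgb", "depth"), ("depth", "rgb")):
            proba = models[src].predict_proba(unlabeled[src])
            pred, conf = proba.argmax(axis=1), proba.max(axis=1)
            for c in range(n_classes):
                # Assumption: picking the top samples per class keeps the
                # pseudo-labeled set from collapsing onto a single class.
                cand = np.flatnonzero(available & (pred == c))
                top = cand[np.argsort(-conf[cand])[:per_class]]
                X_dst, y_dst = pools[dst]
                pools[dst] = (np.vstack([X_dst, unlabeled[dst][top]]),
                              np.concatenate([y_dst, pred[top]]))
                available[top] = False
        models = fit_views(pools, n_classes)
    return models


def predict_fused(models, rgb, depth):
    """Assumed late fusion: average the two views' class probabilities."""
    proba = (models["rgb"].predict_proba(rgb) + models["depth"].predict_proba(depth)) / 2
    return proba.argmax(axis=1)
```

In this sketch each view's labeled pool grows only with samples that the other view was confident about, which is the basic mechanism that lets complementary modalities teach each other. A real system would replace the centroid classifier with fine-tuned networks and a more careful selection rule.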

Published In

IJCAI'16: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence
July 2016
4277 pages
ISBN: 9781577357704

Sponsors

• Sony Corporation
• Arizona State University
• Microsoft
• Facebook
• AI Journal

Publisher

AAAI Press
