DOI: 10.5555/1896300.1896315

Learning methods for generic object recognition with invariance to pose and lighting

Published: 27 June 2004

Abstract

We assess the applicability of several popular learning methods to the problem of recognizing generic visual categories with invariance to pose, lighting, and surrounding clutter. A large dataset comprising stereo image pairs of 50 uniform-colored toys under 36 azimuths, 9 elevations, and 6 lighting conditions was collected (for a total of 194,400 individual images). The objects were 10 instances of each of 5 generic categories: four-legged animals, human figures, airplanes, trucks, and cars. Five instances of each category were used for training, and the other five for testing. Low-resolution grayscale images of the objects with various amounts of variability and surrounding clutter were used for training and testing. Nearest Neighbor methods, Support Vector Machines, and Convolutional Networks, operating on raw pixels or on PCA-derived features, were tested. Test error rates for unseen object instances placed on uniform backgrounds were around 13% for SVM and 7% for Convolutional Nets. On a segmentation/recognition task with highly cluttered images, SVM proved impractical, while Convolutional Nets yielded 16/7% error. A real-time version of the system was implemented that can detect and classify objects in natural scenes at around 10 frames per second.
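
The image total quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only a sanity check on the stated counts; the variable names are illustrative, and the per-split figures are the raw capture counts implied by the five/five instance split, not necessarily the sizes of the training and test sets actually used in the experiments.

```python
# Counts as stated in the abstract; all names here are illustrative.
n_categories = 5              # animals, humans, airplanes, trucks, cars
instances_per_category = 10   # toy instances per category
n_azimuths = 36
n_elevations = 9
n_lightings = 6
images_per_stereo_pair = 2    # left and right camera

n_objects = n_categories * instances_per_category
views_per_object = n_azimuths * n_elevations * n_lightings
n_stereo_pairs = n_objects * views_per_object
n_images = n_stereo_pairs * images_per_stereo_pair

# Split by object instance: 5 instances per category for training,
# the remaining 5 for testing, so test objects are never seen in training.
train_instances = 5
test_instances = instances_per_category - train_instances
n_train_pairs = n_categories * train_instances * views_per_object
n_test_pairs = n_categories * test_instances * views_per_object

print("objects:", n_objects)
print("views per object:", views_per_object)
print("stereo pairs:", n_stereo_pairs)
print("individual images:", n_images)
print("raw train / test pairs:", n_train_pairs, n_test_pairs)
```

Splitting by instance rather than by image is what makes the reported test error a measure of generalization to unseen objects of a known category.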


Published In

CVPR'04: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition
June 2004
1041 pages

Sponsors

IEEE-CS\DATC: IEEE Computer Society

Publisher

IEEE Computer Society
United States
