Selective Search for Object Recognition

Authors:

J. R. Uijlings,

A. W. Smeulders

International Journal of Computer Vision, Volume 104, Issue 2

Pages 154 - 171

https://doi.org/10.1007/s11263-013-0620-5

Published: 01 September 2013

Abstract

This paper addresses the problem of generating possible object locations for use in object recognition. We introduce selective search, which combines the strengths of both exhaustive search and segmentation. Like segmentation, we use the image structure to guide our sampling process. Like exhaustive search, we aim to capture all possible object locations. Instead of a single technique to generate possible object locations, we diversify our search and use a variety of complementary image partitionings to deal with as many image conditions as possible. Our selective search results in a small set of data-driven, class-independent, high-quality locations, yielding 99% recall and a Mean Average Best Overlap of 0.879 at 10,097 locations. The reduced number of locations compared to an exhaustive search enables the use of stronger machine learning techniques and stronger appearance models for object recognition. In this paper we show that our selective search enables the use of the powerful Bag-of-Words model for recognition. The selective search software is made publicly available (Software: http://disi.unitn.it/~uijlings/SelectiveSearch.html).
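The recall and Mean Average Best Overlap (MABO) figures quoted in the abstract can be illustrated with a small sketch. It assumes the usual definitions: overlap is the Pascal intersection-over-union of two boxes; the Average Best Overlap of a class is the mean, over that class's ground-truth boxes, of the best overlap reached by any proposed location; MABO is the mean of that value over classes; and recall is the fraction of ground-truth boxes whose best overlap exceeds 0.5. The function names, the box layout (x_min, y_min, x_max, y_max), and the input structure are illustrative assumptions and are not taken from the released software.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]
# One image: (list of (class_name, ground_truth_box), list of proposed boxes).
Image = Tuple[Sequence[Tuple[str, Box]], Sequence[Box]]


def overlap(a: Box, b: Box) -> float:
    """Pascal overlap: intersection area divided by union area."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes are disjoint or only touch at an edge
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def evaluate_proposals(
    images: Sequence[Image], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (recall, MABO) for proposals measured against ground truth.

    The 0.5 threshold is an assumption that follows the Pascal criterion.
    """
    best_per_class: Dict[str, List[float]] = defaultdict(list)
    for ground_truth, proposals in images:
        for label, gt_box in ground_truth:
            # Best overlap any proposal achieves with this ground-truth box.
            best = max((overlap(gt_box, p) for p in proposals), default=0.0)
            best_per_class[label].append(best)

    all_best = [b for scores in best_per_class.values() for b in scores]
    if not all_best:
        raise ValueError("no ground-truth boxes to evaluate against")

    # Recall: share of ground-truth boxes covered above the threshold.
    recall = sum(b > threshold for b in all_best) / len(all_best)
    # Average Best Overlap per class, then the mean over classes (MABO).
    abo = [sum(scores) / len(scores) for scores in best_per_class.values()]
    mabo = sum(abo) / len(abo)
    return recall, mabo
```

Calling `evaluate_proposals` with one `(ground_truth, proposals)` pair per image returns both numbers at once. Because MABO averages per class before averaging over classes, a class with few objects counts as much as a frequent one.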

Index Terms
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks

Published In

International Journal of Computer Vision, Volume 104, Issue 2, September 2013 (103 pages)

ISSN: 0920-5691

Copyright © 2013 Springer Science+Business Media New York.

Publisher

Kluwer Academic Publishers, United States
