research-article

Bag-of-visual-words and spatial extensions for land-use classification

Authors:

Shawn Newsam

GIS '10: Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems

Pages 270 - 279

https://doi.org/10.1145/1869790.1869829

Published: 02 November 2010

Abstract

We investigate bag-of-visual-words (BOVW) approaches to land-use classification in high-resolution overhead imagery. We consider a standard non-spatial representation in which the frequencies, but not the locations, of quantized image features are used to discriminate between classes, analogous to how words are used for text document classification without regard to their order of occurrence. We also consider two spatial extensions: the established spatial pyramid match kernel, which considers the absolute spatial arrangement of the image features, and a novel method, which we term the spatial co-occurrence kernel, that considers the relative arrangement. These extensions are motivated by the importance of spatial structure in geographic data.

The methods are evaluated using a large ground truth image dataset of 21 land-use classes. In addition to comparisons with standard approaches, we perform extensive evaluation of different configurations such as the size of the visual dictionaries used to derive the BOVW representations and the scale at which the spatial relationships are considered.

We show that even though BOVW approaches do not necessarily perform better than the best standard approaches overall, they represent a robust alternative that is more effective for certain land-use classes. We also show that extending the BOVW approach with our proposed spatial co-occurrence kernel consistently improves performance.
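
As a rough illustration of the two kinds of representation the abstract contrasts, the sketch below builds a non-spatial BOVW histogram (word frequencies only) and a simple count of visual-word pairs that occur close to each other (relative arrangement). It is not the paper's implementation. The function names, the nearest-neighbor quantization, the pixel `radius`, the convention of counting each unordered pair once, and the use of histogram intersection to compare two representations are all assumptions made for illustration.

```python
import numpy as np


def quantize(descriptors, dictionary):
    """Map each local descriptor to the index of its nearest visual word.

    descriptors: (n, d) array of local features (hypothetical input).
    dictionary:  (n_words, d) array of visual-word centers, e.g. from k-means.
    """
    # Broadcast to (n, n_words, d), then reduce to squared distances (n, n_words).
    d2 = ((descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def bovw_histogram(words, n_words):
    """Non-spatial BOVW: normalized word frequencies, locations ignored."""
    hist = np.bincount(words, minlength=n_words).astype(float)
    return hist / max(hist.sum(), 1.0)


def cooccurrence_matrix(words, positions, n_words, radius):
    """Count pairs of visual words whose keypoints lie within `radius` pixels.

    Assumption: each unordered pair is counted once and stored in the
    upper triangle (smaller word index first).
    """
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Strict upper triangle keeps each unordered pair once and drops self-pairs.
    i, j = np.nonzero(np.triu(dist <= radius, k=1))
    lo = np.minimum(words[i], words[j])
    hi = np.maximum(words[i], words[j])
    mat = np.zeros((n_words, n_words))
    np.add.at(mat, (lo, hi), 1.0)  # accumulates correctly for repeated index pairs
    return mat


def intersection_kernel(a, b):
    """Histogram intersection: sum of element-wise minima."""
    return float(np.minimum(a, b).sum())


# Toy usage with made-up data: two visual words, four keypoints.
dictionary = np.array([[0.0, 0.0], [10.0, 10.0]])
descriptors = np.array([[1.0, 0.0], [9.0, 10.0], [0.0, 1.0], [10.0, 9.0]])
positions = np.array([[0.0, 0.0], [1.0, 0.0], [50.0, 50.0], [51.0, 50.0]])

words = quantize(descriptors, dictionary)
hist = bovw_histogram(words, n_words=2)
cooc = cooccurrence_matrix(words, positions, n_words=2, radius=2.0)
print(hist, cooc, intersection_kernel(hist, hist), sep="\n")
```

In this toy setup the histogram discards the keypoint positions entirely, while the co-occurrence matrix changes if the same words are rearranged so that different pairs fall within the radius. That difference is the motivation the abstract gives for the spatial extensions.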

Information

Published In

GIS '10: Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems

November 2010

566 pages

ISBN:9781450304283

DOI:10.1145/1869790

General Chairs: Divyakant Agrawal (University of California at Santa Barbara), Pusheng Zhang (Microsoft Corporation)

Program Chairs: Amr El Abbadi (University of California, Santa Barbara), Mohamed Mokbel (University of Minnesota)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGSPATIAL: ACM Special Interest Group on Spatial Information

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Division of Information and Intelligent Systems

Conference

GIS '10: 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems

Sponsor: SIGSPATIAL

November 2 - 5, 2010

San Jose, California

Acceptance Rates

Overall Acceptance Rate 257 of 1,238 submissions, 21%

Article Metrics

Total Citations: 1,946
Total Downloads: 3,142
Downloads (Last 12 months): 444
Downloads (Last 6 weeks): 56

Reflects downloads up to 17 Jan 2025
