
Learning social tag relevance by neighbor voting

Published: 01 November 2009

Abstract

Social image analysis and retrieval is important for helping people organize and access the increasing amount of user-tagged multimedia. Since user tagging is known to be uncontrolled, ambiguous, and overly personalized, a fundamental problem is how to interpret the relevance of a user-contributed tag with respect to the visual content the tag is describing. Intuitively, if different persons label visually similar images using the same tags, these tags are likely to reflect objective aspects of the visual content. Starting from this intuition, we propose in this paper a neighbor voting algorithm which accurately and efficiently learns tag relevance by accumulating votes from visual neighbors. Under a set of well-defined and realistic assumptions, we prove that our algorithm is a good tag relevance measurement for both image ranking and tag ranking. Three experiments on 3.5 million Flickr photos demonstrate the general applicability of our algorithm in both social image retrieval and image tag suggestion. Our tag relevance learning algorithm substantially improves upon baselines for all the experiments. The results suggest that the proposed algorithm is promising for real-world applications.
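The neighbor-voting idea above can be sketched in a few lines: a tag on a seed image gains a vote for every visually similar image (from other users) that also carries it, and common tags are discounted by the number of votes random neighbors would contribute. This is an illustrative sketch, not the paper's exact estimator; the helper name, the prior-correction term, and the data layout are assumptions.

```python
from collections import Counter

def tag_relevance(image_tags, neighbor_ids, all_tags, num_images):
    """Estimate tag relevance for one image by neighbor voting.

    image_tags   : set of tags on the seed image
    neighbor_ids : ids of its k visually most similar images
    all_tags     : dict mapping image id -> set of tags (the collection)
    num_images   : total number of images in the collection
    """
    k = len(neighbor_ids)
    # Count how many visual neighbors carry each tag ("votes").
    votes = Counter(t for nid in neighbor_ids for t in all_tags[nid])
    # Collection-wide tag frequency, used to discount very common tags.
    freq = Counter(t for tags in all_tags.values() for t in tags)
    relevance = {}
    for tag in image_tags:
        # Expected number of votes if the k neighbors were drawn at random;
        # subtracting it keeps frequent-but-uninformative tags from dominating.
        prior = k * freq[tag] / num_images
        relevance[tag] = votes[tag] - prior
    return relevance
```

With a toy collection `{1: {"cat", "pet"}, 2: {"cat"}, 3: {"dog"}, 4: {"cat", "sunset"}}` and neighbors `[1, 2, 4]`, the tag "cat" (three votes) scores above "sunset" (one vote), matching the intuition that tags repeated across different users' similar images reflect objective visual content.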

Published In

IEEE Transactions on Multimedia, Volume 11, Issue 7
November 2009
179 pages

Publisher

IEEE Press

Publication History

Published: 01 November 2009
Revised: 13 April 2009
Received: 05 January 2009

Author Tags

  1. multimedia indexing and retrieval
  2. neighbor voting
  3. social tagging
  4. tag relevance learning

Qualifiers

  • Research-article
