
Learning social tag relevance by neighbor voting

Published: 01 November 2009

Abstract

Social image analysis and retrieval is important for helping people organize and access the increasing amount of user-tagged multimedia. Since user tagging is known to be uncontrolled, ambiguous, and overly personalized, a fundamental problem is how to interpret the relevance of a user-contributed tag with respect to the visual content the tag is describing. Intuitively, if different persons label visually similar images using the same tags, these tags are likely to reflect objective aspects of the visual content. Starting from this intuition, we propose in this paper a neighbor voting algorithm which accurately and efficiently learns tag relevance by accumulating votes from visual neighbors. Under a set of well-defined and realistic assumptions, we prove that our algorithm is a good tag relevance measurement for both image ranking and tag ranking. Three experiments on 3.5 million Flickr photos demonstrate the general applicability of our algorithm in both social image retrieval and image tag suggestion. Our tag relevance learning algorithm substantially improves upon baselines for all the experiments. The results suggest that the proposed algorithm is promising for real-world applications.
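The neighbor-voting idea above can be sketched in a few lines: a tag on a seed image gains a vote for every visually similar image (from other users) that also carries it, and common tags are discounted by the number of votes random neighbors would contribute. This is an illustrative sketch, not the paper's exact estimator; the helper name, the prior-correction term, and the data layout are assumptions.

```python
from collections import Counter

def tag_relevance(image_tags, neighbor_ids, all_tags, num_images):
    """Estimate tag relevance for one image by neighbor voting.

    image_tags   : set of tags on the seed image
    neighbor_ids : ids of its k visually most similar images
    all_tags     : dict mapping image id -> set of tags (the collection)
    num_images   : total number of images in the collection
    """
    k = len(neighbor_ids)
    # Count how many visual neighbors carry each tag ("votes").
    votes = Counter(t for nid in neighbor_ids for t in all_tags[nid])
    # Collection-wide tag frequency, used to discount very common tags.
    freq = Counter(t for tags in all_tags.values() for t in tags)
    relevance = {}
    for tag in image_tags:
        # Expected number of votes if the k neighbors were drawn at random;
        # subtracting it keeps frequent-but-uninformative tags from dominating.
        prior = k * freq[tag] / num_images
        relevance[tag] = votes[tag] - prior
    return relevance
```

With a toy collection `{1: {"cat", "pet"}, 2: {"cat"}, 3: {"dog"}, 4: {"cat", "sunset"}}` and neighbors `[1, 2, 4]`, the tag "cat" (three votes) scores above "sunset" (one vote), matching the intuition that tags repeated across different users' similar images reflect objective visual content.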

Published In

IEEE Transactions on Multimedia, Volume 11, Issue 7
November 2009
179 pages

Publisher

IEEE Press

Publication History

Published: 01 November 2009
Revised: 13 April 2009
Received: 05 January 2009

Author Tags

  1. multimedia indexing and retrieval
  2. neighbor voting
  3. social tagging
  4. tag relevance learning

Qualifiers

  • Research-article
