research-article

Pseudo Transfer with Marginalized Corrupted Attribute for Zero-shot Learning

Authors:

Heng Tao Shen

MM '18: Proceedings of the 26th ACM international conference on Multimedia

Pages 1802 - 1810

https://doi.org/10.1145/3240508.3240715

Published: 15 October 2018

Abstract

Zero-shot learning (ZSL) aims to recognize unseen classes that are excluded from the training classes. ZSL suffers from two problems: 1) zero-shot bias (Z-Bias): the model is biased towards seen classes because unseen data is inaccessible during training; and 2) zero-shot variance (Z-Variance): associating different images with the same semantic embedding yields a large association error. To reduce Z-Bias, we propose a pseudo transfer mechanism in which we first synthesize the distribution of unseen data using semantic embeddings and then minimize the mismatch between the seen distribution and the synthesized unseen distribution. To reduce Z-Variance, we implicitly corrupt one semantic embedding multiple times to generate image-wise semantic vectors, with which our model learns robust classifiers. Lastly, we integrate our Z-Bias and Z-Variance reduction techniques with a linear ZSL model to show their usefulness. Our proposed model successfully overcomes the Z-Bias and Z-Variance problems. Extensive experiments on five benchmark datasets, including ImageNet-1K, demonstrate that our model outperforms state-of-the-art methods while training quickly.
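
To make the idea of "implicitly" corrupting a semantic embedding concrete, the following is a minimal sketch of marginalized corruption for a linear model, in the spirit of learning with marginalized corrupted features. It is not the paper's exact formulation: the mapping direction (semantic to visual), the squared loss, the unbiased blankout noise, and the names `blankout_second_moment`, `fit_marginalized_linear_map`, `q`, and `lam` are all assumptions made for illustration. The point it shows is that the expected loss over infinitely many corrupted copies has a closed form, so no corrupted copy is ever sampled.

```python
import numpy as np

def blankout_second_moment(S, q):
    """Return E[S_tilde^T S_tilde] under unbiased blankout noise.

    Each entry of S is zeroed with probability q and otherwise scaled by
    1 / (1 - q), so E[S_tilde] = S. Off-diagonal second moments are
    unchanged; diagonal ones grow by a factor of 1 / (1 - q).
    """
    if not 0.0 <= q < 1.0:
        raise ValueError("q must be in [0, 1)")
    G = S.T @ S
    return G + (q / (1.0 - q)) * np.diag(np.diag(G))

def fit_marginalized_linear_map(S, X, q=0.3, lam=1e-2):
    """Minimize E||X - S_tilde W||^2 + lam * ||W||^2 in closed form.

    S: (n, d) per-image semantic vectors (hypothetical layout).
    X: (n, k) visual features (hypothetical layout).
    """
    Q = blankout_second_moment(S, q)   # expected second moment of corrupted S
    P = S.T @ X                        # E[S_tilde]^T X, since the noise is unbiased
    return np.linalg.solve(Q + lam * np.eye(S.shape[1]), P)

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
class_attrs = rng.random((5, 8))           # 5 seen classes, 8 attributes each
labels = rng.integers(0, 5, size=200)      # class label of each image
S = class_attrs[labels]                    # every image inherits its class vector
X = S @ rng.normal(size=(8, 16)) + 0.1 * rng.normal(size=(200, 16))

W = fit_marginalized_linear_map(S, X)
print(W.shape)
```

With `q = 0` the solution reduces to ordinary ridge regression; larger `q` adds a data-dependent penalty on attributes with large energy, which is what makes the learned map less sensitive to per-image variation around a shared class embedding.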

Cited By

Gu Z, Zhou S, Niu L, Zhao Z, Zhang L (2023). From Pixel to Patch: Synthesize Context-Aware Features for Zero-Shot Semantic Segmentation. IEEE Transactions on Neural Networks and Learning Systems 34:10, pp. 7689-7703. DOI: 10.1109/TNNLS.2022.3145962. Online publication date: Oct-2023.
https://doi.org/10.1109/TNNLS.2022.3145962
Liu Z, Guo S, Lu X, Guo J, Zhang J, Zeng Y, Huo F (2023). (ML)2P-Encoder: On Exploration of Channel-Class Correlation for Multi-Label Zero-Shot Learning. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 23859-23868. DOI: 10.1109/CVPR52729.2023.02285. Online publication date: Jun-2023.
https://doi.org/10.1109/CVPR52729.2023.02285
Fan L, Chen X, Chai Y, Lin W (2023). Attribute fusion transfer for zero-shot fault diagnosis. Advanced Engineering Informatics 58, article 102204. DOI: 10.1016/j.aei.2023.102204. Online publication date: Oct-2023.
https://doi.org/10.1016/j.aei.2023.102204

Index Terms

Pseudo Transfer with Marginalized Corrupted Attribute for Zero-shot Learning
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Clustering and classification
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Machine learning theory
      1. Unsupervised learning and clustering

Published In

MM '18: Proceedings of the 26th ACM international conference on Multimedia

October 2018

2167 pages

ISBN:9781450356657

DOI:10.1145/3240508

General Chairs:
Susanne Boll (University of Oldenburg, Germany),
Kyoung Mu Lee (Seoul National University, Korea),
Jiebo Luo (University of Rochester, USA),
Wenwu Zhu (Tsinghua University, China)

Program Chairs:
Hyeran Byun (Yonsei University, Korea),
Chang Wen Chen (State Univ. of New York at Buffalo, USA),
Rainer Lienhart (University of Augsburg, Germany),
Tao Mei (JD AI, China)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China

Conference

MM '18

Sponsor:

SIGMM

MM '18: ACM Multimedia Conference

October 22 - 26, 2018

Seoul, Republic of Korea

Acceptance Rates

MM '18 Paper Acceptance Rate 209 of 757 submissions, 28%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics

Total citations: 20
Total downloads: 343
Downloads (last 12 months): 16
Downloads (last 6 weeks): 1

Reflects downloads up to 29 Jan 2025

Citations

Cited By

Wang K, Wang Y, Xu X, Liu X, Ou W, Lu H, Magalhães J, del Bimbo A, Satoh S, Sebe N, Alameda-Pineda X, Jin Q, Oria V, Toni L (2022). Prototype-based Selective Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval. Proceedings of the 30th ACM International Conference on Multimedia, pp. 601-609. DOI: 10.1145/3503161.3548382. Online publication date: 10-Oct-2022.
https://dl.acm.org/doi/10.1145/3503161.3548382
Pourpanah F, Abdar M, Luo Y, Zhou X, Wang R, Lim C, Wang X, Wu Q (2022). A Review of Generalized Zero-Shot Learning Methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1-20. DOI: 10.1109/TPAMI.2022.3191696. Online publication date: 2022.
https://doi.org/10.1109/TPAMI.2022.3191696
Zhang Z, Yang G (2022). Exploring Attribute Space with Word Embedding for Zero-shot Learning. 2022 International Joint Conference on Neural Networks (IJCNN), pp. 1-8. DOI: 10.1109/IJCNN55064.2022.9892132. Online publication date: 18-Jul-2022.
https://doi.org/10.1109/IJCNN55064.2022.9892132
Bai H, Zhang H, Wang Q (2021). Dual discriminative auto-encoder network for zero shot image recognition. Journal of Intelligent & Fuzzy Systems, pp. 1-12. DOI: 10.3233/JIFS-201920. Online publication date: 11-Jan-2021.
https://doi.org/10.3233/JIFS-201920
Tian J, Xu X, Wang Z, Shen F, Liu X, Shen H, Zhuang Y, Smith J, Yang Y, Cesar P, Metze F, Prabhakaran B (2021). Relationship-Preserving Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval. Proceedings of the 29th ACM International Conference on Multimedia, pp. 5473-5481. DOI: 10.1145/3474085.3475676. Online publication date: 17-Oct-2021.
https://dl.acm.org/doi/10.1145/3474085.3475676
Wang X, Li Q, Gong P, Cheng Y (2021). Zero-Shot Learning Based on Multitask Extended Attribute Groups. IEEE Transactions on Systems, Man, and Cybernetics: Systems 51:3, pp. 2003-2011. DOI: 10.1109/TSMC.2019.2912206. Online publication date: Mar-2021.
https://doi.org/10.1109/TSMC.2019.2912206
Min S, Yao H, Xie H, Zha Z, Zhang Y (2021). Domain-Oriented Semantic Embedding for Zero-Shot Learning. IEEE Transactions on Multimedia 23, pp. 3919-3930. DOI: 10.1109/TMM.2020.3033124. Online publication date: 2021.
https://doi.org/10.1109/TMM.2020.3033124