DOI: 10.1145/3581783.3611906
research-article

Semi-supervised Semantic Segmentation with Mutual Knowledge Distillation

Published: 27 October 2023

Abstract

Consistency regularization has been widely studied in recent semi-supervised semantic segmentation methods, and promising performance has been achieved. In this work, we propose a new consistency regularization framework, termed mutual knowledge distillation (MKD), combined with data and feature augmentation. We introduce two auxiliary mean-teacher models based on consistency regularization. More specifically, we use the pseudo-labels generated by a mean teacher to supervise the student network to achieve a mutual knowledge distillation between the two branches. In addition to using image-level strong and weak augmentation, we also discuss feature augmentation. This involves considering various sources of knowledge to distill the student network. Thus, we can significantly increase the diversity of the training samples. Experiments on public benchmarks show that our framework outperforms previous state-of-the-art (SOTA) methods under various semi-supervised settings. Code is available at https://github.com/jianlong-yuan/semi-mmseg.
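
The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the momentum value, the confidence threshold, the cross-entropy loss, and the exact pairing (the mean teacher of one branch supervising the student of the other branch) are all assumptions made for illustration. Predictions are represented as plain per-pixel probability lists so the example runs without any deep learning library.

```python
import math

def ema_update(teacher, student, momentum=0.99):
    """Mean-teacher step: teacher <- momentum * teacher + (1 - momentum) * student."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]

def pseudo_labels(probs, threshold=0.0):
    """Per-pixel argmax label; pixels below the confidence threshold get -1 (ignored)."""
    labels = []
    for p in probs:
        best = max(range(len(p)), key=p.__getitem__)
        labels.append(best if p[best] >= threshold else -1)
    return labels

def cross_entropy(probs, labels):
    """Mean negative log-likelihood over pixels whose label is not -1."""
    losses = [-math.log(max(p[y], 1e-12)) for p, y in zip(probs, labels) if y != -1]
    return sum(losses) / len(losses) if losses else 0.0

def mutual_distillation_loss(student_a, student_b, teacher_a, teacher_b, threshold=0.0):
    """Assumed pairing: teacher A's pseudo-labels supervise student B, and vice versa."""
    labels_from_a = pseudo_labels(teacher_a, threshold)
    labels_from_b = pseudo_labels(teacher_b, threshold)
    return cross_entropy(student_b, labels_from_a) + cross_entropy(student_a, labels_from_b)

# Toy usage: one pixel, two classes, hypothetical probabilities.
loss = mutual_distillation_loss(
    student_a=[[0.5, 0.5]], student_b=[[0.5, 0.5]],
    teacher_a=[[0.9, 0.1]], teacher_b=[[0.2, 0.8]],
)
print(loss)
```

In a real training loop the teacher predictions would presumably come from weakly augmented inputs and the student predictions from strongly augmented ones, with `ema_update` applied to each teacher after every student optimizer step; those details are likewise assumptions here.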


    Published In

    MM '23: Proceedings of the 31st ACM International Conference on Multimedia
    October 2023
    9913 pages
    ISBN: 9798400701085
    DOI: 10.1145/3581783
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. consistency regularization
    2. data and network augmentation
    3. mutual knowledge distillation
    4. semi-supervised semantic segmentation

    Qualifiers

    • Research-article

    Conference

    MM '23
    MM '23: The 31st ACM International Conference on Multimedia
    October 29 - November 3, 2023
    Ottawa ON, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
