DGMamba: Domain Generalization via Generalized State Space Model

Authors:

Shuicheng Yan

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 3607 - 3616

https://doi.org/10.1145/3664647.3681247

Published: 28 October 2024

Abstract

Domain generalization (DG) aims at solving distribution shift problems in various scenes. Existing approaches are based on Convolutional Neural Networks (CNNs) or Vision Transformers (ViTs), which suffer from limited receptive fields or quadratic complexity, respectively. Mamba, as an emerging state space model (SSM), possesses linear complexity and global receptive fields. Despite this, it can hardly be applied to DG to address distribution shifts, due to hidden state issues and inappropriate scan mechanisms. In this paper, we propose a novel framework for DG, named DGMamba, that generalizes strongly to unseen domains while retaining the advantages of global receptive fields and efficient linear complexity. DGMamba comprises two core components: Hidden State Suppressing (HSS) and Semantic-aware Patch Refining (SPR). In particular, HSS is introduced to mitigate the influence of hidden states associated with domain-specific features during output prediction. SPR encourages the model to concentrate more on objects than on context, and consists of two designs: Prior-Free Scanning (PFS) and Domain Context Interchange (DCI). Concretely, PFS shuffles the non-semantic patches within images, creating more flexible and effective sequences from images, and DCI regularizes Mamba with combinations of mismatched non-semantic and semantic information by fusing patches among domains. Extensive experiments on four commonly used DG benchmarks demonstrate that the proposed DGMamba achieves results superior to those of state-of-the-art models. The code will be made publicly available at https://github.com/longshaocong/DGMamba.
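As a rough illustration of the two SPR designs described in the abstract, the following NumPy sketch shuffles non-semantic patches within one image (in the spirit of PFS) and swaps non-semantic patches between two images (in the spirit of DCI). It is not the authors' implementation: the function names, the patch layout, and the boolean `semantic_mask` are assumptions made for illustration, and the abstract does not say how semantic patches are identified, so the mask is simply taken as given.

```python
import numpy as np

def shuffle_non_semantic(patches, semantic_mask, rng):
    """Permute only the patches flagged as non-semantic (hypothetical PFS-style step).

    patches: array of shape (num_patches, feature_dim)
    semantic_mask: boolean array of shape (num_patches,), True for object patches
    """
    out = patches.copy()
    idx = np.flatnonzero(~semantic_mask)      # positions of non-semantic patches
    out[idx] = patches[rng.permutation(idx)]  # reorder them among themselves
    return out

def interchange_context(patches_a, patches_b, semantic_mask_a):
    """Keep the semantic patches of image A and fill the remaining positions
    with patches from image B (hypothetical DCI-style step)."""
    return np.where(semantic_mask_a[:, None], patches_a, patches_b)

rng = np.random.default_rng(0)
patches_a = rng.normal(size=(16, 8))  # image from one domain: 16 patches, 8 features
patches_b = rng.normal(size=(16, 8))  # image from another domain
mask = np.zeros(16, dtype=bool)
mask[[5, 6, 9, 10]] = True            # assumed semantic (object) patches

shuffled = shuffle_non_semantic(patches_a, mask, rng)
mixed = interchange_context(patches_a, patches_b, mask)
```

In both functions the semantic patches stay at their original positions, so only the context around the object changes.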

Index Terms

DGMamba: Domain Generalization via Generalized State Space Model
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Image representations

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN:9798400706868

DOI:10.1145/3664647

General Chairs: Jianfei Cai (Monash University, Australia), Mohan Kankanhalli (NUS, Singapore), Balakrishnan Prabhakaran (UT Dallas, USA), Susanne Boll (University of Oldenburg, Germany)

Program Chairs: Ramanathan Subramanian (University of Canberra & IIT Ropar, Australia), Liang Zheng (Australian National University, Australia), Vivek K. Singh (Rutgers University, USA), Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands), Lexing Xie (Australian National University, Australia), Dong Xu (University of Hong Kong, Hong Kong)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Key R&D Program of China

Conference

MM '24

Sponsor:

SIGMM

MM '24: The 32nd ACM International Conference on Multimedia

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
