research-article

RainMamba: Enhanced Locality Learning with State Space Models for Video Deraining

Authors:

Lei Zhu

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 7881 - 7890

https://doi.org/10.1145/3664647.3680916

Published: 28 October 2024

Abstract

Outdoor vision systems are frequently degraded by rain streaks and raindrops, which significantly reduce the performance of visual tasks and multimedia applications. Videos provide redundant temporal cues that enable more stable rain removal. Traditional video deraining methods rely heavily on optical flow estimation and kernel-based approaches, which have a limited receptive field. Transformer architectures capture long-term dependencies, but at the cost of a significant increase in computational complexity. Recently, the linear-complexity operator of state space models (SSMs) has enabled efficient long-term temporal modeling, which is crucial for removing rain streaks and raindrops from videos. However, its one-dimensional sequential processing of videos destroys local correlations across the spatio-temporal dimensions by placing adjacent pixels far apart in the sequence. To address this, we present an improved SSM-based video deraining network (RainMamba) with a novel Hilbert scanning mechanism that better captures sequence-level local information. We also introduce a difference-guided dynamic contrastive locality learning strategy to enhance the network's patch-level self-similarity learning. Extensive experiments on four synthesized video deraining datasets and on real-world rainy videos demonstrate the superiority of our network in removing rain streaks and raindrops. Our code and results are available at https://github.com/TonyHongtaoWu/RainMamba.
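The locality issue raised in the abstract can be illustrated with a minimal sketch. The code below is not taken from the RainMamba repository. It uses the standard 2D Hilbert-curve index-to-coordinate conversion, and the function names (`hilbert_d2xy`, `hilbert_scan`, `raster_scan`, `max_step`) are hypothetical. It also assumes a single square frame whose side is a power of two, whereas the paper scans across the spatio-temporal dimensions of a video. The sketch compares how far apart consecutive sequence elements lie in the frame under a row-major scan and under a Hilbert scan.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def hilbert_d2xy(order: int, d: int) -> Coord:
    """Map position d along a Hilbert curve to (x, y) on a 2**order square grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_scan(order: int) -> List[Coord]:
    """Visit every cell of the grid in Hilbert-curve order."""
    n = 1 << order
    return [hilbert_d2xy(order, d) for d in range(n * n)]


def raster_scan(order: int) -> List[Coord]:
    """Visit every cell of the grid row by row (the usual flattening)."""
    n = 1 << order
    return [(x, y) for y in range(n) for x in range(n)]


def max_step(path: List[Coord]) -> int:
    """Largest Manhattan distance between consecutive sequence elements."""
    return max(
        abs(x1 - x0) + abs(y1 - y0)
        for (x0, y0), (x1, y1) in zip(path, path[1:])
    )


if __name__ == "__main__":
    order = 3  # 8 x 8 grid, an illustrative size only
    print("raster max step: ", max_step(raster_scan(order)))
    print("hilbert max step:", max_step(hilbert_scan(order)))
```

In the row-major scan, the sequence jumps from the end of one row to the start of the next, so some consecutive tokens are far apart in the frame. In the Hilbert scan, every consecutive pair of tokens is spatially adjacent. This is one way to read the abstract's claim that Hilbert scanning better preserves sequence-level local information.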

Cited By

Zhou H, Wang H, Ye T, Xing Z, Ma J, Li P, Wang Q, Zhu L. (2024). Timeline and Boundary Guided Diffusion Network for Video Shadow Detection. In Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D (eds.), Proceedings of the 32nd ACM International Conference on Multimedia, 166-175. Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3681236

Wang H, Wang W, Zhou H, Xu H, Wu S, Zhu L. (2024). Language-Driven Interactive Shadow Detection. In Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D (eds.), Proceedings of the 32nd ACM International Conference on Multimedia, 5527-5536. Online publication date: 28-Oct-2024.
https://dl.acm.org/doi/10.1145/3664647.3681192

Index Terms

RainMamba: Enhanced Locality Learning with State Space Models for Video Deraining
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN: 9798400706868

DOI: 10.1145/3664647

General Chairs:
Jianfei Cai (Monash University, Australia)
Mohan Kankanhalli (NUS, Singapore)
Balakrishnan Prabhakaran (UT Dallas, USA)
Susanne Boll (University of Oldenburg, Germany)

Program Chairs:
Ramanathan Subramanian (University of Canberra & IIT Ropar, Australia)
Liang Zheng (Australian National University, Australia)
Vivek K. Singh (Rutgers University, USA)
Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands)
Lexing Xie (Australian National University, Australia)
Dong Xu (University of Hong Kong, Hong Kong)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Nansha Key Area Science and Technology Project
Guangzhou-HKUST(GZ) Joint Funding Program
Guangzhou Municipal Science and Technology Project
Guangzhou Industrial Information and Intelligent Key Laboratory Project

Conference

MM '24

Sponsor:

SIGMM

MM '24: The 32nd ACM International Conference on Multimedia

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
