DOI: 10.1145/3664647.3681510
research-article

Cross-Modal Meta Consensus for Heterogeneous Federated Learning

Published: 28 October 2024

Abstract

In the evolving landscape of federated learning (FL), the integration of multimodal data presents both unprecedented opportunities and significant challenges. Existing works fall short of meeting the growing demand for systems that can efficiently handle diverse tasks and modalities in rapidly changing environments. We propose a meta learning strategy tailored for Multimodal Federated Learning in a multitask setting, which harmonizes intra-modal and inter-modal feature spaces through the Cross-Modal Meta Consensus. This innovative approach enables seamless integration and transfer of knowledge across different data types, enhancing task personalization within modalities and facilitating effective cross-modality knowledge sharing. Additionally, we introduce Gradient Consistency-based Clustering for multimodal convergence, specifically designed to resolve conflicts at meta-initialization points arising from diverse modality distributions, supported by theoretical guarantees. Our approach, evaluated as M3Fed on five federated datasets, with at most four modalities and four downstream tasks, demonstrates strong performance across diverse data distributions, affirming its effectiveness in Multimodal Federated Learning.
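The abstract names Gradient Consistency-based Clustering but does not specify the procedure. As a rough illustration only, and not the authors' algorithm, the sketch below groups clients whose flattened gradients point in similar directions, using cosine similarity and connected components. The function names, the 0.5 threshold, and the toy gradients are all assumptions introduced for this example.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero gradient agrees with nothing
    return dot / (norm_u * norm_v)


def cluster_by_gradient_consistency(
    grads: Sequence[Sequence[float]], threshold: float = 0.5
) -> List[List[int]]:
    """Group client indices whose gradients are pairwise consistent.

    Two clients are linked when their cosine similarity is at least
    `threshold` (a hypothetical hyperparameter); clusters are the
    connected components of that link graph, found with union-find.
    """
    n = len(grads)
    parent = list(range(n))

    def find(i: int) -> int:
        # Walk to the root, compressing the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(grads[i], grads[j]) >= threshold:
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy gradients: clients 0 and 1 roughly agree, as do clients 2 and 3,
# while the two pairs point in opposite directions.
toy_grads = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [-0.8, -0.2]]
print(cluster_by_gradient_consistency(toy_grads))
```

In a setting like the one the abstract describes, each resulting cluster could keep its own meta-initialization so that conflicting gradients are not averaged together; how M3Fed actually forms and uses its clusters is specified in the paper, not here.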

Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
October 2024
11719 pages
ISBN:9798400706868
DOI:10.1145/3664647

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. meta learning
  2. multimodal federated learning

Qualifiers

  • Research-article

Funding Sources

  • NSFC
  • Tianjin Natural Science Foundation

Conference

MM '24
MM '24: The 32nd ACM International Conference on Multimedia
October 28 - November 1, 2024
Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
