DOI: 10.1145/3581783.3612181

Few-shot Multimodal Sentiment Analysis Based on Multimodal Probabilistic Fusion Prompts

Published: 27 October 2023

Abstract

Multimodal sentiment analysis has gained significant attention due to the proliferation of multimodal content on social media. However, existing studies in this area rely heavily on large-scale supervised data, which is time-consuming and labor-intensive to collect. Thus, there is a need to address the challenge of few-shot multimodal sentiment analysis. To tackle this problem, we propose a novel method called Multimodal Probabilistic Fusion Prompts (MultiPoint) that leverages diverse cues from different modalities for multimodal sentiment detection in the few-shot scenario. Specifically, we start by introducing a Consistently Distributed Sampling approach called CDS, which ensures that the few-shot dataset has the same category distribution as the full dataset. Unlike previous approaches, which primarily use prompts based on the text modality, we design unified multimodal prompts to reduce discrepancies between different modalities and dynamically incorporate multimodal demonstrations into the context of each multimodal instance. To enhance the model's robustness, we introduce a probabilistic fusion method that fuses the output predictions from multiple diverse prompts for each input. Our extensive experiments on six datasets demonstrate the effectiveness of our approach. First, our method outperforms strong baselines in the multimodal few-shot setting. Furthermore, with the same amount of data (1% of the full dataset), results based on CDS significantly outperform those based on datasets sampled, as in prior work, with an equal number of instances per class.
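To make the sampling and fusion ideas in the abstract concrete, the following minimal Python sketch is offered purely for illustration; it is not the authors' released code, and the function names, the normalized-product fusion rule, and the rounding details are assumptions. It shows (1) a few-shot sampler in the spirit of CDS that preserves the full dataset's label distribution, and (2) a simple probabilistic fusion of class distributions predicted under several different prompts.

```python
# Illustrative sketch only (NOT the paper's implementation).
import random
from collections import defaultdict

def cds_sample(examples, labels, budget, seed=0):
    """Draw ~`budget` examples whose per-class proportions mirror the full set,
    instead of taking an equal number of instances per class."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for ex, y in zip(examples, labels):
        by_label[y].append(ex)
    total = len(examples)
    subset = []
    for y, items in by_label.items():
        # Keep the class ratio; totals may deviate slightly from `budget`
        # because of rounding and the minimum of one instance per class.
        k = max(1, round(budget * len(items) / total))
        subset.extend(rng.sample(items, min(k, len(items))))
    return subset

def fuse_predictions(prob_dists):
    """Fuse per-prompt class distributions with a normalized product rule
    (one of several reasonable probabilistic fusion choices)."""
    fused = [1.0] * len(prob_dists[0])
    for dist in prob_dists:
        fused = [f * p for f, p in zip(fused, dist)]
    z = sum(fused)
    return [f / z for f in fused]

# Example: three prompts voting over {negative, neutral, positive}.
prompts_out = [[0.2, 0.3, 0.5], [0.1, 0.2, 0.7], [0.3, 0.3, 0.4]]
print(fuse_predictions(prompts_out))  # distribution peaked on "positive"
```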




Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia
October 2023
9913 pages
ISBN: 9798400701085
DOI: 10.1145/3581783

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 October 2023


Author Tags

  1. consistently distributed sampling
  2. multimodal demonstrations
  3. multimodal few-shot
  4. multimodal probabilistic fusion
  5. multimodal sentiment analysis
  6. unified multimodal prompt

Qualifiers

  • Research-article

Conference

MM '23: The 31st ACM International Conference on Multimedia
October 29 - November 3, 2023
Ottawa, ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
