Ensemble of Deep Models for Event Recognition

Authors:

Mohamed Lamine Mekhalfi,

Francesco De Natale

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 14, Issue 2

Article No.: 51, Pages 1 - 20

https://doi.org/10.1145/3199668

Published: 01 May 2018

Abstract

In this article, we address the problem of recognizing an event from a single related picture. Given the large number of event classes and the limited information contained in a single shot, the problem is known to be particularly hard. To achieve reliable detection, we propose a combination of multiple classifiers, and we compare three alternative strategies for fusing the classifiers' results: (i) induced ordered weighted averaging (IOWA) operators, (ii) genetic algorithms, and (iii) particle swarm optimization. Each method aims to determine the optimal weights to be assigned to the decision scores yielded by the different deep models, according to its own optimization strategy. Experimental tests have been performed on three event recognition datasets, evaluating the performance of various deep models, both alone and selectively combined. The results demonstrate that the proposed approach outperforms both traditional multiple-classifier solutions based on uniform weighting and recent state-of-the-art approaches.
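The core idea of the abstract, weighting the decision scores of several models before picking a class, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `fuse_scores` and `predict`, the three models, and all score and weight values are hypothetical, and the weights are set by hand rather than found by IOWA, a genetic algorithm, or particle swarm optimization.

```python
from typing import List, Sequence


def fuse_scores(scores: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Weighted average of per-model class scores.

    scores[m][c] is the score that model m assigns to class c.
    weights[m] is the (non-negative) weight given to model m.
    """
    if len(scores) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(scores[0])
    return [
        sum(w * model[c] for w, model in zip(weights, scores)) / total
        for c in range(n_classes)
    ]


def predict(scores: Sequence[Sequence[float]],
            weights: Sequence[float]) -> int:
    """Index of the class with the highest fused score."""
    fused = fuse_scores(scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical scores from three models over three event classes.
scores = [
    [0.50, 0.30, 0.20],  # model A favours class 0
    [0.45, 0.35, 0.20],  # model B favours class 0
    [0.05, 0.90, 0.05],  # model C strongly favours class 1
]

uniform = [1.0, 1.0, 1.0]  # baseline: every model counts equally
tuned = [1.0, 1.0, 0.2]    # illustrative only: model C is down-weighted

print(predict(scores, uniform))  # class chosen under uniform weighting
print(predict(scores, tuned))    # class chosen under the hand-set weights
```

With these made-up numbers the two weightings select different classes, which is why the choice of weights matters. An optimizer such as those compared in the article would search for the weight vector that maximizes accuracy on held-out data instead of relying on hand-set values.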

Index Terms

Ensemble of Deep Models for Event Recognition
1. Information systems
  1. Information retrieval

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 14, Issue 2

May 2018

208 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/3210458

Editor:
Alberto Del Bimbo
University of Firenze, Italy

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 May 2018

Accepted: 01 March 2018

Revised: 01 March 2018

Received: 01 September 2017

Published in TOMM Volume 14, Issue 2

Qualifiers

Research-article
Research
Refereed

Article Metrics

Total Citations: 38
Total Downloads: 303
Downloads (Last 12 months): 7
Downloads (Last 6 weeks): 1

Reflects downloads up to 16 Jan 2025

Cited By

Tahmasebzadeh G, Springstein M, Ewerth R, Müller-Budack E. (2024). Few-Shot Event Classification in Images using Knowledge Graphs for Prompting. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 7271-7280. Online publication date: 3-Jan-2024.
https://doi.org/10.1109/WACV57701.2024.00712
Masood S, Hussain A, Khan M, Trehan P. (2024). Indian Cultural event detection using GBVS boosted deep convolutional neural networks. Procedia Computer Science 235, 394-402. Online publication date: 2024.
https://doi.org/10.1016/j.procs.2024.04.039
Wang D, Yang G, Guo Z, Chen J. (2024). Improving image steganography security via ensemble steganalysis and adversarial perturbation minimization. Journal of Information Security and Applications 85, 103835. Online publication date: Sep-2024.
https://doi.org/10.1016/j.jisa.2024.103835
Zhang J, Feng H, Liu B, Zhao D. (2023). Survey of Technology in Network Security Situation Awareness. Sensors 23:5, 2608. Online publication date: 27-Feb-2023.
https://doi.org/10.3390/s23052608
Zhang J, Yu Y, Tang S, Wu J, Li W. (2023). Variational Autoencoder with CCA for Audio–Visual Cross-modal Retrieval. ACM Transactions on Multimedia Computing, Communications, and Applications 19:3s, 1-21. Online publication date: 24-Feb-2023.
https://dl.acm.org/doi/10.1145/3575658
Wang K, Ding C, Pang J, Xu X. (2023). Context Sensing Attention Network for Video-based Person Re-identification. ACM Transactions on Multimedia Computing, Communications, and Applications 19:4, 1-20. Online publication date: 27-Feb-2023.
https://dl.acm.org/doi/10.1145/3573203
Ma X, Yang X, Xu C. (2023). Multi-Source Knowledge Reasoning Graph Network for Multi-Modal Commonsense Inference. ACM Transactions on Multimedia Computing, Communications, and Applications 19:4, 1-17. Online publication date: 15-Mar-2023.
https://dl.acm.org/doi/10.1145/3573201
Lei F, Cao Z, Yang Y, Ding Y, Zhang C. (2023). Learning the User’s Deeper Preferences for Multi-modal Recommendation Systems. ACM Transactions on Multimedia Computing, Communications, and Applications 19:3s, 1-18. Online publication date: 24-Feb-2023.
https://dl.acm.org/doi/10.1145/3573010
Wang W, Lin L, Fan Z, Liu J. (2023). Semi-supervised Learning for Mars Imagery Classification and Segmentation. ACM Transactions on Multimedia Computing, Communications, and Applications 19:4, 1-23. Online publication date: 27-Feb-2023.
https://dl.acm.org/doi/10.1145/3572916
Wang J, Mou L, Ma L, Huang T, Gao W. (2023). AMSA: Adaptive Multimodal Learning for Sentiment Analysis. ACM Transactions on Multimedia Computing, Communications, and Applications 19:3s, 1-21. Online publication date: 24-Feb-2023.
https://dl.acm.org/doi/10.1145/3572915