skip to main content
research-article

Multi-Channel Embedding Convolutional Neural Network Model for Arabic Sentiment Classification

Published: 21 May 2019 Publication History

Abstract

With the advent of social network services, Arabs’ opinions on the web have attracted many researchers in recent years toward detecting and classifying sentiments in Arabic tweets and reviews. However, the impact of word embeddings vectors (WEVs) initialization and dataset balance on Arabic sentiment classification using deep learning has not been thoroughly studied. In this article, a multi-channel embedding convolutional neural network (MCE-CNN) is proposed to improve Arabic sentiment classification by learning sentiment features from different text domains, word, and character n-grams levels. MCE-CNN encodes a combination of different pre-trained word embeddings into the embedding block at each embedding channel and trains these channels in parallel. Besides, a separate feature extraction module implemented in a CNN block is used to extract more relevant sentiment features. These channels and blocks help to start training on high-quality WEVs and fine-tuning them. The performance of MCE-CNN is evaluated on several standard balanced and imbalanced datasets to reflect real-world use cases. Experimental results show that MCE-CNN provides a high classification accuracy and benefits from the second embedding channel on both standard Arabic and dialectal Arabic text, which outperforms state-of-the-art methods.

References

[1]
Nawaf A. Abdulla, Nizar A. Ahmed, Mohammad A. Shehab, and Mahmoud Al-Ayyoub. 2013. Arabic sentiment analysis: Lexicon-based and corpus-based. In Proceedings of the 2013 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT). 1--6.
[2]
Sadam Al-Azani and El-Sayed M. El-Alfy. 2017a. Hybrid deep learning for sentiment polarity determination of Arabic microblogs. In Proceedings of the International Conference on Neural Information Processing. Springer, 491--500.
[3]
Sadam Al-Azani and El-Sayed M. El-Alfy. 2017b. Using word embedding and ensemble learning for highly imbalanced data sentiment analysis in short Arabic text. Procedia Comput. Sci. 109 (2017), 359--366.
[4]
Rami Al-Rfou, Bryan Perozzi, and Steven Skiena. 2013. Polyglot: Distributed word representations for multilingual NLP. arXiv preprint arXiv:1307.1662 (2013).
[5]
Ahmad Al-Sallab, Ramy Baly, Hazem Hajj, Khaled Bashir Shaban, Wassim El-Hajj, and Gilbert Badaro. 2017. AROMA: A recursive deep learning model for opinion mining in Arabic as a low resource language. ACM Trans. Asian Low-Resour. Lang. Inf. Process. (TALLIP) 16, 4 (2017), 25.
[6]
Ahmad Al Sallab, Hazem Hajj, Gilbert Badaro, Ramy Baly, Wassim El Hajj, and Khaled Bashir Shaban. 2015. Deep learning models for sentiment analysis in Arabic. In Proceedings of the 2nd Workshop on Arabic Natural Language Processing. 9--17.
[7]
Mohammad Al-Smadi, Omar Qawasmeh, Mahmoud Al-Ayyoub, Yaser Jararweh, and Brij Gupta. 2018. Deep recurrent neural network vs. support vector machine for aspect-based sentiment analysis of Arabic hotels’ reviews. J. Comput. Sci. 27 (2018), 386--393.
[8]
Abdulaziz M. Alayba, Vasile Palade, Matthew England, and Rahat Iqbal. 2017. Arabic language sentiment analysis on health services. In Proceedings of the 1st International Workshop on Arabic Script Analysis and Recognition (ASAR). 114--118.
[9]
Khaled Mohammad Alomari, Hatem M. ElSherif, and Khaled Shaalan. 2017. Arabic tweets sentimental analysis using machine learning. In Proceedings of the International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems. Springer, 602--610.
[10]
A. Aziz Altowayan and Lixin Tao. 2016. Word embeddings for Arabic sentiment analysis. In IEEE International Conference on Big Data. 3820--3825.
[11]
Mohamed Aly and Amir Atiya. 2013. LABR: A large scale Arabic book reviews dataset. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Vol. 2. 494--498.
[12]
Tressy Arts, Yonatan Belinkov, Nizar Habash, Adam Kilgarriff, and Vit Suchomel. 2014. arTenTen: Arabic corpus and word sketches. J. King Saud Univ. Comput. Inf. Sci. 26, 4 (2014), 357--371.
[13]
Ramy Baly, Hazem Hajj, Nizar Habash, Khaled Bashir Shaban, and Wassim El-Hajj. 2017. A sentiment treebank and morphologically enriched recursive deep models for effective sentiment analysis in Arabic. ACM Trans. Asian Low-Resour. Lang. Inf. Process. (TALLIP) 16, 4 (2017), 23.
[14]
Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2016. Enriching word vectors with subword information. arXiv preprint arXiv:1607.04606 (2016).
[15]
Emil St. Chifu, Tiberiu St. Letia, and Viorica R. Chifu. 2015. Unsupervised aspect level sentiment analysis using self-organizing maps. In Proceedings of the 2015 17th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC). 468--475.
[16]
Abdelghani Dahou, Shengwu Xiong, Junwei Zhou, Mohamed Houcine Haddoud, and Pengfei Duan. 2016. Word embeddings and convolutional neural network for Arabic sentiment classification. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics. 2418--2427.
[17]
Samhaa R. El-Beltagy, Mona El Kalamawy, and Abu Bakr Soliman. 2017. NileTMRG at SemEval-2017 task 4: Arabic sentiment analysis. arXiv preprint arXiv:1710.08458 (2017).
[18]
Mohammed Elrazzaz, Shady Elbassuoni, Khaled Shaban, and Chadi Helwe. 2017. Methodical evaluation of Arabic word embeddings. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vol. 2. 454--458.
[19]
Hady ElSahar and Samhaa R. El-Beltagy. 2015. Building large Arabic multi-domain resources for sentiment analysis. In Computational Linguistics and Intelligent Text Processing. 23--34.
[20]
Nikos Engonopoulos, Angeliki Lazaridou, Georgios Paliouras, and Konstantinos Chandrinos. 2011. ELS: A word-level method for entity-level sentiment analysis. In Proceedings of the International Conference on Web Intelligence, Mining and Semantics. 1--9.
[21]
Nizar Y. Habash. 2010. Introduction to Arabic natural language processing. Synth. Lect. Hum. Lang. Technol. 3, 1 (2010), 1--187.
[22]
Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan R. Salakhutdinov. 2012. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 (2012), 1--18.
[23]
Yoon Kim. 2014. Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014), 1--6.
[24]
Diederik Kingma and Jimmy Ba. 2014. ADAM: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
[25]
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013a. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013), 1--9.
[26]
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. 2013b. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems. 3111--3119.
[27]
AL-Smadi Mohammad, Omar Qawasmeh, Mahmoud Al-Ayyoub, Yaser Jararweh, and Brij Gupta. 2018. Deep recurrent neural network vs. Dupport vector machine for aspect-based sentiment analysis of Arabic hotels reviews. J. Comput. Sci. 27 (2018), 386--393.
[28]
Saif M. Mohammad, Mohammad Salameh, and Svetlana Kiritchenko. 2016. How translation alters sentiment. J. Artif. Intell. Res. 55 (2016), 95--130.
[29]
Mahmoud Nabil, Mohamed Aly, and Amir Atiya. 2015. ASTD: Arabic sentiment tweets dataset. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 2515--2519.
[30]
Vinod Nair and Geoffrey E. Hinton. 2010. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML'10). 807--814.
[31]
Alexis Amid Neme and Eric Laporte. 2013. Pattern-and-root inflectional morphology: The Arabic broken plural. Lang. Sci. 40 (2013), 221--250.
[32]
Bo Pang and Lillian Lee. 2005. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics. ACL, 115--124.
[33]
Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 1532--1543.
[34]
Abdullateef M. Rabab’Ah, Mahmoud Al-Ayyoub, Yaser Jararweh, and Mohammed N. Al-Kabi. 2016. Evaluating sentiStrength for Arabic sentiment analysis. In International Conference on Computer Science and Information Technology. 1--6.
[35]
Abu Bakr Soliman, Kareem Eissa, and Samhaa R. El-Beltagy. 2017. AraVec: A set of Arabic word embedding mdels for use in Arabic NLP. Procedia Comput. Sci. 117 (2017), 256--265.
[36]
Jin Wang, Zhongyuan Wang, Dawei Zhang, and Jun Yan. 2017. Combining knowledge with deep convolutional neural networks for short text classification. In Proceedings of IJCAI, Vol. 350.
[37]
Ainur Yessenalina, Yisong Yue, and Claire Cardie. 2010. Multi-level structured models for document-level sentiment classification. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. ACL, 1046--1056.
[38]
Mohamed A. Zahran, Ahmed Magooda, Ashraf Y. Mahgoub, Hazem Raafat, Mohsen Rashwan, and Amir Atyia. 2015. Word representations in vector space and their applications for Arabic. In Computational Linguistics and Intelligent Text Processing. 430--443.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Asian and Low-Resource Language Information Processing
ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 18, Issue 4
December 2019
305 pages
ISSN:2375-4699
EISSN:2375-4702
DOI:10.1145/3327969
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 May 2019
Accepted: 01 February 2019
Revised: 01 December 2018
Received: 01 April 2018
Published in TALLIP Volume 18, Issue 4

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Arabic language
  2. Arabic sentiment classification
  3. Arabic word embeddings
  4. convolutional neural network
  5. deep learning
  6. multi-channel
  7. neural language models

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Hubei Provincial Natural Science Foundation of China
  • Key Technical Innovation Project of Hubei Provence of China
  • National Key Research and Development Program of China
  • Defense Industrial Technology Development Program
  • National Natural Science Foundation of China

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)14
  • Downloads (Last 6 weeks)2
Reflects downloads up to 13 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media