An Effective Networks Intrusion Detection Approach Based on Hybrid Harris Hawks and Multi-Layer Perceptron

Authors: Moutaz Alazab, Ruba Abu Khurma, Pedro A. Castillo, Bilal Abu-Salih, Alejandro Martin, David Camacho

Abstract: This paper proposes an Intrusion Detection System (IDS) employing the Harris Hawks Optimization algorithm (HHO) to optimize Multilayer Perceptron learning by optimizing bias and weight parameters. HHO-MLP aims to select optimal parameters in its learning process to minimize intrusion detection errors in networks. HHO-MLP has been implemented using the EvoloPy NN framework, an open-source Python tool specialized for training MLPs using evolutionary algorithms. For purposes of comparing the HHO model against other evolutionary methodologies currently available, specificity and sensitivity measures, accuracy measures, and MSE and RMSE measures have been calculated using KDD datasets. Experiments have demonstrated that the HHO-MLP method is effective at identifying malicious patterns. HHO-MLP has been tested against evolutionary algorithms like Butterfly Optimization Algorithm (BOA), Grasshopper Optimization Algorithms (GOA), and Black Widow Optimizations (BOW), with validation by Random Forest (RF) and XG-Boost. HHO-MLP showed superior performance by attaining top scores with an accuracy rate of 93.17%, a sensitivity level of 89.25%, and a specificity percentage of 95.41%.

Submitted 21 February, 2024; originally announced February 2024.

arXiv:2207.10654 [pdf]

Emotion detection of social data: APIs comparative study

Authors: Bilal Abu-Salih, Mohammad Alhabashneh, Dengya Zhu, Albara Awajan, Yazan Alshamaileh, Bashar Al-Shboul, Mohammad Alshraideh

Abstract: The development of emotion detection technology has emerged as a highly valuable possibility in the corporate sector due to the nearly limitless uses of this new discipline, particularly with the unceasing propagation of social data. In recent years, the electronic marketplace has witnessed the establishment of a large number of start-up businesses with an almost sole focus on building new commercial and open-source tools and APIs for emotion detection and recognition. Yet, these tools and APIs must be continuously reviewed and evaluated, and their performances should be reported and discussed. There is a lack of research to empirically compare current emotion detection technologies in terms of the results obtained from each model using the same textual dataset. Also, there is a lack of comparative studies that apply benchmark comparison to social data. This study compares eight technologies: IBM Watson NLU, ParallelDots, Symanto-Ekman, Crystalfeel, Text to Emotion, Senpy, Textprobe, and NLP Cloud. The comparison was undertaken using two different datasets. The emotions from the chosen datasets were then derived using the incorporated APIs. The performance of these APIs was assessed using the aggregated scores that they delivered as well as the theoretically proven evaluation metrics such as the micro-average of accuracy, classification error, precision, recall, and F1-score. Lastly, the assessment of these APIs incorporating the evaluation measures is reported and discussed.

Submitted 8 July, 2022; originally announced July 2022.

arXiv:2207.03771 [pdf]

Healthcare Knowledge Graph Construction: State-of-the-art, open issues, and opportunities

Authors: Bilal Abu-Salih, Muhammad AL-Qurishi, Mohammed Alweshah, Mohammad AL-Smadi, Reem Alfayez, Heba Saadeh

Abstract: The incorporation of data analytics in the healthcare industry has made significant progress, driven by the demand for efficient and effective big data analytics solutions. Knowledge graphs (KGs) have proven utility in this arena and are rooted in a number of healthcare applications to furnish better data representation and knowledge inference. However, in conjunction with a lack of a representative KG construction taxonomy, several existing approaches in this designated domain are inadequate and inferior. This paper is the first to provide a comprehensive taxonomy and a bird's eye view of healthcare KG construction. Additionally, a thorough examination of the current state-of-the-art techniques drawn from academic works relevant to various healthcare contexts is carried out. These techniques are critically evaluated in terms of methods used for knowledge extraction, types of the knowledge base and sources, and the incorporated evaluation protocols. Finally, several research findings and existing issues in the literature are reported and discussed, opening horizons for future research in this vibrant area.

Submitted 8 July, 2022; originally announced July 2022.

arXiv:2201.05203 [pdf]

An Intelligent System for Multi-topic Social Spam Detection in Microblogging

Authors: Bilal Abu-Salih, Dana Al Qudah, Malak Al-Hassan, Seyed Mohssen Ghafari, Tomayess Issa, Ibrahim Aljarah, Amin Beheshti, Sulaiman Alqahtan

Abstract: The communication revolution has perpetually reshaped the means through which people send and receive information. Social media is an important pillar of this revolution and has brought profound changes to various aspects of our lives. However, the open environment and popularity of these platforms inaugurate windows of opportunities for various cyber threats, thus social networks have become a fertile venue for spammers and other illegitimate users to execute their malicious activities. These activities include phishing hot and trendy topics and posting a wide range of contents in many topics. Hence, it is crucial to continuously introduce new techniques and approaches to detect and stop this category of users. This paper proposes a novel and effective approach to detect social spammers. An investigation into several attributes to measure topic-dependent and topic-independent users' behaviours on Twitter is carried out. The experiments of this study are undertaken on various machine learning classifiers. The performance of these classifiers is compared and their effectiveness is measured via a number of robust evaluation measures. Further, the proposed approach is benchmarked against state-of-the-art social spam and anomalous detection techniques. These experiments report the effectiveness and utility of the proposed approach and embedded modules.

Submitted 13 January, 2022; originally announced January 2022.

arXiv:2105.03239 [pdf]

doi 10.1007/978-981-33-6652-7_4

Semantic data discovery from Social Big Data

Authors: Bilal Abu-Salih, Pornpit Wongthongtham, Dengya Zhu, Kit Yan Chan, Amit Rudra

Abstract: Due to the large volume of data and information generated by a multitude of social data sources, it is a huge challenge to manage and extract useful knowledge, especially given the different forms of data, streaming data and uncertainty and ambiguity of data. Hence, there are still challenges in this area of BD analytics research to capture, store, process, visualise, query, and manipulate datasets to derive meaningful information that is specific to an application's domain. This chapter attempts to address this problem by studying Semantic Analytics and domain knowledge modelling, and to what extent these technologies can be utilised toward a better understanding of social textual content. In particular, the chapter gives an overview of semantic analysis and domain ontology followed by shedding light on domain knowledge modelling, inference, semantic storage, and publicly available semantic tools and APIs. Also, the theoretical notion of Knowledge Graphs is reported and their interlinking with SBD is discussed. The utility of the semantic analytics is demonstrated and evaluated through a case study on social data in the context of the politics domain.

Submitted 21 April, 2021; originally announced May 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:1801.01624

arXiv:2104.12591 [pdf]

doi 10.1007/978-981-33-6652-7_5

Predictive analytics using Social Big Data and machine learning

Authors: Bilal Abu-Salih, Pornpit Wongthongtham, Dengya Zhu, Kit Yan Chan, Amit Rudra

Abstract: The ever-increasing quality and quantity of data generated from day-to-day business operations, in conjunction with the continuously imported related social data, have made the traditional statistical approaches inadequate to tackle such data floods. This has driven researchers to design and develop advanced and sophisticated analytics that can be incorporated to gain valuable insights that benefit the business domain. This chapter sheds light on core aspects that lay the foundations for social big data analytics. In particular, the significance of predictive analytics in the context of SBD is discussed, together with a framework for SBD predictive analytics. Then, various predictive analytical algorithms are introduced with their usage in several important applications and top-tier tools and APIs. A case study on applying predictive analytics to social data is provided, supported with experiments to substantiate the significance and utility of predictive analytics.

Submitted 21 April, 2021; originally announced April 2021.

arXiv:2104.09190 [pdf]

doi 10.1007/978-981-33-6652-7_3

Credibility Analysis in Social Big Data

Authors: Bilal Abu-Salih, Pornpit Wongthongtham, Dengya Zhu, Kit Yan Chan, Amit Rudra

Abstract: The concept of social trust has attracted the attention of information processors/data scientists and information consumers/business firms. One of the main reasons for acquiring the value of SBD is to provide frameworks and methodologies using which the credibility of online social services users can be evaluated. These approaches should be scalable to accommodate large-scale social data. Hence, there is a need for a sound understanding of social trust to improve and expand the analysis process and infer the credibility of social big data. Given the exposed environment's settings and fewer limitations related to online social services, the medium allows legitimate and genuine users as well as spammers and other low trustworthy users to publish and spread their content. This chapter presents an overview of the notion of credibility in the context of SBD. It also lists an array of approaches to measure and evaluate the trustworthiness of users and their contents. Finally, a case study is presented that incorporates semantic analysis and machine learning modules to measure and predict users' trustworthiness in numerous domains in different time periods. The evaluation of the conducted experiment validates the applicability of the incorporated machine learning techniques to predict highly trustworthy domain-based users.

Submitted 19 April, 2021; originally announced April 2021.

arXiv:2104.08062 [pdf]

doi 10.1007/978-981-33-6652-7_2

Introduction to Big data Technology

Authors: Bilal Abu-Salih, Pornpit Wongthongtham, Dengya Zhu, Kit Yan Chan, Amit Rudra

Abstract: Big data is no longer "all just hype" but is widely applied in nearly all aspects of our business, governments, and organizations with the technology stack of AI. Its influence goes far beyond a simple technical innovation and involves all areas in the world. This chapter first gives a historical review of big data, followed by a discussion of the characteristics of big data, i.e. from the 3V's up to the 10V's of big data. The chapter then introduces technology stacks for an organization to build a big data application, from infrastructure/platform/ecosystem to constructional units and components. Finally, we provide some big data online resources for reference.

Submitted 15 April, 2021; originally announced April 2021.

arXiv:2104.03904 [pdf]

doi 10.1007/978-981-33-6652-7_1

Social Big Data: An Overview and Applications

Authors: Bilal Abu-Salih, Pornpit Wongthongtham, Dengya Zhu, Kit Yan Chan, Amit Rudra

Abstract: The emergence of online social media services has made a qualitative leap and brought profound changes to various aspects of human, cultural, intellectual, and social life. These significant Big data tributaries have further transformed business processes by establishing convergent and transparent dialogues between businesses and their customers. Therefore, analysing the flow of social data content is necessary in order to enhance business practices, to augment brand awareness, to develop insights on target markets, to detect and identify positive and negative customer sentiments, etc., thereby achieving the hoped-for added value. This chapter presents an overview of the term Social Big Data and its definition. This chapter also lays the foundation for several applications and analytics that are broadly discussed in this book.

Submitted 1 April, 2021; originally announced April 2021.

arXiv:2011.00235 [pdf]

Domain-specific Knowledge Graphs: A survey

Authors: Bilal Abu-Salih

Abstract: Knowledge Graphs (KGs) have made a qualitative leap and effected a real revolution in knowledge representation. This is leveraged by the underlying structure of the KG which underpins a better comprehension, reasoning and interpretation of knowledge for both human and machine. Therefore, KGs continue to be used as the main means of tackling a plethora of real-life problems in various domains. However, there is no consensus in regard to a plausible and inclusive definition of a domain-specific KG. Further, in conjunction with several limitations and deficiencies, various domain-specific KG construction approaches are far from perfect. This survey is the first to offer a comprehensive definition of a domain-specific KG. Also, the paper presents a thorough review of the state-of-the-art approaches drawn from academic works relevant to seven domains of knowledge. An examination of current approaches reveals a range of limitations and deficiencies. At the same time, uncharted territories on the research map are highlighted to tackle extant issues in the literature and point to directions for future research.

Submitted 3 March, 2021; v1 submitted 31 October, 2020; originally announced November 2020.

arXiv:2006.01626 [pdf]

Relational Learning Analysis of Social Politics using Knowledge Graph Embedding

Authors: Bilal Abu-Salih, Marwan Al-Tawil, Ibrahim Aljarah, Hossam Faris, Pornpit Wongthongtham

Abstract: Knowledge Graphs (KGs) have gained considerable attention recently from both academia and industry. In fact, incorporating graph technology and the copious supply of various graph datasets have led the research community to build sophisticated graph analytics tools. Therefore, the application of KGs has extended to tackle a plethora of real-life problems in dissimilar domains. Despite the abundance of the currently proliferated generic KGs, there is a vital need to construct domain-specific KGs. Further, quality and credibility should be assimilated in the process of constructing and augmenting KGs, particularly those propagated from mixed-quality resources such as social media data. This paper presents a novel credibility domain-based KG Embedding framework. This framework involves capturing a fusion of data obtained from heterogeneous resources into a formal KG representation depicted by a domain ontology. The proposed approach makes use of various knowledge-based repositories to enrich the semantics of the textual contents, thereby facilitating the interoperability of information. The proposed framework also embodies a credibility module to ensure data quality and trustworthiness. The constructed KG is then embedded in a low-dimension semantically-continuous space using several embedding techniques. The utility of the constructed KG and its embeddings is demonstrated and substantiated on link prediction, clustering, and visualisation tasks.

Submitted 2 June, 2020; originally announced June 2020.

arXiv:2001.07838 [pdf]

An Approach for Time-aware Domain-based Social Influence Prediction

Authors: Bilal Abu-Salih, Kit Yan Chan, Omar Al-Kadi, Marwan Al-Tawil, Pornpit Wongthongtham, Tomayess Issa, Heba Saadeh, Malak Al-Hassan, Bushra Bremie, Abdulaziz Albahlal

Abstract: Online Social Networks (OSNs) have established virtual platforms enabling people to express their opinions, interests and thoughts in a variety of contexts and domains, allowing legitimate users as well as spammers and other untrustworthy users to publish and spread their content. Hence, the concept of social trust has attracted the attention of information processors/data scientists and information consumers/business firms. One of the main reasons for acquiring the value of Social Big Data (SBD) is to provide frameworks and methodologies using which the credibility of OSN users can be evaluated. These approaches should be scalable to accommodate large-scale social data. Hence, there is a need for a sound understanding of social trust to improve and expand the analysis process and infer the credibility of SBD. Given the exposed environment's settings and fewer limitations related to OSNs, the medium allows legitimate and genuine users as well as spammers and other low trustworthy users to publish and spread their content. Hence, this paper presents an approach that incorporates semantic analysis and machine learning modules to measure and predict users' trustworthiness in numerous domains in different time periods. The evaluation of the conducted experiment validates the applicability of the incorporated machine learning techniques to predict highly trustworthy domain-based users.

Submitted 19 January, 2020; originally announced January 2020.

arXiv:1909.03733 [pdf]

Toward a Knowledge-based Personalised Recommender System for Mobile App Development

Authors: Bilal Abu-Salih, Hamad Alsawalqah, Basima Elshqeirat, Tomayess Issa, Pornpit Wongthongtham

Abstract: Over the last few years, the arena of mobile application development has expanded considerably beyond the balance of the world's software markets. With the growing number of mobile software companies, and the mounting sophistication of smartphones' technology, developers have been building several categories of applications on dissimilar platforms. However, developers confront several challenges through the implementation of mobile application projects. In particular, there is a lack of consolidated systems that are competent to provide developers with personalised services promptly and efficiently. Hence, it is essential to develop tailored systems which can recommend appropriate tools, IDEs, platforms, software components and other correlated artifacts to mobile application developers. This paper proposes a new recommender system framework comprising a fortified set of techniques that are designed to provide mobile app developers with a distinctive platform to browse and search for the personalised artifacts. The proposed system makes use of ontology and semantic web technology as well as machine learning techniques.
In particular, the new RS framework comprises the following components: (i) domain knowledge inference module: including various semantic web technologies and lightweight ontologies; (ii) profiling and preferencing: a new proposed time-aware multidimensional user modelling; (iii) query expansion: to improve and enhance the retrieved results by semantically augmenting users' query; and (iv) recommendation and information filtration: to make use of the aforementioned components to provide personalised services to the designated users and to answer a user's query with the minimum mismatches.

Submitted 25 December, 2020; v1 submitted 9 September, 2019; originally announced September 2019.

arXiv:1907.11934 [pdf]

Unlocking Social Media and User Generated Content as a Data Source for Knowledge Management

Authors: James Meneghello, Nik Thompson, Kevin Lee, Kok Wai Wong, Bilal Abu-Salih

Abstract: The pervasiveness of Social Media and user-generated content (UGC) has triggered an exponential increase in global data volumes. However, due to collection and extraction challenges, data in many feeds, embedded comments, reviews and testimonials are inaccessible as a generic data source. This paper incorporates a Knowledge Management framework as a paradigm for knowledge management and data value extraction. This framework embodies solutions to unlock the potential of UGC as a rich, real-time data source for analytical applications. The contributions described in this paper are threefold. Firstly, a method for automatically navigating pagination systems to expose UGC for collection is presented. This is evaluated using browser emulation integrated with dynamic data collection. Secondly, a new method for collecting social data without any a priori knowledge of the sites is introduced. Finally, a new testbed is developed to reflect the current state of internet sites and shared publicly to encourage future research. The discussion benchmarks the new algorithm alongside existing data extraction techniques and provides evidence of the increased amount of UGC data made accessible by the new algorithm.

Submitted 2 September, 2019; v1 submitted 27 July, 2019; originally announced July 2019.

arXiv:1902.10402 [pdf]

Social Credibility Incorporating Semantic Analysis and Machine Learning: A Survey of the State-of-the-Art and Future Research Directions

Authors: Bilal Abu-Salih, Bushra Bremie, Pornpit Wongthongtham, Kevin Duan, Tomayess Issa, Kit Yan Chan, Mohammad Alhabashneh, Teshreen Albtoush, Sulaiman Alqahtani, Abdullah Alqahtani, Muteeb Alahmari, Naser Alshareef, Abdulaziz Albahlal

Abstract: The wealth of Social Big Data (SBD) represents a unique opportunity for organisations to make extensive use of such data abundance to increase their revenues. Hence, there is an imperative need to capture, load, store, process, analyse, transform, interpret, and visualise such manifold social datasets to develop meaningful insights that are specific to an application domain. This paper lays the theoretical background by introducing the state-of-the-art literature review of the research topic. This is associated with a critical evaluation of the current approaches, and fortified with certain recommendations indicated to bridge the research gap.

Submitted 27 February, 2019; originally announced February 2019.

arXiv:1808.01413 [pdf]

CredSaT: Credibility Ranking of Users in Big Social Data incorporating Semantic Analysis and Temporal Factor

Authors: Bilal Abu-Salih, P. Wongthongtham, KY Chan, Z. Dengya

Abstract: The widespread use of big social data has pointed the research community in several significant directions. In particular, the notion of social trust has attracted a great deal of attention from information processors/computer scientists and information consumers/formal organizations. This is evident in various applications such as recommendation systems, viral marketing and expertise retrieval. Hence, it is essential to have frameworks that can temporally measure users' credibility in all domains categorised under big social data. This paper presents CredSaT (Credibility incorporating Semantic analysis and Temporal factor): a fine-grained users' credibility analysis framework for big social data. A novel metric that includes both new and current features, as well as the temporal factor, is harnessed to establish the credibility ranking of users. Experiments on a real-world dataset demonstrate the effectiveness and applicability of our model to indicate highly domain-based trustworthy users. Further, CredSaT shows the capacity to capture spammers and other anomalous users.

Submitted 3 August, 2018; originally announced August 2018.

arXiv:1801.03627 [pdf]

Applying Vector Space Model (VSM) Techniques in Information Retrieval for Arabic Language

Authors: Bilal Abu-Salih

Abstract: Information Retrieval (IR) allows the storage, management, processing and retrieval of information, documents, websites, etc. Building an IR system for any language is imperative. This is evident through the massive efforts conducted to build IR systems using any of its models that are valid for certain languages. This report presents an implementation of a core IR technique, the Vector Space Model (VSM). We have chosen the VSM for our project since it is a term weighting scheme, and the retrieved documents could be sorted according to their relevancy degree. One other significant feature of this technique is the ability to get relevance feedback from the users of the system; users can judge whether the retrieved document is relevant to their need or not. The developed system has been validated through building an Arabic IR website using server side scripting. The experiments verify the effectiveness of our system in applying all techniques of the vector space model and its validity for the Arabic language.

Submitted 14 January, 2018; v1 submitted 10 January, 2018; originally announced January 2018.

arXiv:1801.01624 [pdf]

Ontology-based Approach for Identifying the Credibility Domain in Social Big Data

Authors: Pornpit Wongthongtham, Bilal Abu-Salih

Abstract: The challenge of managing and extracting useful knowledge from social media data sources has attracted much attention from academia and industry. To address this challenge, this paper focuses on semantic analysis of textual data. We propose an ontology-based approach to extract semantics of textual data and define the domain of data. In other words, we semantically analyse the social data at two levels, i.e. the entity level and the domain level. We have chosen Twitter as the social channel for the purpose of proof of concept. Domain knowledge is captured in ontologies which are then used to enrich the semantics of tweets provided with specific semantic conceptual representation of entities that appear in the tweets. Case studies are used to demonstrate this approach. We experiment and evaluate our proposed approach with a public dataset collected from Twitter and from the politics domain. The ontology-based approach leverages entity extraction and concept mappings in terms of quantity and accuracy of concept identification.

Submitted 6 July, 2018; v1 submitted 4 January, 2018; originally announced January 2018.

Showing 1–18 of 18 results for author: Abu-Salih, B