- research-article, June 2024
Mutation‐based data augmentation for software defect prediction
Journal of Software: Evolution and Process (WSMR), Volume 36, Issue 6, https://doi.org/10.1002/smr.2634
Abstract: Software defect prediction (SDP) aims to distinguish between defective and nondefective instances, but the imbalance between these two classes often leads to reduced prediction performance. Conventional SDP approaches use oversampling techniques,...
A novel mutation‐based data augmentation method is proposed, in which data are increased at the code level while preserving its semantic features. The method utilizes the mutation operator for generating mutants to mutate against nondefective instances ...
- research-article, June 2024
A Gaussian–Based WGAN–GP Oversampling Approach for Solving the Class Imbalance Problem
International Journal of Applied Mathematics and Computer Science (IJAMCS), Volume 34, Issue 2, Pages 291–307, https://doi.org/10.61822/amcs-2024-0021
Abstract: In practical applications of machine learning, the class distribution of the collected training set is usually imbalanced, i.e., there is a large difference among the sizes of different classes. The class imbalance problem often hinders the ...
- short-paper, May 2024
Dual Graph Networks with Synthetic Oversampling for Imbalanced Rumor Detection on Social Media
WWW '24: Companion Proceedings of the ACM Web Conference 2024, Pages 750–753, https://doi.org/10.1145/3589335.3651494
Rumor detection is to identify and mitigate potentially damaging falsehoods, thereby shielding the public from misleading information. However, existing methods fall short of tackling class imbalance, meaning rumor is less common than true messages, as ...
- research-article, January 2024
Severity classification of software code smells using machine learning techniques: A comparative study
Journal of Software: Evolution and Process (WSMR), Volume 36, Issue 1, https://doi.org/10.1002/smr.2454
Abstract: Code smell is a software characteristic that indicates bad symptoms in code design which causes problems related to software quality. The severity of code smells must be measured because it will help the developers when determining the priority ...
Proposed framework to detect the severity of code smells depends on several machine learning models. LIME algorithm was further used to explain the machine learning model's predictions and interpretability. Prediction rules have been generated by PART ...
- research-article, January 2024
Advancing automated social engineering detection with oversampling-based machine learning
International Journal of Security and Networks (IJSN), Volume 19, Issue 3, Pages 150–158, https://doi.org/10.1504/ijsn.2024.141783
Social engineering attacks have surged with the increased reliance on online interactions. However, detecting these subtle deceptions remains challenging. This study proposes a novel machine learning approach to enhance social engineering attack ...
- research-article, January 2024
Whale Optimization-based Synthetic Minority Oversampling Technique for Binary Imbalanced Datasets
Procedia Computer Science (PROCS), Volume 235, Issue C, Pages 250–263, https://doi.org/10.1016/j.procs.2024.04.027
Abstract: The problem of class imbalance has become a predominant area of research recently. Synthetic Minority Oversampling Technique (SMOTE) stands as a popular and widely adopted oversampling technique that effectively addresses the challenge of class ...
- research-article, August 2023
DZ-SMS: An Authentic Corpus of Algerian SMS
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), Volume 22, Issue 8, Article No.: 217, Pages 1–21, https://doi.org/10.1145/3610522
In this article, a complete methodology of a corpus realization of authentic Short Message Service (SMS) from Algerian dialect and which are transcribed in Latin characters or symbols is presented. A linguistic material constituted by 6,000 SMS coming ...
- research-article, August 2023
Small Data, Big Challenges: Pitfalls and Strategies for Machine Learning in Fatigue Detection
PETRA '23: Proceedings of the 16th International Conference on PErvasive Technologies Related to Assistive Environments, Pages 364–373, https://doi.org/10.1145/3594806.3594825
This research addresses the pitfalls and strategies for machine learning with small data sets in the context of sensor-based fatigue detection. It is shown that many existing studies in this area rely on small data sets and that classification results ...
- Article, September 2023
SIA-SMOTE: A SMOTE-Based Oversampling Method with Better Interpolation on High-Dimensional Data by Using a Siamese Network
Advances in Computational Intelligence, Pages 448–460, https://doi.org/10.1007/978-3-031-43085-5_35
Abstract: SMOTE is an effective method for balancing imbalanced datasets by interpolating between existing samples in the minority class. However, if the synthetic samples generated through interpolation are based on noisy data points, then they may also be ...
- research-article, January 2023
Enhanced diversity scheme for orthogonal frequency division multiplexing systems over doubly selective fading channels
IET Communications (CMU2), Volume 17, Issue 6, Pages 695–703, https://doi.org/10.1049/cmu2.12573
Abstract: Frequency‐time selective fading degrades the performance of communication systems, but it also provides an opportunity to collect multipath diversity and Doppler diversity. In this paper, an oversampled grouped‐linear‐constellation‐precoding (GLCP)...
Under doubly selective fading channels, the OFDM system can obtain more multipath‐Doppler diversity gains through oversampling and GLCP coding techniques.
- research-article, January 2023
A clustered borderline synthetic minority over-sampling technique for balancing quick access recorder data
Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology (JIFS), Volume 45, Issue 4, Pages 6849–6862, https://doi.org/10.3233/JIFS-233548
Most of the flight accident data have uneven distribution of categories. When the traditional classifier is applied to this data, it will pay less attention to the minority class data. Synthetic Minority Over-sampling Technique (SMOTE), and its ...
- research-article, January 2023
A novel stacking framework with PSO optimized SVM for effective disease classification
Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology (JIFS), Volume 45, Issue 3, Pages 4105–4123, https://doi.org/10.3233/JIFS-232268
Disease diagnosis is very important in the medical field. It is essential to diagnose chronic diseases such as diabetes, heart disease, cancer, and kidney diseases in the early stage. In recent times, ensembled-based approaches giving effective ...
- research-article, January 2023
Evolutionary algorithms based on oversampling techniques for enhancing the imbalanced credit card fraud detection
Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology (JIFS), Volume 44, Issue 6, Pages 10311–10323, https://doi.org/10.3233/JIFS-222344
Online services have advanced to the point where they have made our lives much easier, but many problems should be solved to make these services safer for consumers. Numerous transactions are conducted daily, and much personal information is published and ...
- research-article, January 2023
HSNF: Hybrid sampling with two-step noise filtering for imbalanced data classification
Intelligent Data Analysis (INDA), Volume 27, Issue 6, Pages 1573–1593, https://doi.org/10.3233/IDA-227111
Imbalanced data classification has received much attention in machine learning, and many oversampling methods exist to solve this problem. However, these methods may suffer from insufficient noise filtering, overlap between synthetic and original samples,...
- research-article, January 2023
Clustering-based improved adaptive synthetic minority oversampling technique for imbalanced data classification
Intelligent Data Analysis (INDA), Volume 27, Issue 3, Pages 635–652, https://doi.org/10.3233/IDA-226612
Synthetic Minority Oversampling Technique (SMOTE) and some extensions based on it are popularly used to balance imbalanced data. In this study, we concentrate on solving overfitting of the classification model caused by choosing instances to oversample ...
- research-article, January 2023
Anomaly detection and oversampling approach for classifying imbalanced data using CLUBS technique in IoT healthcare data
International Journal of Intelligent Engineering Informatics (IJIEI), Volume 11, Issue 3, Pages 255–271, https://doi.org/10.1504/ijiei.2023.133074
Multiple data streams from sensing devices in intelligent settings have improved life quality thanks to the internet of things (IoT). Anomalies and imbalanced data sources are unavoidable due to system complexity and IoT device rollout issues. An ...
- research-article, January 2023
Intrusion detection system using resampled dataset - a comparative study
International Journal of Ad Hoc and Ubiquitous Computing (IJAHUC), Volume 42, Issue 4, Pages 243–257, https://doi.org/10.1504/ijahuc.2023.130464
Existing machine-learning research aims to improve the predictive capability of datasets using various feature selection and classification models. In the intrusion detection, data consists of normal data and a minimal number of attack data. This data ...
- research-article, January 2023
Improving Batik Pattern Classification using CNN with Advanced Augmentation and Oversampling on Imbalanced Dataset
- Beatrice Josephine Filia,
- Filbert Fernandes Lienardy,
- I Kadek Perry Bagus Laksana,
- Jayasidhi Ariyo Jordan,
- Joyceline Graciella Siento,
- Shilvia Meidhi Honova,
- Silviya Hasana,
- Ivan Halim Permonangan
Procedia Computer Science (PROCS), Volume 227, Issue C, Pages 508–517, https://doi.org/10.1016/j.procs.2023.10.552
Abstract: In image classification task, imbalanced dataset is a problem that often occurs. Batik pattern data also suffers this problem, mainly because of the poor quality of available images and rarity of certain patterns. In this research, we employed a ...
- research-article, January 2023
A Factor Based Multiple Imputation Approach to Handle Class Imbalance
Procedia Computer Science (PROCS), Volume 218, Issue C, Pages 103–112, https://doi.org/10.1016/j.procs.2022.12.406
Abstract: Class imbalance and incompleteness are the two most serious problems faced in data science and machine learning when working on real-life datasets. Both of these cases have severe implications on the ability of classification algorithms to make ...
- research-article, January 2023
Sentiment analysis of Indonesian police chief using multi-level ensemble model
Procedia Computer Science (PROCS), Volume 216, Issue C, Pages 620–629, https://doi.org/10.1016/j.procs.2022.12.177
Abstract: Sentiment analysis is a technique of analyzing text to classify its emotion into positive, negative, or neutral sentiments. The main purpose of this study is to use sentiment analysis to seek Indonesian opinions about the chief of the Indonesian ...