Enhanced Bug Priority Prediction via Priority-Sensitive Long Short-Term Memory–Attention Mechanism

Yang, Geunseok; Ji, Jinfeng; Kim, Jaehee

doi:10.3390/app15020633

by Geunseok Yang ¹,*, Jinfeng Ji ² and Jaehee Kim ³

¹ Department of Computer Applied Mathematics (Computer System Institute), Hankyong National University, Anseong-si 17579, Republic of Korea

² Department of Computer Applied Mathematics, Hankyong National University, Anseong-si 17579, Republic of Korea

³ Department of Computer Engineering, Kyungnam University, Changwon-si 51767, Republic of Korea

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(2), 633; https://doi.org/10.3390/app15020633

Submission received: 13 December 2024 / Revised: 1 January 2025 / Accepted: 9 January 2025 / Published: 10 January 2025

(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

The rapid expansion of software applications has led to an increase in the frequency of bugs, which are typically reported through user-submitted bug reports. Developers prioritize these reports based on severity and project schedules. However, the manual process of assigning bug priorities is time-consuming and prone to inconsistencies. To address these limitations, this study presents a Priority-Sensitive LSTM–Attention mechanism for automating bug priority prediction. The proposed approach extracts features such as product and component details from bug repositories and preprocesses the data to ensure consistency. Priority-based feature selection is applied to align the input data with the task of bug prioritization. These features are processed through a Long Short-Term Memory (LSTM) network to capture sequential dependencies, and the outputs are further refined using an Attention mechanism to focus on the most relevant information for prediction. The effectiveness of the proposed model was evaluated using datasets from the Eclipse and Mozilla open-source projects. Compared to baseline models such as Naïve Bayes, Random Forest, Decision Tree, SVM, CNN, LSTM, and CNN-LSTM, the proposed model achieved a superior performance. It recorded an accuracy of 93.00% for Eclipse and 84.11% for Mozilla, representing improvements of 31.11% and 40.39%, respectively, over the baseline models. Statistical verification confirmed that these performance gains were significant. This study distinguishes itself by integrating priority-based feature selection with a hybrid LSTM–Attention architecture, which enhances prediction accuracy and robustness compared to existing methods. The results demonstrate the potential of this approach to streamline bug prioritization, improve project management efficiency, and assist developers in resolving high-priority issues.

Keywords:

bug priority prediction; LSTM–attention mechanism; priority-based feature selection; software maintenance; automated bug prioritization

1. Introduction

As the adoption and complexity of software systems increase, the frequency of bugs and feature improvement requests grows correspondingly. For instance, the Eclipse and Mozilla projects reportedly handle approximately 300 bug reports daily [1]. Managing these reports efficiently is critical for maintaining project schedules and software quality.

In typical open-source software development workflows [2,3], users submit bug reports detailing issues and often provide an initial severity rating. Automating the prioritization of these reports offers significant advantages in industrial applications. Automated systems reduce the resource demands associated with manual review and eliminate inconsistencies arising from subjective judgment. By ensuring a more objective and consistent prioritization process, these systems enable project managers to allocate resources efficiently, accelerate response times, and enhance overall project reliability. Additionally, automated systems scale to large volumes of bug reports, which makes them particularly valuable for managing complex and rapidly evolving software projects.

Several studies have explored automated approaches for bug priority prediction. Zhou et al. [4] proposed a model that utilizes convolutional neural networks (CNNs) and graph neural networks (GNNs) to analyze the characteristics and interrelationships of source code files, investigating the impact of various sampling methods on bug prioritization. Shatnawi et al. [5] explored the use of traditional machine learning algorithms, enhancing performance through oversampling and feature selection techniques. Similarly, Pasikanti et al. [6] presented a bug prioritization method that employed a combination of algorithms to achieve improved results.

While these approaches represent progress, the performance of bug priority prediction remains limited, necessitating further advancements in both model design and feature extraction methods.

To address these limitations, this study introduces a novel bug priority prediction framework that integrates a Long Short-Term Memory–Attention (LSTM–Attention) mechanism [7] with a priority-based feature selection algorithm [8]. The methodology involves the following steps:

Feature Extraction: Relevant features, including product and component details, are extracted from bug reports in the repository. Additional priority-specific attributes are identified to enhance predictive accuracy.
Preprocessing and Model Training: The extracted features are preprocessed and input into the LSTM–Attention model, which combines temporal data analysis capabilities with an Attention mechanism to focus on the most relevant features for priority prediction.
Priority Prediction: The trained model predicts the priority of each bug report, providing developers and managers with automated recommendations.

The proposed method was evaluated using datasets from two prominent open-source projects: Eclipse and Mozilla. Its performance was benchmarked against several baseline models, including Naive Bayes (NB), Random Forest, Decision Tree, SVM, CNN, LSTM, and CNN-LSTM. Standard evaluation metrics such as Precision, Recall, F-measure, and Accuracy were used for comparison.

The results show that the proposed model achieved a prediction accuracy of 93.00% for Eclipse and 84.11% for Mozilla, representing improvements of approximately 31.11% and 40.39% over the baseline models, respectively. Statistical verification [9,10] confirmed the significance of these improvements.

The contributions of this study are as follows:

Innovative Algorithm Design: A priority-sensitive LSTM–Attention mechanism was developed and combined with priority-based feature extraction, resulting in enhanced bug priority prediction accuracy.
Robust Comparative Analysis: Extensive evaluations against diverse baseline models using datasets from Eclipse and Mozilla highlighted the superior performance of the proposed approach, supported by statistical significance.
Practical Implications: By automating bug priority prediction, the proposed method can enhance developer productivity and enable project managers to allocate resources more effectively, ultimately improving the efficiency of the software development process.

This study shows the effectiveness of a priority-sensitive LSTM–Attention mechanism for automated bug priority prediction. The proposed model significantly outperforms existing methods, providing a reliable and scalable solution for managing the increasing volume of bug reports in large-scale software projects.

2. Background Knowledge

In a typical open-source project, when a user identifies a bug, the bug reporter describes its content and rates its severity based on his or her own judgment. When a project manager receives a bug report, he or she manually reads it and assigns it to the appropriate developer. The developer then reads the bug report and sets an appropriate priority. Errors can occur in this manual priority selection, and automated priority prediction can help developers choose the priority of a bug. The priority of a bug [2,3] in an open-source project is usually selected from P1 (highest) to P5 (lowest). Bug priority ranks bugs by their requirements and severity and indicates the importance or urgency of a bug. Developers resolve bugs in the order of these priorities. With automated priority prediction, the project manager can manage the project more efficiently and developers receive support in setting priorities, which makes software maintenance more efficient.

2.1. Bug Report

An example of an Android bug report [11] is shown in Figure 1.

This bug report was submitted on 25 July 2022 and has since been resolved. The bug reporter was ‘ab…@gmail.com’ and the developer was ‘el…@google.com’. The priority of the bug report was P2 and the severity was S2, indicating that the bug was of high priority and needed to be resolved quickly.

2.2. Bug Priority

The priority of a bug is set according to its urgency, that is, its importance. Typically, bug priority in open-source projects has five levels: immediate (P1), high (P2), intermediate (P3), low (P4), and none (P5). A P1 bug blocks the progress of the entire project and must be fixed first; in general, the program or function cannot be used at all, and work cannot be performed. A P2 bug is a problem that should be addressed before the software is distributed. It covers fatal problems such as data loss and severe memory defects, and includes cases in which the bug prevents normal functional operation or requires new code. A P3 bug is a problem that must be addressed after the serious bugs have been fixed; it may cause unexpected functional problems or affect the performance of a function. P4 is a low priority that does not require immediate attention. It covers minimal loss of function or simple problems, including suggested improvements to existing functions or small additions that improve quality. A P5 bug can be solved in the future; it causes minor, UI-level problems that are not directly related to the function itself, such as spelling errors and poor text alignment.

3. Our Approach

In this study, we propose a method to predict the priority of bugs by integrating priority-based feature selection with an LSTM–Attention mechanism [7]. This approach aims to improve the accuracy and efficiency of bug prioritization. An overview of the process is illustrated in Figure 2.

Preprocessing: Bug reports from the bug repository are preprocessed to ensure consistency and remove irrelevant or noisy data, following established preprocessing techniques [12]. This step standardizes the input data for the subsequent stages of the workflow.
Feature Extraction: Features relevant to each bug report, such as the product and its components, are extracted from the preprocessed data. This extraction process is guided by the priority of the bug, ensuring that features most indicative of the priority level are selected.
Priority-Based Feature Selection: The extracted features undergo a priority-based selection process [8], where attributes particularly relevant to distinguishing between priority levels are retained. This step ensures that the model focuses on the most informative features for priority prediction.
LSTM and Attention Mechanism: The selected features are input into a Long Short-Term Memory (LSTM) network, which captures temporal and sequential patterns in the data. The output of the LSTM network is then passed through an Attention mechanism, which identifies and emphasizes the most critical features for predicting the priority level of the bug.

By combining LSTM’s capability to analyze sequential dependencies with the Attention mechanism’s ability to focus on relevant features, this approach enhances the accuracy of bug priority predictions. This methodology not only reduces the reliance on manual prioritization but also provides a scalable and consistent solution for managing bug reports in large-scale software projects.

3.1. Preprocessing

A bug report describes a bug in free-form text. In this study, we apply a text preprocessing technique [12] to bug reports before model training. This process includes tokenization, stopword removal, and lemmatization.

First, we split the text of the bug report on whitespace and tokenize it into words. Then, we apply stopword removal to the tokens to eliminate uninformative words, such as “we”, “and”, “be”, “should”, “it”, and “very”. Finally, we apply lemmatization to reduce each word to its base form.

For example, stopword removal eliminates “lets” and “whats” from the sentence “lets focus window 1st process keys pass whats consumed key bindings lets focus window 1st process keys pass whats consumed”. Lemmatization then reduces the remaining words to their base forms: “focus”, “window”, “1st”, “process”, “key”, “pass”, “consume”, and “binding”.
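
The three preprocessing steps can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the study: the stopword set and the lemma table are small hypothetical stand-ins for the full resources that a real preprocessing library would provide.

```python
from typing import List

# Assumed, illustrative stopword list (the examples named in the text).
STOPWORDS = {"we", "and", "be", "should", "it", "very", "lets", "whats"}

# Toy lemma table standing in for a real lemmatizer (hypothetical).
LEMMAS = {"keys": "key", "consumed": "consume", "bindings": "binding"}


def preprocess(text: str) -> List[str]:
    """Tokenize on whitespace, drop stopwords, and map words to base forms."""
    tokens = text.lower().split()                     # tokenization
    kept = [t for t in tokens if t not in STOPWORDS]  # stopword removal
    return [LEMMAS.get(t, t) for t in kept]           # lemmatization (table lookup)


sentence = "lets focus window 1st process keys pass whats consumed key bindings"
print(preprocess(sentence))
```

A production pipeline would replace the lookup table with a dictionary-based lemmatizer, but the order of the steps would stay the same.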

3.2. Feature Selection Algorithm

The feature selection algorithm [8] is a ranking method based on the importance of the input variables in supervised learning with a target variable. The algorithm is illustrated in Figure 3.

This module for classification and prediction follows a three-step process. First, it eliminates unnecessary variables, such as predictors of low importance or with missing values. It then ranks the remaining variables in order of importance, and finally selects the necessary variables from the ranked predictors through the modeling process. In this study, input variables were selected in order of their rank with respect to the dependent variable, from highest to lowest. First, each variable was scored using a chi-square test, and the scores were used to eliminate the less relevant variables. Second, the remaining variables were ranked by score, and the high-scoring subset was retained while maintaining a subset of the existing features. Finally, the highest-ranked feature subsets were combined, and the features of each priority were extracted from the bug reports.
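
The chi-square scoring and ranking step can be illustrated with a small Python sketch. It assumes that each bug report is represented as a set of terms and that the score is the Pearson chi-square statistic between term presence and the priority label; the function names `chi_square_scores` and `top_k` and the toy data are hypothetical, and the exact variant used in [8] may differ.

```python
from collections import Counter
from typing import Dict, List, Sequence, Set


def chi_square_scores(docs: Sequence[Set[str]], labels: Sequence[str]) -> Dict[str, float]:
    """Pearson chi-square between term presence/absence and the priority label."""
    n = len(docs)
    class_counts = Counter(labels)
    vocab = set().union(*docs)
    scores: Dict[str, float] = {}
    for term in vocab:
        # Number of reports per priority that contain the term.
        present = Counter(lab for doc, lab in zip(docs, labels) if term in doc)
        n_present = sum(present.values())
        score = 0.0
        for lab, n_lab in class_counts.items():
            # Two cells per priority: term present and term absent.
            cells = (
                (present[lab], n_present),
                (n_lab - present[lab], n - n_present),
            )
            for observed, row_total in cells:
                expected = row_total * n_lab / n
                if expected > 0:
                    score += (observed - expected) ** 2 / expected
        scores[term] = score
    return scores


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Rank terms by descending score (ties broken alphabetically) and keep k."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


docs = [{"crash", "startup"}, {"crash", "data"}, {"typo", "label"}, {"typo", "startup"}]
labels = ["P1", "P1", "P5", "P5"]
scores = chi_square_scores(docs, labels)
print(top_k(scores, 2))
```

In this toy data, "crash" and "typo" each occur in only one priority class and therefore receive the highest scores, whereas "startup" occurs equally in both classes and scores zero.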

3.3. LSTM–Attention Algorithm

The proposed approach integrates Long Short-Term Memory (LSTM) networks [7] and Attention mechanisms [7] to enhance the prediction of bug priorities. Each component is utilized for its specific strengths, addressing the limitations of traditional sequence processing models.

LSTM networks [7] are a specialized type of recurrent neural network (RNN) capable of learning long- and short-term dependencies in sequential data. They are particularly effective in overcoming the vanishing gradient problem often encountered in standard RNNs, making them suitable for tasks that require retaining information over extended sequences.

An LSTM cell operates through the following mechanisms:

Forget Gate: This gate determines which information from the previous step should be discarded. Using a sigmoid activation function, it outputs values between 0 and 1, where 0 represents complete forgetting and 1 indicates full retention of the information.
Input Gate: This gate decides which new information should be added to the cell state. It utilizes a sigmoid layer to determine the update values and a tanh layer to generate candidate values for integration.
Cell State Update: The cell state is updated by combining the retained information from the forget gate with the new information from the input gate. This allows the model to dynamically preserve or overwrite information over time.
Output Gate: The output gate determines the information to pass forward. The updated cell state is processed through a tanh activation function and multiplied by the sigmoid output, producing the final output of the cell.

These mechanisms allow LSTMs to maintain both short-term and long-term information efficiently, making them highly effective for sequential data modeling.
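
In the notation commonly used for LSTMs, the four mechanisms above can be written as follows. The symbols are assumptions introduced here for illustration and do not appear in the original text: x_t is the input at step t, h_{t-1} is the previous hidden state, c_t is the cell state, W and b are learned weights and biases, \sigma is the sigmoid function, and \odot denotes element-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{c}_t &= \tanh\left(W_c [h_{t-1}, x_t] + b_c\right) && \text{(candidate values)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(cell output)}
\end{aligned}
```

Because f_t and i_t lie between 0 and 1, the cell state update interpolates between keeping the old state and writing the new candidate values.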

Attention mechanisms [7] are designed to enhance sequence-to-sequence (Seq2Seq) models, particularly in tasks involving encoder–decoder architectures. Traditional Seq2Seq models rely on a fixed-length context vector produced by the encoder to summarize the entire input sequence. However, this approach can result in information loss, especially for long input sequences or when there is a mismatch between input and output lengths.

The Attention mechanism addresses these limitations by allowing the model to dynamically focus on specific parts of the input sequence during the generation of each output element. Instead of treating all input elements equally, the Attention mechanism assigns varying weights to different elements based on their relevance to the current output step.

Key steps in the Attention mechanism include:

Context Vector Calculation: A context vector is computed by taking a weighted sum of the encoder’s hidden states, with the weights determined by an Attention score.
Dynamic Focus: The attention score for each input element is calculated using a compatibility function, which measures the relevance of the input element to the current decoding step. This ensures that the model emphasizes the most relevant portions of the input sequence.
Output Generation: The context vector, combined with the decoder’s hidden state, guides the generation of the current output, enabling the model to adaptively utilize information from the input sequence.
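
The three steps above can be sketched in plain Python. The compatibility function is not specified in the text, so this sketch assumes a simple dot product between each encoder hidden state and the decoder state; the function names and the toy vectors are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Normalize scores into weights that sum to 1 (shifted for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(encoder_states: List[List[float]],
           decoder_state: List[float]) -> Tuple[List[float], List[float]]:
    """Return attention weights and the context vector for one decoding step."""
    # Dynamic focus: dot-product compatibility score per input element (assumed).
    scores = [sum(h_d * s_d for h_d, s_d in zip(h, decoder_state))
              for h in encoder_states]
    weights = softmax(scores)
    # Context vector: weighted sum of the encoder hidden states.
    dim = len(decoder_state)
    context = [sum(w * h[d] for w, h in zip(weights, encoder_states))
               for d in range(dim)]
    return weights, context


encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.0]
weights, context = attend(encoder_states, decoder_state)
print(weights, context)
```

The first and third encoder states align with the decoder state and therefore receive larger weights than the second. In a full model, the context vector would be combined with the decoder's hidden state to produce the output.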

In this paper, we propose a learning method that uses the LSTM–Attention algorithm [7], as shown in Figure 4.

(1) Encoder–Decoder Framework

The input sequence is processed word-by-word by the encoder LSTM, which captures both short-term and long-term dependencies.
A context vector is generated using the encoder’s hidden states and attention weights.

(2) Attention-Enhanced Decoding

The decoder LSTM generates the output sequence using the context vector and its own hidden states.
The attention mechanism dynamically assigns weights to different parts of the input sequence, ensuring that the decoder focuses on the most relevant information at each step.

The integration of LSTM and Attention effectively addresses key challenges in Seq2Seq models by combining LSTM’s ability to capture both long-term and short-term dependencies in sequential data with Attention’s dynamic focus on relevant input elements, thereby enhancing the predictive accuracy and robustness of bug priority predictions in complex or large datasets.

4. Experimental Result

Bug reports were extracted from the bug repository according to the priorities of the products and components. We then extracted the features of the bug reports based on their respective priorities and fed them to the LSTM. The LSTM output was used as the input to the Attention algorithm to predict the priority of the bug report. As shown in Figure 5, the hyperparameters of the LSTM–Attention model were set to embedding = 300; filter = 100; layer = 1 or 5; learning rate = 0.00001; and dropout = 0.1 or 0.2.
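
The hyperparameter settings listed above can be written down as a small configuration sketch. The dictionary keys and the `configurations` helper are hypothetical names, and enumerating every layer and dropout combination is an assumption made for illustration; the text states only which values were considered.

```python
from itertools import product
from typing import Dict, List, Union

Number = Union[int, float]

# Values fixed across all runs, as reported in the text.
BASE: Dict[str, Number] = {"embedding": 300, "filter": 100, "learning_rate": 0.00001}

# Values that were varied, as reported in the text.
LAYERS = (1, 5)
DROPOUT = (0.1, 0.2)


def configurations() -> List[Dict[str, Number]]:
    """Enumerate every layer/dropout combination on top of the fixed values."""
    return [dict(BASE, layer=n_layers, dropout=rate)
            for n_layers, rate in product(LAYERS, DROPOUT)]


for config in configurations():
    print(config)
```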

4.1. Dataset

We used the Eclipse [2] and Mozilla [3] open-source projects to predict the priorities of bug reports. Priorities in both projects are classified into five categories, P1–P5. We used 18,810 bug reports from Eclipse and 35,057 from Mozilla, for a total of 53,867 data points. The bug reports span 10 October 2001 to 5 May 2011 for Eclipse and 24 April 1998 to 31 December 2012 for Mozilla.

4.2. Evaluation Metrics

To evaluate the performance of the proposed model, we used the following formulas [13,14], which are commonly used evaluation measures in machine learning.

Pri_{Acc} = \frac{Pri_{TP} + Pri_{TN}}{Pri_{TP} + Pri_{FN} + Pri_{FP} + Pri_{TN}} \qquad (1)

Pri_{Pre} = \frac{Pri_{TP}}{Pri_{TP} + Pri_{FP}} \qquad (2)

Pri_{Rec} = \frac{Pri_{TP}}{Pri_{TP} + Pri_{FN}} \qquad (3)

Pri_{F} = \frac{2 \times Pri_{Pre} \times Pri_{Rec}}{Pri_{Pre} + Pri_{Rec}} \qquad (4)

Equation (1) measures the overall accuracy of the proposed model. Pri_TP is the number of reports for which the model correctly predicted the actual priority, and Pri_TN is the number of reports that the model correctly predicted as not belonging to a priority. Pri_FN is the number of reports for which the model incorrectly predicted the correct answer as the wrong answer, and Pri_FP is the number of reports for which the model incorrectly predicted the wrong answer as the correct answer.

Equation (2), Precision, is the proportion of the priorities predicted by the model that are identical to the actual priorities. Equation (3), Recall, is the proportion of the actual priorities that the model predicted correctly. Equation (4), the F-measure, is the harmonic mean of Precision and Recall.
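
Equations (1)–(4) can be checked on a small example with the following Python sketch. It assumes a one-vs-rest reading in which the four counts are computed separately for each priority level; the function names and the toy labels are hypothetical and are not data from the study.

```python
from typing import Dict, Sequence, Tuple


def one_vs_rest_counts(actual: Sequence[str], predicted: Sequence[str],
                       priority: str) -> Tuple[int, int, int, int]:
    """Count TP, FP, FN, TN for one priority level against all others."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == priority and a == priority:
            tp += 1
        elif p == priority:
            fp += 1
        elif a == priority:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def priority_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, Precision, Recall, and F-measure as in Equations (1)-(4)."""
    acc = (tp + tn) / (tp + fn + fp + tn)
    pre = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * pre * rec / (pre + rec) if pre + rec else 0.0
    return {"accuracy": acc, "precision": pre, "recall": rec, "f_measure": f}


actual = ["P1", "P2", "P1", "P3", "P1", "P2"]
predicted = ["P1", "P1", "P1", "P3", "P2", "P2"]
print(priority_metrics(*one_vs_rest_counts(actual, predicted, "P1")))
```

For P1 in this toy example, the counts are TP = 2, FP = 1, FN = 1, and TN = 2, so Accuracy, Precision, Recall, and F-measure all equal 2/3.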

4.3. Baseline

In this study, the performances of the models for bug report priority prediction were compared with the following baseline models:

Naïve Bayes (NB) [15]: NB is a conditional probability-based classification method that calculates the probability that a data instance belongs to each class (label) given its features. It simplifies the probability calculations by assuming that the data features are mutually independent, and it assigns the class with the highest resulting probability.

Decision Tree [16]: A Decision Tree model classifies and predicts datasets using rules in the data. The model determines the target through questions in a series of filtering processes.

Random Forest [16]: This model builds multiple Decision Trees, each using only a randomly selected subset of features. The value predicted most frequently by the individual Decision Trees is selected as the final prediction.

Support Vector Machine (SVM) [15]: This model maps the data into a higher-dimensional space using a nonlinear mapping. It then determines the hyperplane that optimally separates the classes in the new space, which serves as the optimal decision boundary.

CNN [4]: A CNN extracts data features through a convolutional layer. This creates a feature map with the extracted features. Max pooling is performed by inputting this into the pooling layer. Through this process, the largest value can be extracted from the feature map. Finally, data classification is performed based on the characteristics of the networks connected through a fully connected (FC) layer.

LSTM [7]: The LSTM has a structure in which the internal layers exchange information. It is a Long Short-Term Memory model that predicts results by considering past data.

CNN-LSTM [17]: CNN-LSTM uses the results of the CNN as the LSTM layer input. The LSTM layer performs repetitive operations on the input data. This model learns the relationship between time points in the data after receiving feedback from the previous data.

4.4. Research Questions

The experiments in this study were guided by the following research questions.

RQ1: Does the proposed model predict bug priorities well?

We first evaluated the performance of the proposed model on its own. If the model predicted the priority of the bug reports well, we proceeded with comparisons against the baseline models.

RQ2: Is the proposed model applicable to bug report priority prediction?

If the proposed model performs better than the baseline models, it can be used for priority prediction of bug reports. We therefore verified whether the differences in predictive performance between the proposed model and the baseline models were statistically significant [9,10].

4.5. Results

4.5.1. Result of Our Approach

The proposed Priority-Sensitive LSTM–Attention mechanism was evaluated using datasets from the Eclipse and Mozilla open-source projects to assess its predictive performance for bug report priorities. The results are presented and analyzed based on various configurations and parameters to show the effectiveness of the model.

The performance of the model across both datasets is summarized in Figure 6, where the x-axis represents priority levels and the y-axis denotes prediction accuracy.

On average, the model achieved an accuracy of 84.11% for Eclipse and 93.00% for Mozilla. These results indicate that the proposed model is capable of accurately predicting the priority of bug reports, reflecting its potential for practical application in software development workflows.

Further analysis was conducted to evaluate the impact of varying the number of LSTM layers on model performance, as shown in Figure 7.

For the Eclipse dataset, the best performance, with an accuracy of 84.39%, was achieved using a single layer. Conversely, for the Mozilla dataset, the model exhibited its highest performance with five layers. These results suggest that the optimal configuration of LSTM layers may depend on the characteristics of the dataset, such as its size and complexity. Consequently, the model was configured with one layer for Eclipse and five layers for Mozilla to maximize the predictive accuracy of the model for each dataset.

The effect of dropout rates on model performance was also examined to address overfitting concerns, as illustrated in Figure 8 and Figure 9.

For the Eclipse dataset, the optimal dropout rate was determined to be 0.2, resulting in a Precision of 93.50%, Recall of 88.18%, and F-measure of 90.14%. The best accuracy achieved was 92.17%, representing an improvement of 2.03% compared to the average accuracy. For the Mozilla dataset, the optimal dropout rate was found to be 0.1, yielding a Precision of 79.30%, Recall of 76.86%, and F-measure of 77.72%. The highest accuracy for Mozilla was 79.54%, which reflected a 1.82% improvement over the average accuracy. These findings highlight the importance of fine-tuning dropout rates to achieve a balance between model generalization and predictive performance.

The study also investigated the impact of priority-based feature selection on the model’s predictive capabilities. Figure 10 compares the performance of the proposed model with and without the application of feature selection.

Across both datasets, the proposed model incorporating priority-based feature selection consistently outperformed the model without it. This result underscores the significance of extracting and utilizing priority-specific features in enhancing the model’s ability to capture relevant patterns in the data.

We note that the proposed Priority-Sensitive LSTM–Attention mechanism shows a robust performance in predicting bug report priorities. By optimizing hyperparameters, such as the number of layers and dropout rates, and integrating priority-based feature selection, the model achieves a high predictive accuracy and effectively generalizes across diverse datasets.

4.5.2. Comparison Results

We compared the proposed model to other models for bug report priority prediction in Figure 11 and Figure 12 for Eclipse and Mozilla, respectively.

Figure 11 shows a comparison between the proposed Priority-Sensitive LSTM–Attention model and various baseline models for predicting bug report priorities in the Eclipse dataset. The x-axis represents the accuracy percentages, while the y-axis lists the models. The proposed model achieved the highest accuracy of 84.11%, significantly outperforming all the baseline models.

Among the baseline models, CNN-LSTM showed the second-best performance with an accuracy of 72.57%, followed by LSTM at 56.59%. Traditional machine learning models such as Support Vector Machine (SVM), Decision Tree, Random Forest, and Naïve Bayes exhibited similar performance levels, with accuracies ranging between 49.99% and 50.20%. The CNN model recorded the lowest performance among the deep learning approaches, with an accuracy of 41.36%.

This comparison highlights the strength of the proposed model in handling the Eclipse dataset. Its superior performance can be attributed to the integration of priority-sensitive feature selection and the LSTM–Attention mechanism, which allows it to effectively capture sequential patterns and prioritize relevant features for bug report classification.

Figure 12 presents the performance comparison for the Mozilla dataset, with the x-axis representing accuracy percentages and the y-axis listing the models. The proposed model achieved the highest accuracy of 93.00%, further solidifying its effectiveness across diverse datasets.

Similar to the Eclipse dataset, CNN-LSTM was the second-best model with an accuracy of 71.65%, followed by LSTM at 51.84%. Traditional machine learning models such as SVM, Decision Tree, Random Forest, and Naïve Bayes showed comparable performances, with accuracies clustered around 50.04% to 50.26%. The CNN model again recorded the lowest performance among the deep learning models, with an accuracy of 44.37%.

The results for Mozilla underscore the robustness of the proposed LSTM–Attention model for handling large and complex datasets. Its ability to outperform both advanced deep learning models (e.g., CNN-LSTM) and traditional machine learning models shows the effectiveness of its priority-sensitive feature extraction and attention-based learning framework.

The results in Figure 11 and Figure 12 show that the proposed model outperforms the baseline models on both the Eclipse and Mozilla datasets. The integration of LSTM and Attention mechanisms enables the proposed model to capture sequential dependencies and focus on relevant features, resulting in significantly higher accuracy. Traditional machine learning models, while consistent in their performance, lag behind the LSTM-based approaches, likely because they cannot leverage temporal and contextual information effectively; the CNN baseline, however, scores below the traditional models on both datasets. Among the deep learning baselines, CNN-LSTM shows a competitive performance but falls short of the proposed model, which we attribute to the absence of a priority-sensitive feature extraction mechanism.

In Figure 13, Figure 14, Figure 15 and Figure 16, the Precision, Recall, and F-measure are compared along with accuracy.

Figure 13 shows a performance comparison between traditional machine learning (ML) baseline models and the proposed Priority-Sensitive LSTM–Attention mechanism for the Eclipse dataset. The evaluation metrics include Precision, Recall, and F-measure, which are plotted along the x-axis for the different models (Naïve Bayes, Random Forest, Decision Tree, SVM, and the proposed model), while the y-axis represents the corresponding performance values.

The proposed model significantly outperformed all the baseline models across all three metrics. The Precision of the proposed model reached 92.17%, compared to the highest baseline Precision of 49.12% achieved by Random Forest. Similarly, the Recall and F-measure of the proposed model were also substantially higher, reflecting its ability to correctly identify relevant bug priorities with fewer false positives and false negatives. In contrast, the baseline models showed relatively similar and lower performance, with SVM achieving a Recall of 39.84% and Decision Tree yielding an F-measure of 39.54%.

This stark difference highlights the limitations of traditional ML approaches in handling the complex relationships and temporal dependencies within bug report datasets. The proposed model’s superior performance can be attributed to its integration of LSTM and Attention mechanisms, which effectively capture sequential patterns and prioritize relevant features in the data.

Figure 14 illustrates a comparison between deep learning models, including CNN, LSTM, CNN-LSTM, and the proposed Priority-Sensitive LSTM–Attention mechanism on the Eclipse dataset. The proposed model achieved the highest performance, with a Precision of 92.17%, Recall of 92.17%, and F-measure of 92.17%, clearly outperforming all baseline models.

Among the baselines, CNN showed the lowest performance, with a Precision of 15.03%, Recall of 15.03%, and F-measure of 15.03%, indicating its inability to effectively process sequential dependencies. LSTM improved upon this, achieving a Precision of 34.05%, but it still struggled to capture complex patterns in the data. CNN-LSTM, which combines feature extraction and sequential modeling, performed better than the standalone CNN and LSTM models, reaching a Precision of 64.73%, yet it fell significantly short of the proposed model. The results confirm the effectiveness of the proposed model in leveraging priority-sensitive features and attention mechanisms to deliver superior predictive performance.

Figure 15 compares the performance of machine learning models, including Naïve Bayes, Random Forest, Decision Tree, and SVM, with the proposed model on the Mozilla dataset. The proposed model showed clear superiority, achieving a Precision of 79.54%, Recall of 79.54%, and F-measure of 79.54%.

The traditional machine learning models, such as Naïve Bayes and Random Forest, exhibited limited performance, with Precision values of 44.42% and 49.44%, respectively. Decision Tree and SVM showed even lower Recall and F-measure values, with Decision Tree reaching only 34.86% in Recall. The proposed model’s significantly higher scores highlight its ability to handle large-scale and complex data with enhanced Accuracy, Precision, and Recall, reinforcing its robustness in bug priority prediction tasks.

Figure 16 presents the performance comparison for the Mozilla dataset between CNN, LSTM, CNN-LSTM, and the proposed model. Similar to the results for Eclipse, the proposed model achieved the highest Precision, Recall, and F-measure values, all at 79.54%.

CNN, as the weakest baseline model, showed a Precision of 22.73%, Recall of 22.73%, and F-measure of 22.73%, reflecting its limitations in handling the complexity of bug report data. LSTM improved upon these values, reaching a Precision of 30.58%. CNN-LSTM showed a better performance, with a Precision of 64.72%, but it still fell short compared to the proposed model. The results indicate that the proposed model’s combination of LSTM and Attention mechanisms, along with priority-sensitive feature selection, provides a significant advantage over traditional deep learning approaches.

The null hypotheses for statistical verification are as follows.

For Eclipse (H1₀, H2₀, H3₀, H4₀, H5₀, H6₀, H7₀), there is no significant difference between the proposed model and the baseline models (Naïve Bayes, Decision Tree, Random Forest, SVM, CNN, LSTM, and CNN-LSTM).
For Mozilla (H8₀, H9₀, H10₀, H11₀, H12₀, H13₀, H14₀), there is no significant difference between the proposed model and the same baseline models.

The corresponding alternative hypotheses are as follows.

H1a, H2a, H3a, H4a, H5a, H6a, and H7a: The proposed model and Naïve Bayes, Decision Tree, Random Forest, SVM, CNN, LSTM, and CNN-LSTM differed for Eclipse.
H8a, H9a, H10a, H11a, H12a, H13a, and H14a: The proposed model and Naïve Bayes, Decision Tree, Random Forest, SVM, CNN, LSTM, and CNN-LSTM differed for Mozilla.

The F-measure values were first checked for normality (Shapiro–Wilk test [18]). Depending on the result, either a t-test or a Wilcoxon test was performed to determine statistical significance: if the normality-test p-value was greater than or equal to 0.05, the t-test was applied; if it was less than 0.05, the Wilcoxon test was used [9,10]. The results of these tests are summarized in Table 1.
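The selection rule described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names, the per-run F-measure values, and the choice of a paired t statistic are all assumptions, since the text does not state whether the samples were paired. Converting the t statistic into a p-value requires the Student's t distribution, which would normally come from a statistics library.

```python
import math
import statistics

ALPHA = 0.05  # significance threshold used in the text


def choose_test(normality_p: float, alpha: float = ALPHA) -> str:
    """Pick the significance test from a normality-test p-value."""
    # p >= alpha: normality is not rejected, so the parametric t-test is used.
    # p <  alpha: normality is rejected, so the Wilcoxon test is used.
    return "t-test" if normality_p >= alpha else "Wilcoxon"


def paired_t_statistic(x, y):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Hypothetical per-run F-measure values, for illustration only.
proposed = [0.92, 0.93, 0.91, 0.94, 0.92]
baseline = [0.48, 0.50, 0.47, 0.49, 0.51]

print(choose_test(0.31))  # normality not rejected
print(choose_test(0.01))  # normality rejected
t_stat, dof = paired_t_statistic(proposed, baseline)
print(round(t_stat, 2), dof)
```

A large positive t statistic with few degrees of freedom, as in this hypothetical example, corresponds to a very small p-value, which is the pattern reported for the t-test rows of Table 1.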

For H1₀, the null hypothesis posited no significant difference between the proposed model and Naïve Bayes for Eclipse. However, the p-value of 0.001953 obtained from the Wilcoxon test is less than 0.05, leading to the rejection of the null hypothesis and the acceptance of the alternative hypothesis. This indicates a statistically significant difference in performance between the proposed model and Naïve Bayes.
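One way to read the value 0.001953 is as the smallest p-value an exact two-sided Wilcoxon signed-rank test can produce for 10 paired observations, namely 2/2^10, which occurs when the proposed model scores higher than the baseline in every pair. The number of pairs is an assumption made for illustration; the text does not state it. The sketch below computes the exact p-value with the standard library only (no ties or zero differences assumed), using hypothetical scores; in practice a library routine such as SciPy's `scipy.stats.wilcoxon` would normally be used.

```python
def wilcoxon_exact_p(x, y):
    """Exact two-sided Wilcoxon signed-rank p-value for paired samples.

    Assumes no zero differences and no ties among absolute differences.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)

    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank

    total = n * (n + 1) // 2
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_obs = min(w_plus, total - w_plus)

    # counts[s] = number of sign assignments whose positive ranks sum to s.
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]

    tail = sum(counts[: w_obs + 1])
    return min(1.0, 2 * tail / 2 ** n)


# Hypothetical F-measure values for 10 paired runs, for illustration only.
proposed = [0.91, 0.92, 0.93, 0.90, 0.94, 0.92, 0.93, 0.91, 0.92, 0.95]
baseline = [0.45, 0.47, 0.44, 0.46, 0.48, 0.43, 0.49, 0.42, 0.41, 0.40]

p_value = wilcoxon_exact_p(proposed, baseline)
print(round(p_value, 6))  # 2 / 2**10, rounded to six decimals
```

Under this reading, 0.001953 is a floor set by the sample size rather than a measure of how large the difference is, which is why the t-test rows in Table 1 can report far smaller values.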

Similarly, for other hypotheses, the statistical verification results showed significant differences between the proposed model and all baseline methods across both datasets. The t-tests and Wilcoxon tests consistently produced p-values well below the 0.05 threshold, reinforcing the reliability of the observed improvements in the proposed model’s performance.

The statistical analysis conclusively shows that the proposed Priority-Sensitive LSTM–Attention mechanism outperforms traditional machine learning and deep learning baselines in predicting bug report priorities for both the Eclipse and Mozilla datasets. The results validate the robustness and effectiveness of the proposed model, with statistically significant differences confirmed across all comparisons.

5. Discussion

5.1. Results

In this study, we compared bug report priority prediction performances between the proposed and baseline models (NB, Random Forest, Decision Tree, SVM, CNN, LSTM, and CNN-LSTM). Overall, the proposed model exhibits excellent accuracy.

In addition, statistical verification showed that there was a significant difference between the baseline and the proposed models. The experimental results are analyzed below.

In this study, we applied feature selection algorithms based on the priority of bug reports and predicted the priority using LSTM–Attention model learning for the extracted features.

The experimental results in Figure 6 show approximately 50.42% for priority P3 in the Eclipse project and approximately 67.16% for priority P5 in the Mozilla project. Analysis of these results revealed an imbalance in the amount of data, and these levels showed the smallest data distribution. In the future, the performance of the model may be improved through methods such as data normalization.

The proposed model was also compared with a variant that did not apply the priority-based feature selection algorithm (Figure 10), and the model with feature selection performed better.

Accuracy, Precision, Recall, and F-measure were all consistently high. The null hypotheses were rejected in every comparison (p-values ranging from 4.222 × 10⁻¹⁶ to 0.001953; Table 1), which supports the reliability of the proposed Priority-Sensitive LSTM–Attention mechanism and its significant improvements over traditional machine learning and deep learning approaches.

5.2. Threats to Validity

Internal Validity: The datasets used in this study were derived exclusively from the Eclipse and Mozilla open-source projects. While these datasets are widely recognized and provide a diverse range of bug reports, they may not fully represent the characteristics and complexities of other open-source or enterprise-level software projects. The reliance on specific datasets introduces the possibility that the observed performance improvements may not be generalizable to other contexts.
External Validity: The study’s findings may face limitations in applicability when they are extended to other domains or projects with different data structures or reporting formats. Many open-source projects have unique ways of structuring and prioritizing bug reports, and proprietary or business-oriented software projects often follow distinct processes for bug tracking and resolution. Additional verification on datasets from other open-source and industrial projects is necessary to confirm the generalizability and robustness of the proposed model.
Construct Validity: The study assumes that priority-based feature selection effectively captures the most relevant aspects of bug reports for priority prediction. However, this assumption may vary depending on the quality and consistency of the bug reports in different projects. Variability in how bug reports are written, such as the inconsistent use of terminology or incomplete data, could impact the model’s ability to generalize across different datasets.
Data Imbalance: An imbalance in the distribution of bug report priorities was observed in both the Eclipse and Mozilla datasets, particularly for certain priority levels such as P3 and P5. This imbalance could bias the model’s predictions and limit its ability to accurately represent underrepresented categories. Addressing this issue through techniques such as oversampling, data augmentation, or normalization could enhance the model’s performance and improve its robustness across diverse datasets.
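As an illustration of the oversampling option mentioned above, the following sketch balances a labeled set by randomly duplicating samples from the smaller classes until every class matches the largest one. It is a generic random-oversampling example, not a procedure used in the study; the function name, the report identifiers, and the class counts are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until all classes are equally sized."""
    rng = random.Random(seed)  # fixed seed keeps the example reproducible

    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    target = max(len(items) for items in by_label.values())

    out_samples, out_labels = [], []
    for label, items in by_label.items():
        # Draw extra samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical, deliberately imbalanced priority labels.
labels = ["P3"] * 6 + ["P1"] * 2 + ["P5"] * 1
samples = [f"report-{i}" for i in range(len(labels))]

new_samples, new_labels = random_oversample(samples, labels)
print(Counter(new_labels))
```

Resampling of this kind is usually applied to the training split only, so that duplicated reports do not leak into the evaluation data.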

6. Related Work

Several studies have addressed the challenge of predicting bug report priorities using a variety of techniques, including machine learning (ML), deep learning (DL), and hybrid approaches. The following section highlights key contributions from previous research:

Rathnayake et al. [19] proposed a CNN-based model for priority prediction. They applied natural language processing (NLP) techniques to preprocess bug report text and extract features before predicting priorities using a convolutional neural network (CNN). Similarly, Umer et al. [20] utilized a text–emotion-analysis-based priority prediction method, combining NLP and ML algorithms to achieve accurate predictions.

Choudhary et al. [21] developed a priority prediction technique incorporating textual features, time, similar reports, authors, severity, and product attributes from Eclipse datasets. Yu et al. [22] introduced a bug-priority prediction system based on artificial neural networks (ANN) across five products, showing improved Precision, Recall, and F-measure under three-fold cross-validation.

Kanwal et al. [23] employed an SVM-based classification approach to predict bug priorities for Eclipse, while Sharma et al. [15] used SVM, Naïve Bayes (NB), and k-nearest neighbors (KNN) for the same task. Their results showed that SVM and KNN achieved over 70% Accuracy, outperforming NB. Alenezi et al. [16] compared NB, decision trees, and random forest algorithms, finding that random forest and decision trees performed better than NB for bug priority prediction.

Tian et al. [24] proposed a regression-based approach to predict bug report priorities, assigning ordinal values to reduce the discrepancy between priority levels. Bani-Salameh et al. [25] evaluated a five-layer RNN-LSTM model against SVM and KNN on the JIRA dataset, achieving superior Accuracy (90%) and a 15.2% improvement in the F-measure over KNN.

Kumari et al. [26] applied entropy-based metrics to NB and DL for predicting bug priorities in OpenOffice, showing that entropy-enhanced DL models outperformed their NB counterparts. Pushpalatha et al. [27] used NB, simple logistic regression, and Random Trees for classification, with logistic regression yielding the best results.

Ahmed et al. [28] addressed class imbalances using SMOTE and developed a framework, CaPBUG, which achieved a 90.43% accuracy using NB, Random Forest, Decision Tree, and logistic regression. Fang et al. [29] utilized heterogeneous text graphs and graph convolutional networks (GCNs) to extract word-level meanings from bug reports across multiple datasets, achieving strong results with a weighted loss function.

Malhotra et al. [30] tested multiple ML algorithms, including NB, Decision Trees, logistic regression, Random Forest, and AdaBoost, on open-source projects such as Hadoop and Spark. They showed consistent performance with multinomial NB and cross-validation techniques. Zhang et al. [31] used over 82,000 bug reports to train deep neural networks after feature extraction and preprocessing with NLP.

Umer et al. [32] combined CNNs with emotion analysis, improving F1 scores by over 24% through vectorized text representations and domain-specific emotional features. Huang et al. [33] incorporated developer-centered social and technological features to predict bug priorities, achieving significant improvements in AUC-ROC and MCC scores. Wang et al. [34] improved Precision and Recall using SVM and NB models enhanced with Pearson correlation-based feature selection and information gain.

Pecorelli et al. [35] presented four class-level code smell prioritization approaches, showing that their method was, on average, 20% more accurate than the baseline. Zhou et al. [36,37,38] proposed a series of models addressing diverse challenges in IoT sensor networks. First, they introduced a heterogeneous data access model designed to manage vehicle data in diverse IoT sensor monitoring networks, aiming to enhance data access efficiency and enable real-time processing of streaming vehicle data. Second, they developed a heterogeneous data access metamodel for remote sensing observation management in precision agriculture, integrating IoT and remote sensing technologies to streamline data management. Finally, they proposed a metadata model for air quality monitoring, leveraging a heterogeneous Key Performance Indicator (KPI) framework. This model utilizes continuously collected air quality data from IoT sensors to provide robust and efficient air quality detection and analysis.

While previous studies have made significant advancements in bug priority prediction, the proposed Priority-Sensitive LSTM–Attention mechanism in this study offers unique contributions that differentiate it from existing research:

Integration of Priority-Based Feature Selection: Many prior works focus solely on textual or emotional features extracted from bug reports. This study incorporates priority-specific feature selection, emphasizing product and component attributes that are often overlooked. By tailoring the feature extraction process to priority levels, the proposed model provides a more focused and relevant input to the prediction framework.
Hybrid LSTM-Attention Architecture: Unlike models that rely solely on CNNs, RNNs, or traditional ML algorithms, the proposed approach combines LSTM’s capability to capture sequential dependencies with an attention mechanism that dynamically focuses on the most relevant features. This hybrid architecture improves interpretability and prediction accuracy.
Statistical Validation: While many studies report performance improvements, this study performs rigorous statistical testing to confirm the significance of the results. This ensures that the observed differences between the proposed model and baselines are not due to random variation.
Comprehensive Evaluation Across Datasets: The proposed model is evaluated on two widely used datasets (Eclipse and Mozilla), showcasing its robustness and generalizability. Many previous studies focus on a single dataset or domain, limiting their applicability.
Addressing Data Imbalances: The study identifies challenges related to imbalanced priority distributions and highlights potential solutions, such as normalization and resampling, for future improvements. Previous works often overlook these data-related limitations.
Comparison with Broader Baselines: The study benchmarks the proposed model against a wide range of traditional and deep learning models, including SVM, NB, CNN, LSTM, and CNN-LSTM, providing a holistic assessment of its effectiveness.
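To make the attention step of the hybrid architecture listed above more concrete, the sketch below applies a generic dot-product attention pooling to a sequence of hidden states: each state is scored against a query vector, the scores are normalized with a softmax, and the states are combined into one weighted context vector. This is a common textbook formulation and only an assumption about how such a layer might look; it is not the authors' exact formulation, and the hidden states and query vector are hypothetical stand-ins for LSTM outputs and learned parameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Return (attention weights, context vector) for a sequence of states."""
    # Score each time step by its dot product with the query vector.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Context vector: attention-weighted sum of the hidden states.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Hypothetical LSTM outputs: 3 time steps, hidden size 2.
hidden_states = [[0.1, 0.3], [0.8, 0.2], [0.4, 0.9]]
query = [1.0, 0.5]  # hypothetical learned scoring vector

weights, context = attention_pool(hidden_states, query)
print([round(w, 3) for w in weights])
print([round(c, 3) for c in context])
```

The weights sum to one, so they can be read as the relative importance assigned to each time step, which is the property that makes attention-based models easier to inspect than a plain LSTM.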

7. Conclusions

This study addressed the challenge of bug priority prediction by proposing a Priority-Sensitive LSTM–Attention mechanism, aimed at automating and improving the accuracy of bug prioritization. Automated prioritization provides a significant advantage to developers by enabling them to manage their workloads efficiently and effectively. Furthermore, it aids project managers in streamlining project schedules and resource allocation by reducing reliance on subjective, manual priority assignment.

The proposed approach integrates priority-based feature selection and a hybrid LSTM–Attention mechanism. Features were extracted based on the product and component attributes of bug reports, emphasizing priority-specific data. These features were then processed using the LSTM algorithm to capture sequential dependencies, with the outputs fed into an Attention mechanism to highlight the most relevant information for predicting bug priorities.

The model’s performance was rigorously evaluated using the Eclipse and Mozilla datasets and benchmarked against several baseline methods, including traditional machine learning and deep learning models. The results showed that the proposed model significantly outperformed the baselines, achieving accuracies of 93.00% for Eclipse and 84.11% for Mozilla. The comparative analysis revealed that the proposed model improved priority prediction accuracy by approximately 31.11% for Eclipse and 40.39% for Mozilla, reflecting a substantial advancement in bug priority prediction.

Additionally, statistical verification confirmed the robustness of the proposed model, with significant differences observed between the proposed method and the baseline models. These findings underscore the reliability and scalability of the model, particularly for large-scale and complex datasets.

While the proposed model showed a strong performance, there are opportunities for further improvement. Future research should focus on validating the model across a broader range of datasets, including other open-source projects and enterprise-level software. Addressing issues such as data imbalance and variability in bug report structures could further enhance the model’s generalizability. Moreover, incorporating advanced data augmentation techniques and exploring alternative architectures could lead to additional performance gains.

In conclusion, the Priority-Sensitive LSTM–Attention mechanism represents a significant step forward in automating bug prioritization. Its ability to deliver accurate and reliable predictions shows its potential as a valuable tool for improving software maintenance processes. Future work will focus on expanding its applicability and ensuring its effectiveness across diverse project environments.

Author Contributions

Software, J.J. and J.K.; Writing—review & editing, G.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Korea National University Development Project (2024) at Hankyong National University.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

This research was supported by Hankyong National University Korea National University Development Project (2024).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Yang, G.; Zhang, T.; Lee, B. Towards semi-automatic bug triage and severity prediction based on topic model and multi-feature of bug reports. In Proceedings of the 38th IEEE Annual International Computer Software and Applications Conference, Vasteras, Sweden, 21–25 July 2014; pp. 97–106.
2. Bettenburg, N.; Just, S.; Schröter, A.; Weiß, C.; Premraj, R.; Zimmermann, T. Quality of bug reports in eclipse. In Proceedings of the 2007 OOPSLA Workshop on Eclipse Technology eXchange, Montreal, QC, Canada, 21 October 2007; pp. 21–25.
3. Banerjee, S.; Helmick, J.; Syed, Z.; Cukic, B. Eclipse vs. Mozilla: A comparison of two large-scale open source problem report repositories. In Proceedings of the 2015 IEEE 16th International Symposium on High Assurance Systems Engineering, Daytona Beach Shores, FL, USA, 8–10 January 2015; pp. 263–270.
4. Zhou, C.Y.; Zeng, C.; He, P. An Exploratory Study of Bug Prioritization and Severity Prediction based on Source Code Features. In Proceedings of the International Conference on Software Engineering and Knowledge Engineering, Pittsburgh, PA, USA, 1–10 July 2022.
5. Shatnawi, M.Q.; Alazzam, B. An Assessment of Eclipse Bugs’ Priority and Severity Prediction Using Machine Learning. Int. J. Commun. Networks Inf. Secur. 2022, 14, 62–69.
6. Pasikanti, N.; Kawaf, C. Bugs Prioritization in Software Engineering: A Systematic Literature Review on Techniques and Methods. Bachelor’s Thesis, Linnaeus University, Växjö, Sweden, 2022.
7. Kim, S.; Kang, M. Financial series prediction using Attention LSTM. arXiv 2019, arXiv:1902.10877.
8. Shang, W.; Huang, H.; Zhu, H.; Lin, Y.; Qu, Y.; Wang, Z. A novel feature selection algorithm for text categorization. Expert Syst. Appl. 2007, 33, 1–5.
9. Gravetter, F.J.; Wallnau, L.B. Introduction to the t statistic. Essent. Stat. Behav. Sci. 2014, 8, 252.
10. Rosner, B.; Glynn, R.J.; Lee, M.L.T. The Wilcoxon signed rank test for paired comparisons of clustered data. Biometrics 2006, 62, 185–192.
11. Android #240016030. Available online: https://issuetracker.google.com/issues/240016030 (accessed on 1 January 2025).
12. Kao, A.; Poteet, S.R. Natural Language Processing and Text Mining; Springer: Cham, Switzerland, 2007.
13. Goutte, C.; Gaussier, E. A Probabilistic Interpretation of Precision, Recall and F-Score, with Implication for Evaluation. In Advances in Information Retrieval; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2005; Volume 3408, pp. 345–359.
14. Zhou, J.; Zhang, H.; Lo, D. Where Should the Bugs Be Fixed? More Accurate Information Retrieval-Based Bug Localization Based on Bug Reports. In Proceedings of the 2012 34th International Conference on Software Engineering, Zurich, Switzerland, 2–9 June 2012; pp. 14–24.
15. Sharma, M.; Bedi, P.; Chaturvedi, K.K.; Singh, V.B. Predicting the priority of a reported bug using machine learning techniques and cross project validation. In Proceedings of the 2012 12th International Conference on Intelligent Systems Design and Applications (ISDA), Kochi, India, 27–29 November 2012; pp. 539–545.
16. Alenezi, M.; Banitaan, S. Bug reports prioritization: Which features and classifier to use? In Proceedings of the 2013 12th International Conference on Machine Learning and Applications, Miami, FL, USA, 4–7 December 2013; Volume 2, pp. 112–116.
17. Zhang, J.; Li, Y.; Tian, J.; Li, T. LSTM-CNN hybrid model for text classification. In Proceedings of the 2018 IEEE 3rd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China, 12–14 October 2018; pp. 1675–1680.
18. González-Estrada, E.; Cosmes, W. Shapiro–Wilk test for skew normal distributions based on data transformations. J. Stat. Comput. Simul. 2019, 89, 3258–3272.
19. Rathnayake, R.M.D.S.; Kumara, B.T.G.S.; Ekanayake, E.M.U.W.J.B. CNN-Based Priority Prediction of Bug Reports. In Proceedings of the 2021 International Conference on Decision Aid Sciences and Application (DASA), Sakheer, Bahrain, 7–8 December 2021; pp. 299–303.
20. Umer, Q.; Liu, H.; Sultan, Y. Emotion based automated priority prediction for bug reports. IEEE Access 2018, 6, 35743–35752.
21. Choudhary, P.A.; Singh, S. Neural network-based bug priority prediction model using text classification techniques. Adv. Res. Comput. Sci. 2017, 8, 1315–1319.
22. Yu, L.; Tsai, W.T.; Zhao, W.; Wu, F. Predicting defect priority based on neural networks. In Proceedings of the International Conference on Advanced Data Mining and Applications, Chongqing, China, 19–21 November 2010; pp. 356–367.
23. Kanwal, J.; Maqbool, O. Bug prioritization to facilitate bug report triage. J. Comput. Sci. Technol. 2012, 27, 397–412.
24. Tian, Y.; Lo, D.; Sun, C. Drone: Predicting priority of reported bugs by multi-factor analysis. In Proceedings of the IEEE International Conference on Software Maintenance, Eindhoven, The Netherlands, 22–28 September 2013; pp. 200–209.
25. Bani-Salameh, H.; Sallam, M. A deep-learning-based bug priority prediction using RNN-LSTM neural networks. e-Inform. Softw. Eng. 2021, 15.
26. Kumari, M.; Singh, V.B. An improved classifier based on entropy and deep learning for bug priority prediction. In Proceedings of the IEEE International Conference on Intelligent Systems Design and Applications (ISDA), Vellore, India, 6–8 December 2018; pp. 571–580.
27. Pushpalatha, M.N.; Mrunalini, M.; Bista, S.R. Predicting the priority of bug reports using classification algorithms. Indian J. Comput. Sci. Eng. 2020, 11, 811–818.
28. Ahmed, H.A.; Bawany, N.Z.; Shamsi, J.A. Capbug-a framework for automatic bug categorization and prioritization using nlp and machine learning algorithms. IEEE Access 2021, 9, 50496–50512.
29. Fang, S.; Tan, Y.S.; Zhang, T.; Xu, Z.; Liu, H. Effective prediction of bug-fixing priority via weighted graph convolutional networks. IEEE Trans. Reliab. 2021, 70, 563–574.
30. Malhotra, R.; Dabas, A.; Hariharasudhan, A.S.; Pant, M. A study on machine learning applied to software bug priority prediction. In Proceedings of the 2021 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida, India, 28–29 January 2021; pp. 965–970.
31. Zhang, W.; Challis, C. Automatic bug priority prediction using DNN based regression. In Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery; Springer: Berlin/Heidelberg, Germany, 2019; pp. 333–340.
32. Umer, Q.; Liu, H.; Illahi, I. CNN-based automatic prioritization of bug reports. IEEE Trans. Reliab. 2019, 69, 1341–1354.
33. Huang, Z.; Shao, Z.; Fan, G.; Yu, H.; Yang, K.; Zhou, Z. Bug Report Priority Prediction Using Developer-Oriented Socio-Technical Features. In Proceedings of the 13th Asia-Pacific Symposium on Internetware, Hohhot, China, 11–12 June 2012; pp. 202–211.
34. Wang, Y.; He, T.; Zhang, W.; Fang, C.; Luo, B. Exploring the Influence of Feature Selection Techniques on Bug Report Prioritization. In Proceedings of the 28th International Conference on Software Engineering and Knowledge Engineering, San Francisco, CA, USA, 1–3 July 2016; pp. 179–184.
35. Pecorelli, F.; Palomba, F.; Khomh, F.; De Lucia, A. Developer-driven code smell prioritization. In Proceedings of the 17th International Conference on Mining Software Repositories, Seoul, Republic of Korea, 29–30 June 2020; pp. 220–231.
36. Zhou, L.; He, Q.; Tu, W.; Du, J.; Zhang, S.; Li, Q.; Zhang, X.; Guan, D. A Heterogeneous Streaming Vehicle Data Access Model for Diverse IoT Sensor Monitoring Network Management. IEEE Internet Things J. 2024, 11, 26929–26943.
37. Zhou, L.; Tu, W.; Wang, C.; Li, Q. A Heterogeneous Access Metamodel for Efficient IoT Remote Sensing Observation Management: Taking Precision Agriculture as an Example. IEEE Internet Things J. 2021, 9, 8616–8632.
38. Zhou, L.; Li, Q.; Tu, W.; Wang, C. A Heterogeneous Key Performance Indicator Metadata Model for Air Quality Monitoring in Sustainable Cities. Environ. Model. Softw. 2021, 136, 104955.

Figure 1. Example of Android Report (#240016030).

Figure 2. Overview of our approach.

Figure 3. Overview of the feature selection.

Figure 4. Overview of the LSTM–Attention algorithm.

Figure 5. Summary of our model.

Figure 6. Performance of proposed model.

Figure 7. Comparison of performance based on dropout parameter for Eclipse.

Figure 8. Comparison of performance based on dropout parameter for Mozilla.

Figure 9. Performance of the proposed model based on layer parameters.

Figure 10. Comparison of proposed model and non-feature-selection algorithm.

Figure 11. Comparison of LSTM–Attention and baseline models for Eclipse.

Figure 12. Comparison of LSTM–Attention and baseline models for Mozilla.

Figure 13. Comparison of baseline models’ (ML) performances with proposed model for Eclipse.

Figure 14. Comparison of baseline models’ (DL) performance with proposed model for Eclipse.

Figure 15. Comparison of baseline models’ (ML) performance with proposed model for Mozilla.

Figure 16. Comparison of baseline models’ (DL) performance with proposed model for Mozilla.

Table 1. Statistical verification results.

Hypothesis	Test	p-Value	Result
H1₀	Wilcoxon test	0.001953	H1a: Accept
H2₀	t-test	4.222 × 10⁻¹⁶	H2a: Accept
H3₀	t-test	3.619 × 10⁻¹²	H3a: Accept
H4₀	Wilcoxon test	0.001953	H4a: Accept
H5₀	t-test	1.657 × 10⁻¹²	H5a: Accept
H6₀	t-test	1.579 × 10⁻¹²	H6a: Accept
H7₀	t-test	4.366 × 10⁻¹³	H7a: Accept
H8₀	t-test	1.411 × 10⁻⁸	H8a: Accept
H9₀	t-test	7.015 × 10⁻¹⁴	H9a: Accept
H10₀	t-test	1.015 × 10⁻¹¹	H10a: Accept
H11₀	t-test	3.826 × 10⁻¹⁰	H11a: Accept
H12₀	t-test	4.66 × 10⁻¹⁰	H12a: Accept
H13₀	t-test	4.497 × 10⁻¹²	H13a: Accept
H14₀	t-test	2.158 × 10⁻¹⁰	H14a: Accept


© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
