This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Open Access Article
Analyzing Key Features of Open Source Software Survivability with Random Forest
by Sohee Park † and Gihwon Kwon *,†
Gihwon Kwon received a Bachelor’s degree from Kyonggi University in 1985 and Master’s and Ph.D. degrees from Chung-Ang University in 1987 and 1991, respectively. He was a Visiting Professor at Carnegie Mellon University from 2006 to 2007. He was the President of the Software Engineering Society, Korean Institute of Information Scientists and Engineers, from 2014 to 2016. He is currently a Professor at the Department of Computer Engineering, Kyonggi University, where he has worked since 1991. He is also the Director of the National Center of Excellence in Software and the Dean of the College of Software and Business Administration at Kyonggi University. He is interested in the field of software engineering and software safety.
Department of SW Safety and Cyber Security, Kyonggi University, Suwon-si 16227, Gyeonggi-do, Republic of Korea
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Appl. Sci. 2025, 15(2), 946; https://doi.org/10.3390/app15020946 (registering DOI)
Submission received: 27 November 2024 / Revised: 28 December 2024 / Accepted: 6 January 2025 / Published: 18 January 2025
Abstract
Open source software (OSS) projects rely on voluntary contributions, but their long-term survivability depends on sustained community engagement and effective problem-solving. Survivability, critical for maintaining project quality and trustworthiness, is closely linked to issue activity, as unresolved issues reflect a decline in maintenance capacity and problem-solving ability. Thus, analyzing issue retention rates provides valuable insights into a project’s health. This study evaluates OSS survivability by identifying the features that influence issue activity and analyzing their relationships with survivability. Kaplan–Meier survival analysis is employed to quantify issue activity and visualize trends in unresolved issue rates, providing a measure of project maintenance dynamics. A random forest model is used to examine the relationships between project features (such as popularity metrics, community engagement, code complexity, and project age) and issue retention rates. The results show that stars significantly reduce issue retention rates, with rates dropping from 0.62 to 0.52 as stars increase to 4000, while larger codebases, higher cyclomatic complexity, and older project age are associated with higher unresolved issue rates, which rise by up to 15%. Forks also have a nonlinear impact, initially stabilizing retention rates but increasing unresolved issues as contributions become unmanageable. By identifying these critical factors and quantifying their impacts, this research offers actionable insights for OSS project managers to enhance project survivability and address key maintenance challenges, ensuring sustainable long-term success.
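To illustrate how a Kaplan–Meier estimate can quantify issue retention as described in the abstract, the following is a minimal Python sketch that computes the estimated fraction of issues still unresolved over time. The function name `kaplan_meier`, the input format (an observation time per issue plus a closed flag), and the toy data are assumptions made for illustration; they are not taken from the study and do not reproduce its pipeline.

```python
from collections import Counter


def kaplan_meier(durations, closed):
    """Return [(t, S(t))], where S(t) estimates the share of issues still open after time t.

    durations: time each issue was observed (e.g., days since it was opened)
    closed:    True if the issue was closed at that time,
               False if it was still open when observation ended (censored)
    """
    if len(durations) != len(closed):
        raise ValueError("durations and closed must have the same length")

    # Number of closures (events) at each distinct time.
    closures = Counter(t for t, c in zip(durations, closed) if c)
    # Number of issues leaving the risk set (closed or censored) at each time.
    leaving = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = closures.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the share of at-risk issues not closed at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data: six issues, two of them still open (censored).
durations = [2, 3, 3, 5, 8, 8]
closed = [True, True, False, True, False, True]

curve = kaplan_meier(durations, closed)
for t, s in curve:
    print(f"day {t}: estimated retention {s:.3f}")
```

Under these assumptions, the retention value at a fixed time horizon could serve as a per-project target that a random forest regressor relates to features such as stars, forks, code size, and project age; that step is omitted here because the source does not specify the model configuration.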