A Robust Machine Learning Technique to Predict Low-performing Students

Published: 16 January 2019

Abstract

As enrollments and class sizes in postsecondary institutions have increased, instructors have sought automated and lightweight means to identify students who are at risk of performing poorly in a course. This identification must be performed early enough in the term to allow instructors to assist those students before they fall irreparably behind. This study describes a modeling methodology that predicts student final exam scores as early as the third week of the term, using the clicker data that is automatically collected for instructors when they employ the Peer Instruction pedagogy. The modeling technique uses a support vector machine binary classifier, trained on one term of a course, to predict outcomes in the subsequent term. We applied this modeling technique to five different courses across the computer science curriculum, taught by three different instructors at two different institutions. Our modeling approach offers a set of strengths not seen wholesale in prior work, while maintaining accuracy competitive with that work. These strengths include using a lightweight source of student data, affording early detection of struggling students, and predicting outcomes across terms in a natural setting (different final exams, minor changes to course content), across multiple courses in a curriculum, and across multiple institutions.
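
The pipeline the abstract describes can be illustrated with a minimal sketch: a support vector machine binary classifier trained on one term's early-term clicker features and applied to the subsequent term. The feature construction, the bottom-quartile cutoff for "low-performing," and the scikit-learn toolchain below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' implementation): train an SVM binary
# classifier on one term's week-1..3 clicker features and predict which
# students in the next term are at risk of a low final exam score.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_term(n_students, n_features=30):
    # Hypothetical per-student features: participation/correctness rates
    # on clicker questions from the first three weeks of the term.
    X = rng.uniform(0.0, 1.0, size=(n_students, n_features))
    # Synthetic final exam score loosely tied to clicker behavior.
    final_exam = 40 + 55 * X.mean(axis=1) + rng.normal(0, 5, n_students)
    return X, final_exam

X_prev, exam_prev = make_term(200)   # training term
X_next, exam_next = make_term(180)   # subsequent term

# Binary label: "low-performing" = bottom quartile on the final exam
# (the quartile cutoff is an illustrative assumption).
y_prev = (exam_prev < np.percentile(exam_prev, 25)).astype(int)
y_next = (exam_next < np.percentile(exam_next, 25)).astype(int)

# SVM binary classifier trained on one term, applied to the next term.
model = make_pipeline(StandardScaler(),
                      SVC(kernel="rbf", class_weight="balanced"))
model.fit(X_prev, y_prev)
at_risk = model.predict(X_next)

print(f"Flagged {at_risk.sum()} of {len(at_risk)} students as at risk; "
      f"agreement with actual bottom quartile: {(at_risk == y_next).mean():.2f}")
```

In practice, the feature matrix would be built from students' Peer Instruction clicker responses during the first three weeks (for example, per-question participation and correctness), and the classifier's week-3 predictions would be evaluated against the subsequent term's actual final exam outcomes.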



    Published In

    ACM Transactions on Computing Education, Volume 19, Issue 3 (September 2019), 333 pages
    EISSN: 1946-6226
    DOI: 10.1145/3308443

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 16 January 2019
    Accepted: 01 August 2018
    Revised: 01 August 2018
    Received: 01 May 2018
    Published in TOCE Volume 19, Issue 3


    Author Tags

    1. Peer instruction
    2. at-risk students
    3. clicker data
    4. cross-term
    5. machine learning
    6. multi-institution
    7. prediction

    Qualifiers

    • Research-article
    • Research
    • Refereed

