DOI: 10.1145/3314221.3314629
research-article
Public Access

SemCluster: clustering of imperative programming assignments based on quantitative semantic features

Published: 08 June 2019 Publication History

Abstract

A fundamental challenge in automated reasoning about programming assignments at scale is clustering student submissions based on their underlying algorithms. State-of-the-art clustering techniques are sensitive to control structure variations, cannot cluster buggy solutions with similar correct solutions, and require either expensive pairwise program analyses or training effort. We propose a novel technique that can cluster small imperative programs based on their algorithmic essence: (A) how the input space is partitioned into equivalence classes and (B) how the problem is uniquely addressed within individual equivalence classes. We capture these algorithmic aspects as two quantitative semantic program features that are merged into a program's vector representation. Programs are then clustered using their vector representations. The computation of our first semantic feature leverages model counting to identify the number of inputs belonging to an input equivalence class. The computation of our second semantic feature abstracts the program's data flow by tracking the number of occurrences of each unique pair of consecutive values of a variable during its lifetime. The comprehensive evaluation of our tool SemCluster on benchmarks drawn from solutions to small programming assignments shows that SemCluster (1) generates far fewer clusters than other clustering techniques, (2) precisely identifies distinct solution strategies, and (3) boosts the performance of clustering-based program repair, all within a reasonable amount of time.
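
The two features described in the abstract can be illustrated with a minimal sketch. It is not SemCluster's implementation: brute-force enumeration over a tiny bounded input domain stands in for model counting, the variable traces are written by hand rather than collected from executions, and every name (`DOMAIN`, `class_sizes`, `consecutive_value_pairs`, `to_vector`) is hypothetical. The order of vector components is likewise an assumption made for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical bounded input space: all pairs (a, b) with -2 <= a, b <= 2.
DOMAIN = list(product(range(-2, 3), repeat=2))


def class_sizes(path_conditions, domain=DOMAIN):
    """Brute-force stand-in for model counting: for each path condition,
    count how many inputs in the domain satisfy it."""
    return [sum(1 for args in domain if cond(*args)) for cond in path_conditions]


def consecutive_value_pairs(trace):
    """Count occurrences of each (previous value, next value) pair taken
    by one variable during its lifetime."""
    return Counter(zip(trace, trace[1:]))


def to_vector(sizes, pair_counts, pair_vocabulary):
    """Merge both features into one fixed-length vector (assumed layout:
    class sizes first, then one count per pair in the shared vocabulary)."""
    return list(sizes) + [pair_counts.get(pair, 0) for pair in pair_vocabulary]


# Two toy "submissions" for max(a, b), abstracted by hand.
# Submission 1 branches on a > b; submission 2 branches on a >= b.
sizes_1 = class_sizes([lambda a, b: a > b, lambda a, b: a <= b])
sizes_2 = class_sizes([lambda a, b: a >= b, lambda a, b: a < b])

# Values taken by the result variable on input (1, 2), recorded by hand.
pairs_1 = consecutive_value_pairs([0, 2])  # initialised to 0, then assigned 2
pairs_2 = consecutive_value_pairs([1, 2])  # initialised to a, then overwritten with b

# A shared vocabulary of pairs gives both vectors the same length.
vocabulary = sorted(set(pairs_1) | set(pairs_2))
vector_1 = to_vector(sizes_1, pairs_1, vocabulary)
vector_2 = to_vector(sizes_2, pairs_2, vocabulary)

print(vector_1)
print(vector_2)
```

Vectors built this way could then be passed to any off-the-shelf clustering routine; the abstract does not name the one SemCluster uses, so none is assumed here.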

Supplementary Material

WEBM File (p860-perry.webm)
MP4 File (3314221.3314629.mp4)
Video Presentation

      Published In

      PLDI 2019: Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation
      June 2019
      1162 pages
      ISBN:9781450367127
      DOI:10.1145/3314221
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Program analysis
      2. Program clustering
      3. Quantitative reasoning

      Qualifiers

      • Research-article

      Conference

      PLDI '19
      Acceptance Rates

      Overall Acceptance Rate 406 of 2,067 submissions, 20%
