DOI: 10.1145/2851581.2892512

Improving Plagiarism Detection in Coding Assignments by Dynamic Removal of Common Ground

Published: 07 May 2016

Abstract

Plagiarism in online learning environments has a detrimental effect on the trustworthiness and viability of online courses. Automatic plagiarism detection systems do exist, yet the specific situation in online courses restricts their use. To allow for easy automated grading, online assignments are usually less open-ended and instead require students to fill in small gaps. Solutions therefore tend to be very similar without necessarily being plagiarized. In this paper we propose a new approach to detecting code re-use that increases prediction accuracy by dynamically removing the parts that appear in almost every submission--the so-called common ground. Our approach shows significantly better F-measure and Cohen's kappa results than state-of-the-art systems such as Moss or JPlag. The proposed method is also language-agnostic, to the point that training and test data sets can be taken from different programming languages.
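To make the approach concrete, here is a minimal sketch of the core idea as we read it: token n-grams that occur in nearly every submission are treated as common ground and stripped before pairwise similarity is scored. This is an illustrative reconstruction, not the authors' implementation; the tokenizer, the n-gram size, the 90% document-frequency threshold, and names such as common_ground and suspicious_pairs are assumptions made for the example.

import re
from collections import Counter
from itertools import combinations

COMMON_THRESHOLD = 0.9  # assumption: n-grams in >= 90% of submissions are common ground
NGRAM = 4               # assumption: token window size, a tunable parameter

def tokenize(source):
    # Crude, language-agnostic tokenizer: identifiers, numbers, single symbols.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)

def ngrams(tokens, n=NGRAM):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def common_ground(submissions):
    # Document frequency of each n-gram across all submissions; anything that
    # appears in almost every one (e.g. handed-out scaffolding) is common ground.
    counts = Counter()
    for src in submissions.values():
        counts.update(ngrams(tokenize(src)))
    cutoff = COMMON_THRESHOLD * len(submissions)
    return {g for g, c in counts.items() if c >= cutoff}

def similarity(a, b, common):
    # Jaccard similarity over the n-grams that survive common-ground removal.
    ga = ngrams(tokenize(a)) - common
    gb = ngrams(tokenize(b)) - common
    if not ga and not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

def suspicious_pairs(submissions, threshold=0.5):
    # Flag pairs whose residual similarity exceeds a (hypothetical) threshold.
    common = common_ground(submissions)
    for (name_a, src_a), (name_b, src_b) in combinations(submissions.items(), 2):
        score = similarity(src_a, src_b, common)
        if score >= threshold:
            yield name_a, name_b, score

Removing high-document-frequency n-grams first means that boilerplate handed out with the assignment no longer inflates pairwise scores, which is exactly the failure mode the abstract describes for fill-in-the-gap exercises.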

References

[1]
Ahtiainen, A., Surakka, S., and Rahikainen, M. Plaggie: GNU-licensed source code plagiarism detection engine for Java exercises. In Proceedings of the 6th Baltic Sea Conference on Computing Education Research: Koli Calling (2006), 141-142.
[2]
Cooper, S., and Sahami, M. Reflections on Stanford's MOOCs. Commun. ACM 56, 2 (Feb. 2013), 28-30.
[3]
Corrigan-Gibbs, H., Gupta, N., Northcutt, C., Cutrell, E., and Thies, W. Measuring and Maximizing the Effectiveness of Honor Codes in Online Courses. In Proceedings of the Second ACM Conference on Learning @ Scale (2015), 223-228.
[4]
Faidhi, J. A., and Robinson, S. K. An empirical approach for detecting program similarity and plagiarism within a university programming environment. Computers & Education 11, 1 (1987), 11-19.
[5]
Flores, E., Rosso, P., Moreno, L., and Villatoro-Tello, E. PAN@FIRE: Overview of SOCO Track on the Detection of SOurce COde Re-use. In Sixth Forum for Information Retrieval Evaluation (2014).
[6]
Krause, M., Mogalle, M., Pohl, H., and Williams, J. J. A playful game changer: Fostering student retention in online education with social gamification. In Proceedings of the Second ACM Conference on Learning @ Scale (2015).
[7]
Mann, S., and Frew, Z. Similarity and Originality in Code: Plagiarism and Normal Variation in Student Assignments. In Proceedings of the 8th Australasian Conference on Computing Education - Volume 52 (2006), 143-150.
[8]
Moussiades, L., and Vakali, A. PDetect: A Clustering Approach for Detecting Plagiarism in Source Code Datasets. The Computer Journal 48, 6 (2005), 651-661.
[9]
Poon, J. Y., Sugiyama, K., Tan, Y. F., and Kan, M.-Y. Instructor-Centric Source Code Plagiarism Detection and Plagiarism Corpus. In Proceedings of the 17th ACM Annual Conference on Innovation and Technology in Computer Science Education (2012), 122-127.
[10]
Prechelt, L., Malpohl, G., and Philippsen, M. Finding Plagiarisms among a Set of Programs with JPlag. Journal of Universal Computer Science 8, 11 (2002), 1016-1038.
[11]
Rosales, F., Garcia, A., Rodriguez, S., Pedraza, J. L., Mendez, R., and Nieto, M. M. Detection of Plagiarism in Programming Assignments. IEEE Transactions on Education 51, 2 (May 2008), 174-183.
[12]
Schleimer, S., Wilkerson, D. S., and Aiken, A. Winnowing: Local Algorithms for Document Fingerprinting. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (2003), 76-85.


Reviews

Review by Stewart Mark Godwin:

As student-teacher ratios increase and institutions move toward larger class sizes, educators face growing assessment workloads. Online education is one area where extremely large numbers of students enroll in a single course, which multiplies the grading burden on the lecturer. When grading such work, plagiarism is a significant concern for educators, because legitimate submissions can already be very similar: assessment tasks that require students to write code for simple algorithms will always result in similar solutions. This paper proposes a method that improves plagiarism detection for computer science courses by dynamically removing the parts of submissions that are common to all of them. The authors outline their method and support it with an analysis of data from the training corpus provided by the SOCO 2014 challenge. Although the mathematics of the results can be daunting for the layperson, the conclusions are easily understood: the authors show that their method is significantly better than comparable options currently available to educators. This is an interesting paper and should be read by all computer science educators concerned with detecting plagiarism in their courses.

-- Online Computing Reviews Service
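For readers unfamiliar with the two metrics the review alludes to, the textbook definitions (general definitions, not notation taken from the paper itself) are: the F-measure is the harmonic mean of precision P and recall R, F1 = 2PR / (P + R); Cohen's kappa corrects the observed agreement p_o between the detector's labels and the ground truth for the agreement p_e expected by chance, kappa = (p_o - p_e) / (1 - p_e), so that 0 means chance-level agreement and 1 means perfect agreement.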


Published In

CHI EA '16: Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems
May 2016
3954 pages
ISBN: 9781450340823
DOI: 10.1145/2851581

Publisher

Association for Computing Machinery
New York, NY, United States


Author Tags

1. computer science education
2. massive open online course
3. plagiarism

Qualifiers

• Abstract

Conference

CHI '16: CHI Conference on Human Factors in Computing Systems
May 7 - 12, 2016
San Jose, California, USA

Acceptance Rates

CHI EA '16 Paper Acceptance Rate: 1,000 of 5,000 submissions, 20%
Overall Acceptance Rate: 6,164 of 23,696 submissions, 26%
