DOI: 10.1145/2851581.2892512

Improving Plagiarism Detection in Coding Assignments by Dynamic Removal of Common Ground

Published: 07 May 2016

Abstract

Plagiarism in online learning environments has a detrimental effect on the trustworthiness and viability of online courses. Automatic plagiarism detection systems do exist, yet the specific situation in online courses restricts their use. To allow for easy automated grading, online assignments are usually less open-ended and instead require students to fill in small gaps. Solutions therefore tend to be very similar without necessarily being plagiarized. In this paper we propose a new approach to detecting code re-use that increases prediction accuracy by dynamically removing the parts that appear in almost every submission--the so-called common ground. Our approach shows significantly better F-measure and Cohen's kappa results than state-of-the-art systems such as Moss or JPlag. The proposed method is also language-agnostic, to the point that training and test data sets can be taken from different programming languages.
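To make the approach concrete, here is a minimal sketch of the core idea as we read it: token n-grams that occur in nearly every submission are treated as common ground and stripped before pairwise similarity is scored. This is an illustrative reconstruction, not the authors' implementation; the tokenizer, the n-gram size, the 90% document-frequency threshold, and names such as common_ground and suspicious_pairs are assumptions made for the example.

import re
from collections import Counter
from itertools import combinations

COMMON_THRESHOLD = 0.9  # assumption: n-grams in >= 90% of submissions are common ground
NGRAM = 4               # assumption: token window size, a tunable parameter

def tokenize(source):
    # Crude, language-agnostic tokenizer: identifiers, numbers, single symbols.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)

def ngrams(tokens, n=NGRAM):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def common_ground(submissions):
    # Document frequency of each n-gram across all submissions; anything that
    # appears in almost every one (e.g. handed-out scaffolding) is common ground.
    counts = Counter()
    for src in submissions.values():
        counts.update(ngrams(tokenize(src)))
    cutoff = COMMON_THRESHOLD * len(submissions)
    return {g for g, c in counts.items() if c >= cutoff}

def similarity(a, b, common):
    # Jaccard similarity over the n-grams that survive common-ground removal.
    ga = ngrams(tokenize(a)) - common
    gb = ngrams(tokenize(b)) - common
    if not ga and not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

def suspicious_pairs(submissions, threshold=0.5):
    # Flag pairs whose residual similarity exceeds a (hypothetical) threshold.
    common = common_ground(submissions)
    for (name_a, src_a), (name_b, src_b) in combinations(submissions.items(), 2):
        score = similarity(src_a, src_b, common)
        if score >= threshold:
            yield name_a, name_b, score

Removing high-document-frequency n-grams first means that boilerplate handed out with the assignment no longer inflates pairwise scores, which is exactly the failure mode the abstract describes for fill-in-the-gap exercises.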

References

[1]
Ahtiainen, A., Surakka, S., and Rahikainen, M. Plaggie: GNU-licensed source code plagiarism detection engine for Java exercises. In Proceedings of the 6th Baltic Sea Conference on Computing Education Research: Koli Calling (2006), 141-142.
[2]
Cooper, S., and Sahami, M. Reflections on Stanford's MOOCs. Commun. ACM 56, 2 (Feb. 2013), 28-30.
[3]
Corrigan-Gibbs, H., Gupta, N., Northcutt, C., Cutrell, E., and Thies, W. Measuring and Maximizing the Effectiveness of Honor Codes in Online Courses. In Proceedings of the Second ACM Conference on Learning @ Scale (2015), 223-228.
[4]
Faidhi, J. A., and Robinson, S. K. An empirical approach for detecting program similarity and plagiarism within a university programming environment. Computers & Education 11, 1 (1987), 11-19.
[5]
Flores, E., Rosso, P., Moreno, L., and Villatoro-Tello, E. PAN@FIRE: Overview of SOCO Track on the Detection of SOurce COde Re-use. In Sixth Forum for Information Retrieval Evaluation (2014).
[6]
Krause, M., Mogalle, M., Pohl, H., and Williams, J. J. A playful game changer: Fostering student retention in online education with social gamification. In Proceedings of the Second ACM Conference on Learning @ Scale (2015).
[7]
Mann, S., and Frew, Z. Similarity and Originality in Code: Plagiarism and Normal Variation in Student Assignments. In Proceedings of the 8th Australasian Conference on Computing Education - Volume 52 (2006), 143-150.
[8]
Moussiades, L., and Vakali, A. PDetect: A Clustering Approach for Detecting Plagiarism in Source Code Datasets. The Computer Journal 48, 6 (2005), 651-661.
[9]
Poon, J. Y., Sugiyama, K., Tan, Y. F., and Kan, M.-Y. Instructor-Centric Source Code Plagiarism Detection and Plagiarism Corpus. In Proceedings of the 17th ACM Annual Conference on Innovation and Technology in Computer Science Education (2012), 122-127.
[10]
Prechelt, L., Malpohl, G., and Philippsen, M. Finding Plagiarisms among a Set of Programs with JPlag. Journal of Universal Computer Science 8, 11 (2002), 1016-1038.
[11]
Rosales, F., Garcia, A., Rodriguez, S., Pedraza, J. L., Mendez, R., and Nieto, M. M. Detection of Plagiarism in Programming Assignments. IEEE Transactions on Education 51, 2 (May 2008), 174-183.
[12]
Schleimer, S., Wilkerson, D. S., and Aiken, A. Winnowing: Local Algorithms for Document Fingerprinting. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (2003), 76-85.


Reviews

Review by Stewart Mark Godwin:

As student-teacher ratios increase and institutions move toward larger class sizes, educators face growing assessment workloads. Online education is one area where extremely large numbers of students enroll in a single course, which multiplies the grading burden on the lecturer. When grading such work, plagiarism is a significant concern for educators, because legitimate submissions can already be very similar: assessment tasks that require students to write code for simple algorithms will always result in similar solutions. This paper proposes a method that improves plagiarism detection for computer science courses by dynamically removing the parts of submissions that are common to all of them. The authors outline their method and support it with an analysis of data from the training corpus provided by the SOCO 2014 challenge. Although the mathematics of the results can be daunting for the layperson, the conclusions are easily understood: the authors show that their method is significantly better than comparable options currently available to educators. This is an interesting paper and should be read by all computer science educators concerned with detecting plagiarism in their courses.

-- Online Computing Reviews Service
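For readers unfamiliar with the two metrics the review alludes to, the textbook definitions (general definitions, not notation taken from the paper itself) are: the F-measure is the harmonic mean of precision P and recall R, F1 = 2PR / (P + R); Cohen's kappa corrects the observed agreement p_o between the detector's labels and the ground truth for the agreement p_e expected by chance, kappa = (p_o - p_e) / (1 - p_e), so that 0 means chance-level agreement and 1 means perfect agreement.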


Published In

CHI EA '16: Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems
May 2016
3954 pages
ISBN: 9781450340823
DOI: 10.1145/2851581

Publisher

Association for Computing Machinery
New York, NY, United States


Author Tags

1. computer science education
2. massive open online course
3. plagiarism

Qualifiers

• Abstract

Conference

CHI '16: CHI Conference on Human Factors in Computing Systems
May 7 - 12, 2016
San Jose, California, USA

Acceptance Rates

CHI EA '16 Paper Acceptance Rate: 1,000 of 5,000 submissions, 20%
Overall Acceptance Rate: 6,164 of 23,696 submissions, 26%
