Same Same But Different: Preventing Refactoring Attacks on Software Plagiarism Detection
By: Robin Maisch, Larissa Schmid, Timur Sağlam and more
Potential Business Impact:
Finds copied computer code even when changed.
Plagiarism detection in programming education faces growing challenges due to increasingly sophisticated obfuscation techniques, particularly automated refactoring-based attacks. While code plagiarism detection systems used in educational practice are resilient against basic obfuscation, they struggle against structural modifications that preserve program behavior, especially those introduced by refactoring-based obfuscation. This paper presents a novel and extensible framework that enhances state-of-the-art detectors by leveraging code property graphs and graph transformations to counteract refactoring-based obfuscation. Our comprehensive evaluation on real-world student submissions, obfuscated using both algorithmic and AI-based obfuscation attacks, demonstrates a significant improvement in detecting plagiarized code.
Similar Papers
Evaluating Software Plagiarism Detection in the Age of AI: Automated Obfuscation and Lessons for Academic Integrity
Software Engineering
Finds copied code even when hidden by AI.
On Plagiarism and Software Plagiarism
Software Engineering
Finds copied computer code to stop cheating.
The Failure of Plagiarism Detection in Competitive Programming
Computers and Society
Finds new ways to catch students cheating on code.