CodEX: Source Code Plagiarism Detection Based on Abstract Syntax Trees
Author(s)
Date Issued
2018-12-07
Date Available
2024-05-03T15:32:09Z
Abstract
CodEX is a source code search engine that lets users search a repository of source code snippets using source code snippets as queries. A potential use for such a search engine is to help educators identify cases of plagiarism in students' programming assignments. This paper evaluates CodEX in this context. Abstract Syntax Trees (ASTs) are used to represent source code files at an abstract level. This representation, combined with node hashing and similarity calculations, allows users to search for source code snippets that match suspected plagiarism cases. A number of commonly employed techniques for evading plagiarism detection are identified, and the CodEX system is evaluated for its ability to detect plagiarism cases even when these techniques are employed. Evaluation results are promising, with 95% of test cases identified successfully.
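The abstract names AST representation, node hashing, and similarity calculation but does not give the details of how CodEX combines them. The following is a minimal sketch of the general idea only, not the CodEX implementation. It assumes Python source as input, uses Python's standard `ast` module, hashes each subtree by node type alone, and scores two snippets by Jaccard similarity over their subtree hashes. The function names `subtree_hashes` and `jaccard_similarity`, the choice of SHA-1, and the choice of Jaccard similarity are all illustrative assumptions.

```python
import ast
import hashlib


def subtree_hashes(source: str) -> set:
    """Return one structural hash per AST subtree of `source`.

    Assumption for illustration: a subtree's hash depends only on node
    types and child order, so identifier names and literal values are
    ignored.
    """
    hashes = set()

    def visit(node: ast.AST) -> str:
        # Hash the children first, then combine them with this node's type.
        child_hashes = [visit(child) for child in ast.iter_child_nodes(node)]
        payload = type(node).__name__ + "(" + ",".join(child_hashes) + ")"
        digest = hashlib.sha1(payload.encode("utf-8")).hexdigest()
        hashes.add(digest)
        return digest

    visit(ast.parse(source))
    return hashes


def jaccard_similarity(source_a: str, source_b: str) -> float:
    """Jaccard similarity of the two snippets' subtree-hash sets (0.0 to 1.0)."""
    hashes_a = subtree_hashes(source_a)
    hashes_b = subtree_hashes(source_b)
    if not hashes_a and not hashes_b:
        return 1.0
    return len(hashes_a & hashes_b) / len(hashes_a | hashes_b)


original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def add_all(values):\n    acc = 0\n    for v in values:\n        acc += v\n    return acc\n"
unrelated = "print('hello')\n"

score_renamed = jaccard_similarity(original, renamed)
score_unrelated = jaccard_similarity(original, unrelated)
```

Because this sketch hashes node types only, renaming every identifier (one simple evasion technique) leaves the structure, and therefore the score, unchanged, while a structurally different snippet scores lower. Whether CodEX treats identifiers this way is not stated in the abstract.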
Type of Material
Conference Publication
Publisher
CEUR Workshop Proceedings
Series
CEUR Workshop Proceedings
2259
Language
English
Status of Item
Peer reviewed
Journal
Brennan, R., Beel, J., Byrne, R., Debattista, J., Crotti Junior, A. (eds.). AICS 2018: Proceedings for the 26th AIAI Irish Conference on Artificial Intelligence and Cognitive Science, Trinity College Dublin, Dublin, Ireland, December 6-7th, 2018.
Conference Details
The 26th Irish Conference on Artificial Intelligence and Cognitive Science (AICS 2018), Trinity College Dublin, Ireland, 6-7 December 2018
ISSN
1613-0073
This item is made available under a Creative Commons License
File(s)
Name
Zheng2018.pdf
Size
642.51 KB
Format
Adobe PDF
Checksum (MD5)
39758dd713e83da02372ca5e56b8993b