Authors: Zheng, Mengya; Pan, Xingyu; Lillis, David
Dates: 2024-05-03; 2024-05-03; 2018-12-07
ISSN: 1613-0073
URI: http://hdl.handle.net/10197/25828
Conference: The 26th Irish Conference on Artificial Intelligence and Cognitive Science (AICS 2018), Trinity College Dublin, Ireland, 6-7 December 2018
Abstract: CodEX is a source code search engine that allows users to search a repository of source code snippets, using source code snippets themselves as the query. A potential use for such a search engine is to help educators identify cases of plagiarism in students' programming assignments. This paper evaluates CodEX in this context. Abstract Syntax Trees (ASTs) are used to represent source code files on an abstract level. This, combined with node hashing and similarity calculations, allows users to search for source code snippets that match suspected plagiarism cases. A number of commonly employed techniques to avoid plagiarism detection are identified, and the CodEX system is evaluated for its ability to detect plagiarism cases even when these techniques are employed. Evaluation results are promising, with 95% of test cases being identified successfully.
Language: en
Keywords: Search engines; Source code; Abstract syntax trees; Plagiarism detection avoidance
Title: CodEX: Source Code Plagiarism Detection Based on Abstract Syntax Trees
Type: Conference Publication
Date: 2021-02-12
License: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
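
The combination of ASTs, node hashing and similarity calculation mentioned in the abstract can be illustrated with a minimal sketch. This is not the CodEX implementation, whose details are not given in this record. It assumes Python's standard `ast` module as the parser, a structural hash that ignores identifier names and literal values, and a Jaccard-style overlap of subtree hashes as the similarity measure. The function names `subtree_hashes` and `similarity` are hypothetical.

```python
import ast
import hashlib
from collections import Counter


def subtree_hashes(source: str) -> Counter:
    """Return a multiset of structural hashes, one per AST subtree.

    Assumption: only node types and tree shape are hashed, so renamed
    identifiers and changed literal values do not alter the hashes.
    """
    hashes = Counter()

    def visit(node: ast.AST) -> str:
        # Hash the children first, then combine them with the node type.
        child_hashes = [visit(child) for child in ast.iter_child_nodes(node)]
        payload = type(node).__name__ + "(" + ",".join(child_hashes) + ")"
        digest = hashlib.sha1(payload.encode("utf-8")).hexdigest()
        hashes[digest] += 1
        return digest

    visit(ast.parse(source))
    return hashes


def similarity(source_a: str, source_b: str) -> float:
    """Jaccard-style overlap of the two subtree-hash multisets (0.0 to 1.0)."""
    a, b = subtree_hashes(source_a), subtree_hashes(source_b)
    shared = sum((a & b).values())  # multiset intersection
    total = sum((a | b).values())   # multiset union
    return shared / total if total else 1.0


original = (
    "def total(xs):\n"
    "    s = 0\n"
    "    for x in xs:\n"
    "        s += x\n"
    "    return s\n"
)
# Same structure with every identifier renamed, a common disguise technique.
renamed = (
    "def add_all(values):\n"
    "    acc = 0\n"
    "    for v in values:\n"
    "        acc += v\n"
    "    return acc\n"
)
unrelated = "print('hello')\n"

score_renamed = similarity(original, renamed)
score_unrelated = similarity(original, unrelated)
```

Under these assumptions, renaming identifiers leaves the score at 1.0 because the tree shape is unchanged, while structurally different code scores lower. A real detector would also need to handle reordered statements, inserted dead code and similar disguises, which this sketch does not attempt.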