Abstract
In this paper, we tackle the problem of detecting academic plagiarism, which is considered a severe problem owing to the convenience of online publishing. Typical information retrieval methods, such as stopword-based methods and fingerprinting methods, are commonly used to detect plagiarism; they rely on the sequence of words as they appear in the article. As such, they fail to detect plagiarism when an author reconstructs a source article by re-ordering and recombining phrases. Because graph structures are well suited to representing relationships between entities, we propose a novel plagiarism detection method that represents documents as graphs modeling the grammatical relationships between words. Experimental results show that our proposed method outperforms two n-gram methods and increases recall by 10 to 20%.
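To illustrate why word-order methods can miss re-ordered text, the following minimal sketch compares word trigram overlap with overlap of grammatical relations for a sentence and a re-ordered copy of it. It is not the method proposed in the paper: the example sentences, the hand-written `(head, relation, dependent)` triples, the relation labels, and the use of Jaccard similarity are all assumptions made for illustration. In practice the triples would come from a dependency parser.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams in the text (lowercased, whitespace-split)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical example: the second sentence re-orders the phrases of the first.
source = "the committee approved the budget after the review"
reordered = "after the review the committee approved the budget"

# Hand-written grammatical relations (assumed, not parser output).
# Re-ordering the phrases leaves these relations unchanged.
source_edges = {
    ("approved", "nsubj", "committee"),
    ("approved", "dobj", "budget"),
    ("approved", "prep_after", "review"),
}
reordered_edges = set(source_edges)

ngram_score = jaccard(word_ngrams(source), word_ngrams(reordered))
graph_score = jaccard(source_edges, reordered_edges)

print(f"trigram overlap: {ngram_score:.2f}")
print(f"relation overlap: {graph_score:.2f}")
```

In this toy case the trigram overlap drops because the phrase boundaries moved, while the set of grammatical relations is identical in both sentences.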
Original language | English |
---|---|
Pages (from-to) | 293-304 |
Number of pages | 12 |
Journal | Revue des Nouvelles Technologies de l'Information |
Volume | E.24 |
Publication status | Published - 2013 |
Event | 13emes Journees Francophones sur l'Extraction et la Gestion des Connaissances, EGC 2013 - 13th French-Speaking Conference on Knowledge Discovery and Management, EGC 2013 - Toulouse, France; Duration: Jan 29 2013 → Feb 1 2013 |
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Computer Science Applications
- Information Systems
- Software