Keywords

plagiarism, sentence similarity, copyright, SimPaD

Abstract

Plagiarism is a serious problem: it infringes copyrighted documents and materials, is unethical, and reduces the economic incentive received by the authors (owners) of the original works. Unfortunately, plagiarism is getting worse as the growing number of online publications on the Web makes it easier to locate and paraphrase information. To address this problem, we propose a novel plagiarism-detection method, called SimPaD, which (i) establishes the degree of resemblance between any two documents D1 and D2 based on their sentence-to-sentence similarity, computed using pre-defined word-correlation factors, and (ii) generates a graphical view of the sentences that are similar (or the same) in D1 and D2. Experimental results verify that SimPaD is highly accurate in detecting (non-)plagiarized documents and outperforms existing plagiarism-detection approaches.
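
The abstract does not give SimPaD's formulas, so the following is only a minimal sketch of the general idea of scoring document resemblance from sentence-to-sentence similarity. Everything in it is an assumption made for illustration: the toy correlation table, the averaging rule in `sentence_similarity`, the 0.6 threshold, and all function names are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical word-correlation factors in [0, 1]. The factors SimPaD uses are
# pre-defined by the authors and are not listed in the abstract.
TOY_CORRELATION = {
    frozenset(("copy", "duplicate")): 0.8,
    frozenset(("paper", "document")): 0.7,
}


def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic words."""
    return re.findall(r"[a-z]+", sentence.lower())


def word_correlation(w1, w2):
    """Identical words score 1.0; other pairs use the toy table, else 0.0."""
    if w1 == w2:
        return 1.0
    return TOY_CORRELATION.get(frozenset((w1, w2)), 0.0)


def sentence_similarity(s1, s2):
    """Assumed rule: average, over words of s1, of the best correlation in s2."""
    words1, words2 = tokenize(s1), tokenize(s2)
    if not words1 or not words2:
        return 0.0
    best = [max(word_correlation(a, b) for b in words2) for a in words1]
    return sum(best) / len(words1)


def split_sentences(text):
    """Naive sentence splitter on ., ! and ?."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def resemblance(d1, d2, threshold=0.6):
    """Return (fraction of d1 sentences matched in d2, list of matched pairs).

    Each matched pair is (index in d1, index in d2, similarity score).
    """
    sents1, sents2 = split_sentences(d1), split_sentences(d2)
    if not sents1 or not sents2:
        return 0.0, []
    matches = []
    for i, s1 in enumerate(sents1):
        score, j = max((sentence_similarity(s1, s2), j)
                       for j, s2 in enumerate(sents2))
        if score >= threshold:
            matches.append((i, j, score))
    return len(matches) / len(sents1), matches


d1 = "He made a copy of the paper. Cats sleep all day."
d2 = "He made a duplicate of the document. The weather is nice."
ratio, pairs = resemblance(d1, d2)
print(ratio, pairs)
```

In this sketch the list of matched sentence pairs is the kind of data a side-by-side view of similar sentences could be drawn from; how SimPaD actually builds its graphical view is not described in the abstract.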

Original Publication Citation

Nathaniel Gustafson, Maria Soledad Pera, and Yiu-Kai Ng. "Nowhere to Hide: Finding Plagiarized Documents Based on Sentence Similarity." In Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence (WI'08), pp. 690-696, December 9-12, 2008, Sydney, Australia.

BYU ScholarsArchive Citation

Gustafson, Nathaniel; Ng, Yiu-Kai D.; and Pera, Maria Soledad, "Nowhere to Hide: Finding Plagiarized Documents Based on Sentence Similarity" (2008). Faculty Publications. 150.
https://scholarsarchive.byu.edu/facpub/150

Document Type

Peer-Reviewed Article

Publication Date

2008-12-09

Permanent URL

http://hdl.lib.byu.edu/1877/2632

Publisher

IEEE

Language

English

College

Physical and Mathematical Sciences

Department

Computer Science

Copyright Status

© 2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Copyright Use Information

http://lib.byu.edu/about/copyright/

Included in

Computer Sciences Commons
