Checking the Plagiarism Checkers
What those Similarity Reports might not be telling you…
Plagiarism is not a new phenomenon in education, but the widespread use of plagiarism checking tools by students, teachers, and institutions is fairly recent. It used to be up to teachers to discover plagiarism. Now, computer software and internet sites detect it – right?
Problems with Similarity Reports
A simple internet search for “plagiarism checker” returns thousands of listings of tools and resources promising, often for a fee, to help determine whether a text contains plagiarism. Students, both online and on campus, are often required to run their assignments through an automated plagiarism checker before submitting them for grading. Plagiarism detection services typically issue a report that includes an overall percentage indicating how much of a text is unique (original to the student) and/or how much is similar to (and possibly plagiarized from) the other texts against which it was checked. However, teachers who take those percentages at face value, without question or further inquiry, risk being misled.
On plagiarism checking reports, an originality percentage indicates how much of an assignment is unique and written by the student, and a similarity percentage shows how much of the text matches whatever it was compared to on the internet and/or in a database. Reports often highlight or underline the matched portions and include URLs or hyperlinks to the sources they match. The problem is that a plagiarism checker cannot check a text against everything that has ever been written or published, and it does not always detect matches even to texts that are on the internet or in its database. That unreliability became evident in a test I conducted using various plagiarism detection tools. Because my purpose is to demonstrate unreliability in general and not to review or rate specific resources, I am not including the names of the tools I used.
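To see why a source that is missing from a checker's database goes undetected, consider the minimal sketch below. It assumes a checker that scores the overlap of three-word sequences between a submission and its reference texts; that is an illustrative assumption, not a description of how any particular service works. The function names and the two sample sentences are hypothetical.

```python
def word_ngrams(text, n=3):
    """Split text into lowercase words and return every run of n consecutive words."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def similarity_percent(submission, corpus, n=3):
    """Percentage of the submission's word n-grams found in any corpus text."""
    known = set()
    for document in corpus:
        known.update(word_ngrams(document, n))
    grams = word_ngrams(submission, n)
    if not grams:
        return 0.0  # text too short to compare
    matched = sum(1 for gram in grams if gram in known)
    return 100.0 * matched / len(grams)


# Hypothetical sample texts, invented for this illustration.
paywalled_source = "the blockade slowly strangled southern trade during the war"
open_web_page = "the war began in april 1861 at fort sumter"
submission = paywalled_source  # copied word for word

# Source absent from the checker's reference texts: nothing to match against.
print(similarity_percent(submission, [open_web_page]))                    # 0.0
# Same submission once the source is included.
print(similarity_percent(submission, [open_web_page, paywalled_source]))  # 100.0
```

Under this assumption, a fully copied passage scores as 100% unique whenever its source is not among the texts the checker can see.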
For my test, I compiled a text of roughly 250 words on the American Civil War, of which only 26% (68 words) was my own writing and the remaining 74% (190 words) was copied from three websites and one scholarly article in JSTOR. I ran my essay through six plagiarism detection services and received the following reports.
- 100% unique, 0% similar
- 60% unique, 40% similar
- 74% unique, 26% similar
- 43% unique, 57% similar
- 43% unique, 57% similar
- 50% unique, 50% similar
As you can see, none of them reported the correct percentages (26% unique, 74% similar), and one of them indicated that none of my text matched anything. Even more surprising, none of them detected the sentence I copied from a scholarly article in JSTOR. That prompted me to run another test through the same plagiarism detection services, this time using 90 words taken only from the JSTOR article and nothing from websites. In that test, four reports said the text was 100% unique and 0% similar, one said it was 75% unique and 25% similar, and another said it was 67% unique and 33% similar. What all the reports should have said was that it was 0% unique and 100% similar. In other words, the services performed better at detecting matches to websites than to closed-access scholarly journals in an online repository.
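The gap between the six reports and the actual makeup of the first test text can be worked out directly from the word counts and percentages given above. The short sketch below does that arithmetic; the variable and function names are my own, chosen for illustration.

```python
# Word counts stated for the first test text.
ORIGINAL_WORDS = 68
COPIED_WORDS = 190

# (unique %, similar %) pairs reported by the six unnamed services.
REPORTS = [(100, 0), (60, 40), (74, 26), (43, 57), (43, 57), (50, 50)]


def expected_percentages(original_words, copied_words):
    """Return the (unique, similar) percentages implied by the word counts."""
    total = original_words + copied_words
    unique = round(100 * original_words / total)
    return unique, 100 - unique


def unique_errors(reports, expected_unique):
    """Return how many points each report's unique score exceeds the expected one."""
    return [reported_unique - expected_unique for reported_unique, _ in reports]


expected = expected_percentages(ORIGINAL_WORDS, COPIED_WORDS)
errors = unique_errors(REPORTS, expected[0])

print("expected (unique, similar):", expected)      # (26, 74)
print("originality overstated by (points):", errors)  # [74, 34, 48, 17, 17, 24]
```

Every difference is positive: all six services overstated the originality of the text, and even the closest ones were off by 17 percentage points.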
Solutions to the Problems of Similarity Reports
So what are teachers to do in light of the unreliability of plagiarism checking resources? If you teach at a school that requires the use of a plagiarism detection system, consider the following ways to reduce the temptation to plagiarize, lower the incidence of plagiarism, and detect it on your own without relying solely on the reports generated by whatever detection service is being used.
- Compose your own test texts and run them through whichever plagiarism software you are required to use to see how reliable it is. Also, do an internet search for free plagiarism checking and run your text through those services to test their reliability and see for yourself.
- Continue educating your students about what plagiarism is and why they shouldn’t do it.
- Closely examine the reports generated by plagiarism detection services. Don’t look only at the percentages, but look at what is highlighted and why.
- Change your assignments a little each time you teach a course so that the work of previous students will not be of use to future students.
- Contemplate stipulating which websites and/or articles students should use for each assignment. That way, you will already be familiar with the sources they will use and can detect on your own if something was quoted, paraphrased, summarized, and properly cited and referenced.
- Communicate with your students, in writing, in non-graded settings (email, discussion forums, etc.) so that you become familiar with their writing style and proficiency.
Although some sort of automated plagiarism detection is likely to remain a part of education and does save a teacher time during the grading process, it is not currently a foolproof way to detect plagiarism and cannot be relied upon to give accurate reports.