Artificial intelligence is steadily making its way into our daily routines, and academia is no exception. With tools such as ChatGPT, you can now generate all sorts of text, including personal statements, in a matter of seconds. This has caused considerable concern at educational institutions, and professors are now constantly on the lookout for students who use AI. With the rise of AI detection software, such as GPTZero, OpenAI's classifier, and Turnitin, professors seem to have a reliable method for detecting AI use. However, this software comes with its own set of problems. Read on to learn more.
AI detection is far from accurate
Professors interested in upholding academic integrity are drawn to the advantages of AI detection tools. These tools perform a detailed, sentence-by-sentence analysis of a text and assign a score based on how much of it appears to be AI-written. They seem beneficial because their implementation can deter students from using AI when completing coursework and applying to universities. However, the accuracy of these tools is far from ideal.
The most important problem with AI detection tools is their high false positive rates: they sometimes flag human-written text as AI-generated, even when no AI was involved at all. Some AI detection companies, such as Turnitin, claim their false positive rate is only 4%. Although this percentage sounds reassuring, consider what it means in practice. If a university checks 3,000 human-written texts, around 120 of them will be labelled as AI-generated even though they are not. That is not a small number at all.
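The arithmetic behind that claim can be made explicit. The sketch below is a simplified illustration (the function name and the assumption that every checked text is human-written are ours, not Turnitin's): multiplying the number of screened texts by the claimed false positive rate gives the expected number of wrongly flagged students.

```python
def expected_false_positives(num_texts: int, false_positive_rate: float) -> float:
    """Expected count of human-written texts wrongly flagged as AI-generated.

    Assumes every checked text is genuinely human-written, so the false
    positive rate applies to all of them (an illustrative simplification).
    """
    return num_texts * false_positive_rate

print(expected_false_positives(3000, 0.04))  # 120.0 texts wrongly flagged
```

Even a rate that sounds small therefore scales into a large number of accusations once an institution screens thousands of submissions.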
Inaccurate AI detection can have dire consequences
When an AI detection tool produces a false positive, professors are likely to wrongly accuse a student of cheating. In recent months, many students have been accused of using AI even though they wrote their papers themselves, causing them great stress and anxiety (Fowler, 2023). Non-native English speakers are at particular risk, as their texts are more likely to trigger false positive detections of AI use (Sample, 2023). These students, like many others who did not actually use AI, face a serious problem, since proving that a text was not written by AI can be challenging.
A key message
This blog post is not suggesting that you should use AI and stop worrying about detection simply because AI detection tools are inaccurate. Even though these tools are unreliable, professors still have valid ways of determining whether a personal statement was written by AI. The key message, instead, is that you should not worry too much if you run your personal statement through AI detection software and it comes back as "AI-generated". A false positive can and, sooner or later, will occur. In the next post, we will explore how to prepare yourself and your personal statement in advance so that you do not fall victim to false positives and their consequences.
Fowler, G. A. (2023, April 3). We tested a new ChatGPT-detector for teachers. It flagged an innocent student. The Washington Post. Retrieved September 11, 2023, from https://www.washingtonpost.com/technology/2023/04/01/chatgpt-cheating-detection-turnitin
Sample, I. (2023, July 10). Programs to detect AI discriminate against non-native English speakers, shows study. The Guardian. Retrieved September 11, 2023, from https://www.theguardian.com/technology/2023/jul/10/programs-to-detect-ai-discriminate-against-non-native-english-speakers-shows-study