When I sat down to mark undergraduate student essays in the spring of 2023, the hype around ChatGPT was already at giddy heights. Like teachers everywhere, I was worried that students would succumb to the temptation to outsource their thinking to the machine. Many universities, including mine, responded by adopting AI detection software, and I soon had my fears confirmed when it provided the following judgment on one of the essays: 100% AI-generated.
Essays are marked anonymously, so my heart sank when I discovered that the first essay flagged as 100% AI-generated belonged to a brilliant, incisive thinker whose work in the pre-ChatGPT era was consistently excellent, if somewhat formulaic in style.
I found myself in an increasingly common predicament, caught between humans and their software: students with ChatGPT on one side, lecturers with AI detectors on the other. Policy demands that I refer essays with high AI detection scores for academic misconduct, something that can lead to steep penalties, including expulsion. But my standout student contested the referral, claiming that the university-approved support software they used for spelling and grammar included limited generative AI capabilities, which the detector had mistaken for ChatGPT.
The software that scanned my student’s essay is provided by Turnitin, an American education technology giant that is one of the biggest players in the academic misconduct market. Before ChatGPT, Turnitin’s primary function was to produce similarity reports by checking essays against a database of websites and previously submitted student work. A high similarity score does not always mean plagiarism – some students just quote abundantly – but it does make it easier to find copy-and-paste jobs.
Generative AI makes copying and pasting seem old-fashioned. Prompted with an essay question, ChatGPT produces word combinations that won’t show up in a similarity report. Facing a threat to its business model, Turnitin has responded with an AI detection tool that measures whether an essay strings words together in predictable patterns – as ChatGPT does – or in the more idiosyncratic style of a human. But the tool is not definitive: while the label announces that an essay is X% AI-generated, a link in fine print below the percentage opens a disclaimer admitting that it only might be.
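To make the idea of “predictable patterns” concrete, here is a minimal sketch of how a predictability-based detector can work in principle – not Turnitin’s actual, proprietary method. It scores an essay by how confidently an open language model (GPT-2 here) predicts each successive word; low perplexity means more predictable, more “machine-like” text. The model choice and the threshold below are illustrative assumptions only.

```python
# A minimal sketch of predictability-based detection, NOT Turnitin's method:
# score how "expected" each word is under a language model. Lower perplexity
# means more predictable text, which this style of detector treats as machine-like.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing the tokens as labels gives the average negative log-likelihood.
        loss = model(ids, labels=ids).loss
    return float(torch.exp(loss))

essay = "The Industrial Revolution fundamentally transformed European society."
ppl = perplexity(essay)
# The threshold of 30 is an arbitrary illustration; real tools never disclose
# how a raw score becomes a headline percentage.
print(f"perplexity: {ppl:.1f}",
      "-> flagged as machine-like" if ppl < 30 else "-> reads as human")
```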
Unlike the similarity report, which includes links to sources so that lecturers can verify whether a student plagiarised or used too many quotations, the AI detection software is a black box. ChatGPT has more than 180 million monthly users, and it produces different – if formulaic – text for all of them. There is no reliable way to reproduce the same text for the same prompt, let alone to know how students might prompt it. Students and lecturers are caught in an AI guessing game. It’s not hard to find students sharing tips online about evading AI detection with paraphrasing tools and AI humanisers. It’s also not hard to find desperate students asking how to beat false accusations based on unreliable AI detection.
When my student contested the AI detector’s judgment, I granted the appeal. I admit to trusting the human over the machine. But the defence was also convincing, and this particular student had been consistently writing in this style long before ChatGPT came into being. Still, I was making a high-stakes call without reliable evidence. It was a distressing experience for my student, and one that is being repeated across the sector.
Many academics have translated the hype around AI into a heightened suspicion of students. And it’s true that ChatGPT can plausibly write mediocre university-level essays. A combination of ChatGPT and AI humanisers might even carry someone through university with a 2:2.
But if universities treat this as an arms race, it will inevitably harm students who rely on additional support to survive a system that is overwhelmingly biased towards students who are white, middle-class, native English speakers without disabilities, and whose parents went to university. Students who don’t fall into those categories are also more likely to turn for support to spelling and grammar checkers such as Grammarly, which also uses generative AI to offer stylistic suggestions, putting them at risk of falling foul of AI detectors even when their substantive ideas are original. Innocent students will inevitably find themselves in a kind of Kafkaesque computational trap – accused by one piece of automated software of improperly relying on another.
What is to be done? In the desperate – and largely futile – scramble to catch up with AI, there is a real danger that academics lose sight of why we assign essays in the first place: to give students the opportunity to display their ability to evaluate information, think critically and present original arguments. This may even be an opportunity to move away from the sort of conventional essay questions that can so easily be fed into ChatGPT. Students can present original, critical work in presentations, podcasts, videos and reflective writing.
It’s also possible to ask students questions that draw on information that doesn’t exist in ChatGPT’s training data – for instance, by incorporating content generated in class. Lecturers could also address ambient AI anxiety head-on by prompting ChatGPT with assigned essay questions and asking students to critique the resulting output in class. The goal need not be just to fend off AI-generated essays: expanding the range of assessments can also help universities close the achievement gap that exists in part because traditional forms of assessment tend to favour more privileged students.
Of course, all of this puts more pressure on casualised, overworked staff, which is why the knee-jerk response of reverting to closed-book, handwritten exams is understandable, if misguided. We can be critical of AI, but we can’t pretend it doesn’t exist if we want to prepare students for a world in which humans will have to live and work alongside thinking machines.
Achieving that goal would be easier if students arrived at university as open-minded critical thinkers instead of stressed-out, debt-burdened consumers. In this sense, the panic around AI is only the latest symptom of a broader crisis at UK universities, and it is a crisis that is not felt equally. The Conservative government’s move to force universities to cap admissions to so-called low-value degrees – cutting off those institutions’ primary funding source – is also an attack on working-class and minority-ethnic students. Responding to AI with punitive measures based on unreliable detection software risks contributing to that same attack. If there is any chance of avoiding the bitter irony of the biggest technological breakthrough since the internet entrenching existing inequalities in education, it will come from lecturers working with AI rather than joining a losing fight against it.