Artificial-intelligence-based text generators and education

Text-producing artificial intelligence (AI), ChatGPT in particular, has been a hot topic of discussion in the media as well as among teachers and educators over the last few months. At first glance, the text produced by ChatGPT is so good that many teachers’ gut reaction was apparently “wow!”, followed by “have my students done their assignment themselves, or is it ChatGPT’s output they’re submitting?” As a consequence, one of the big questions in these discussions has been “do teachers have any way of identifying text produced by AI?” In this blog, I use the acronym AI to refer to any AI-based text-generating bot in general, not only ChatGPT. So, can ChatGPT’s current text output be made sufficiently human with suitable inputs from the user?

In order to attempt to put together some kind of response to this question, let’s first think about how people write. The first things that come to my mind are spelling and grammatical errors, and typos. We write and rewrite as our thoughts crystallise into words, changing words and sentence structure as we go along. This process is probably responsible, at least in part, for what Edward Tian, a senior student at Princeton University and the developer of GPTZero, a tool that detects AI-written text, calls ‘perplexity’ and ‘burstiness’ in human writing. Perplexity refers to the complexity of the text: in the context of a language model, how predictable the text is to the model. Burstiness relates to the variability of the sentences with respect to their length as well as their complexity. Tian quantified perplexity and burstiness and used them to distinguish human-written text from ChatGPT-written text and to provide a probability in GPTZero’s result: highly fluctuating perplexity and burstiness scores indicate a human writer, whereas reasonably constant scores point to AI. Turnitin, the similarity-checking system currently used at Aalto University, is expected to gain the capability to indicate the probability that submitted text includes AI-generated text sometime in April.
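
GPTZero’s exact scoring isn’t public, but both quantities are straightforward to approximate. Below is a minimal Python sketch of the idea, assuming a toy unigram model as a stand-in for the large neural language model a real detector would use; the function names are my own, not GPTZero’s.

```python
import math
import re
import statistics

def burstiness(text: str) -> float:
    """Standard deviation of sentence lengths, in words.
    Humans tend to mix short and long sentences; near-uniform
    lengths are one hallmark of machine-generated text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths) if len(lengths) > 1 else 0.0

def unigram_perplexity(text: str, counts: dict) -> float:
    """Perplexity of `text` under a unigram model given by word
    `counts`: exp of the negative mean log-probability per word.
    The lower the perplexity, the more 'familiar' the text."""
    total = sum(counts.values())
    vocab = len(counts)
    log_probs = []
    for word in re.findall(r"[a-z']+", text.lower()):
        p = (counts.get(word, 0) + 1) / (total + vocab)  # add-one smoothing
        log_probs.append(math.log(p))
    return math.exp(-statistics.mean(log_probs))
```

A detector in this spirit would compute such scores sentence by sentence and look at how much they fluctuate across the document, rather than at any single value.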

Then, how does AI produce text? My layperson’s rough understanding is the following. The AI uses a generative model (or language model) to deduce what token follows a sequence of input tokens. To be able to deduce this, the AI goes through a training phase in which it works through a vast and diverse corpus of text, guided to find statistical connections between tokens, and it learns the probabilities with which a token or tokens follow a given token or tokens in different contexts. So, when a user provides the AI with input, say in the form of a question, it digs out, token by token, the most probable continuation of the query in that context (see the sketch below). The tokens, typically words or word fragments, come from a vocabulary built from the corpus used during training. The larger and more diverse the training corpus, the better the AI is at providing a good-sounding and even correct answer. An AI that continually ‘learns’ from the queries made to it keeps getting better at providing good responses.
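
To make this concrete, here is a deliberately tiny Python sketch of the generate-one-token-at-a-time loop, using bigram counts in place of the neural network that systems like ChatGPT actually use; the corpus and names are invented for illustration.

```python
import random
from collections import defaultdict

# Training: count, for every token, which tokens follow it and how often.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
counts = defaultdict(lambda: defaultdict(int))
for current, nxt in zip(corpus, corpus[1:]):
    counts[current][nxt] += 1

def next_token(token):
    """Sample the next token in proportion to the learned frequencies."""
    followers = counts[token]
    if not followers:
        return None  # nothing ever followed this token in training
    tokens, weights = zip(*followers.items())
    return random.choices(tokens, weights=weights)[0]

# Generation: start from a prompt token and extend one token at a time.
token, output = "the", ["the"]
for _ in range(10):
    token = next_token(token)
    if token is None:
        break
    output.append(token)
print(" ".join(output))  # e.g. "the cat sat on the mat and the dog sat on"
```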

Three of DALL-E’s renditions of the input “programming code, digital space, brain, printed circuit board”.

I attended a Turnitin-hosted webinar on 28 February where Robin Crockett from the University of Northampton spoke about AI-generated text in his talk titled “AI: Friend or foe?” There he aptly said that AI doesn’t write, it generates text. The text it generates is not insightful; it contains no critical evaluation of the topic. It simply pieces together text that someone else has written rather than creating original text. So, in this respect, AI-generated text is plagiarised from the zillions of texts in its database (the corpus used in its training). He gave some tips that we humans can use to ‘detect’ AI-generated text:

  • Look out for non-sequiturs: clauses, phrases, or sentences that have no logical connection to the previous statement.
  • Sentences or paragraphs may be repeated, just rewritten in a different form.
  • Expect to find ‘flowery’ and archaic words in unusual sentence constructs.
  • You may find grammatically correct sentences that make no sense or say nothing.
  • A cocktail of American and British English is common: the two spelling conventions appear within the same text in a way a single author rarely produces. Spotting this in a setting where neither the teacher nor the student is a native speaker of English will be challenging (a simple automated check is sketched after this list).
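
That last clue lends itself to automation. Here is a minimal sketch of such a check; the pair list is a tiny illustrative sample, not a real lexicon.

```python
import re

# Crockett's mixed-English clue, automated: flag a text containing
# both American and British spellings. Inflected forms
# (e.g. 'analysed') are deliberately ignored in this sketch.
SPELLING_PAIRS = {
    "color": "colour",
    "behavior": "behaviour",
    "analyze": "analyse",
    "organize": "organise",
    "center": "centre",
}

def mixed_english(text: str) -> bool:
    words = set(re.findall(r"[a-z]+", text.lower()))
    uses_us = any(us in words for us in SPELLING_PAIRS)
    uses_uk = any(uk in words for uk in SPELLING_PAIRS.values())
    return uses_us and uses_uk

print(mixed_english("The colour of the center pleased him."))  # True
```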

I put this stump of a list here because of the gut reaction I mentioned above: the general fear that students may pass off AI-generated text as their own. Will they? Unfortunately, some will. However, many, probably the majority, will use AI to create an initial rough draft that they then refine into their own text, or to fill in substance that they didn’t have initially or didn’t remember. This approach is, in fact, a good thing and should be embraced in a controlled manner, since it enhances and broadens students’ learning. In accordance with the practices of academic integrity, students should acknowledge their use of the AI in question, whether as a co-author or via a citation; how exactly AI should be cited is an ongoing discussion. Then there may be a small minority of students who choose not to use AI at all.

Students should be given guidelines on how to use AI when doing coursework. Naturally, they should also be told the obvious: that deceptively passing off AI-generated text as their own is plagiarism and thus unacceptable, and that by doing so they won’t be doing their own learning any favours. We already use AI when correcting our text with grammar-checking services that suggest corrections and more concise phrasings, and when seeking information via search engines. Text-generating technology is the next step in the evolution of text-correcting technology.

Crockett also gave a few general tips on how to create assignments where the current AI isn’t of much use:

  • Set up assignments related to very specific niche fields and topics. These have fewer texts available on the internet from which the AI can ‘learn’.
  • Ask for insights and critiques. For now, AI is terrible at this task.

We were told in the webinar I mentioned that Turnitin has been working on detecting AI-generated text for the last two years, developing an AI-based algorithm for this purpose. I presume that, at least in the near future, this is going to be an important feature in all commercial similarity-detection systems. However, in time, as the AI models and algorithms improve and the text-generating bots learn to judiciously introduce errors into their text in this cat-and-mouse race, the odds are probably in favour of the text-generating AIs.

Universities have already chosen not to forbid the use of AI in education but rather to provide guidelines within which its use in coursework is acceptable and even encouraged. Many educators recognise that students will need to know how to use AI to produce text in their professional lives. Aalto University hasn’t yet published such guidelines, but they are in the pipeline. Although I speak primarily about students’ use of AI, the issues addressed here apply to academics in general.

This discussion has only begun…

P.S. I assure you that, even though I was tempted to use ChatGPT to write this blog, I refrained from doing so. However, I did use a text-checking AI while writing, and it didn’t find all my errors. I also disagreed with some of its suggested corrections.

Sources I used when writing this blog: