8 June 2026 · 7 min read
AI text detectors: how they work, what they measure, and where they break

Most discussion of AI text detectors focuses on which ones to use and how accurate they are. Rather less is written about how they actually work – which makes it harder to understand why they fail in the ways they do, and harder to use their outputs intelligently. Here's a plain-English explanation of the technology, and why its limitations are structural rather than incidental.
Language models and probability
To understand AI text detection, you need to understand what language models do. At their core, large language models like GPT-4, Claude, and Gemini assign a probability to every possible next token – roughly, the next word or word fragment – given everything that came before. When generating text, they sample from that probability distribution. Settings such as temperature control how the sampling is done, and so affect how predictable or creative the output is. At default settings, most models produce text that's more statistically predictable than typical human writing: they tend toward the most expected choices.
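As a rough illustration of that sampling step, here is a minimal sketch in Python. The candidate words and their scores are invented for the example, and the helper names are hypothetical; real models work over vocabularies of tens of thousands of tokens and may apply further sampling rules.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(tokens, logits, temperature=1.0, rng=random):
    """Draw one token at random, weighted by its probability."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical scores for the word after "The cat sat on the" (made up for illustration).
tokens = ["mat", "sofa", "roof", "moon"]
logits = [4.0, 2.5, 1.5, 0.0]

cool = softmax_with_temperature(logits, temperature=0.5)  # more predictable
warm = softmax_with_temperature(logits, temperature=1.5)  # more varied

print("P('mat') at temperature 0.5:", round(cool[0], 3))
print("P('mat') at temperature 1.5:", round(warm[0], 3))
print("sampled:", sample_next_token(tokens, logits, temperature=0.5))
```

At the lower temperature, more of the probability mass sits on the single most expected word, which is the sense in which output becomes more predictable.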
What perplexity measures
Perplexity is the primary measure AI text detectors use. It captures how surprised a language model would be by a text's word choices: a highly expected word contributes little surprise, an unexpected one contributes a lot. Formally, it is the exponential of the average negative log-probability the model assigns to each token. Averaged across a text in this way, it gives a sense of whether the text was overall predictable (low perplexity, more AI-like) or unpredictable (high perplexity, more human-like).
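The calculation itself is short. The sketch below uses the standard definition of perplexity, the exponential of the average negative log-probability per token. The probability lists are invented for illustration; in a real detector they would come from running a scoring language model over the text.

```python
import math

def perplexity(token_probs):
    """Exponential of the average negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# Hypothetical per-token probabilities assigned by a scoring model (made up).
predictable = [0.9, 0.8, 0.85, 0.9]   # each word was highly expected
surprising = [0.2, 0.05, 0.4, 0.1]    # each word was unexpected

print("predictable text:", round(perplexity(predictable), 2))
print("surprising text:", round(perplexity(surprising), 2))
```

One way to read the number: a perplexity of 4 means the model was, on average, as uncertain as if it were choosing uniformly among four equally likely tokens.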
The intuition behind using perplexity for detection is straightforward: if AI models produce predictable text and human writers produce less predictable text, measuring predictability should distinguish them. This works as a weak signal. But it's weak for an important reason: human writers vary enormously in how predictable their writing is. A careful academic writer following established conventions produces more predictable text than a novelist experimenting with style. A student writing in a second language may choose more predictable vocabulary than a native speaker. The variation in human writing overlaps significantly with AI output.
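To see why overlap matters, consider a toy detector that flags any text whose perplexity falls below a cut-off. Everything here is an assumption for illustration: the scores are invented, not measurements, and the threshold and function names are hypothetical.

```python
def flag_as_ai(perplexity_score, threshold):
    """Flag a text when its perplexity falls below the threshold."""
    return perplexity_score < threshold

def false_positive_rate(human_scores, threshold):
    """Share of human-written texts that the threshold wrongly flags."""
    flagged = sum(flag_as_ai(s, threshold) for s in human_scores)
    return flagged / len(human_scores)

# Invented perplexity scores, for illustration only.
human_scores = [18.0, 25.0, 31.0, 42.0, 55.0, 70.0]
ai_scores = [12.0, 15.0, 19.0, 22.0, 27.0, 33.0]
threshold = 30.0

wrongly_flagged = false_positive_rate(human_scores, threshold)
missed = sum(not flag_as_ai(s, threshold) for s in ai_scores) / len(ai_scores)

print("human texts wrongly flagged:", round(wrongly_flagged, 2))
print("AI texts missed:", round(missed, 2))
```

Because the two sets of scores overlap, moving the threshold only trades one kind of error for the other; no cut-off separates them cleanly.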
What burstiness adds
Burstiness is a second measure some detectors add. It captures the variation in perplexity across a text: does the predictability level fluctuate, or is it relatively uniform? Human writing tends to have higher burstiness than AI output – human writers shift register, switch between technical and informal language, produce some passages that flow easily and others that feel laboured. AI output at default settings tends to be more consistently predictable throughout.
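Detectors do not share a single formula for burstiness. One simple way to sketch the idea, assumed here purely for illustration, is the standard deviation of perplexity measured sentence by sentence. The probabilities below are invented.

```python
import math
import statistics

def perplexity(token_probs):
    """Exponential of the average negative log-probability per token."""
    return math.exp(-sum(math.log(p) for p in token_probs) / len(token_probs))

def burstiness(sentences):
    """Population standard deviation of per-sentence perplexity (one possible definition)."""
    scores = [perplexity(s) for s in sentences]
    return statistics.pstdev(scores)

# Each inner list holds hypothetical token probabilities for one sentence (made up).
uniform_text = [[0.8, 0.8, 0.8], [0.8, 0.75, 0.8], [0.8, 0.8, 0.85]]
varied_text = [[0.9, 0.9, 0.9], [0.1, 0.2, 0.05], [0.6, 0.5, 0.7]]

print("uniform text:", round(burstiness(uniform_text), 3))
print("varied text:", round(burstiness(varied_text), 3))
```

A text whose sentences are all about equally predictable scores near zero, while one that swings between easy and laboured passages scores higher.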
Adding burstiness to perplexity gives detectors a somewhat more robust signal. The limitation is similar to perplexity's: some human writing is consistently formal and predictable (academic writing in a specialist field, for example), and some AI output can be deliberately varied. The measure improves average benchmark performance without removing the fundamental overlap between human and AI distributions.
Why ESL writers trigger false positives
This is the most practically important thing to understand about AI text detection. Writers using English as a second or additional language (ESL and EAL) produce text with lower average perplexity than native speakers, because of how second language acquisition works. Learning a language initially involves using a smaller vocabulary, simpler sentence structures, and more conventional phrasing – all producing more predictable text. Even at advanced proficiency levels, EAL writing continues to score more similarly to AI output than native writing does on perplexity measures.
This isn't a fixable mistake in detector design. It's a consequence of the fundamental signal being used. Text from a language model and text from a non-native writer look predictable for a similar reason: both reflect a learned, somewhat constrained use of what the language allows, so a measure based on predictability scores them similarly.
What detectors cannot see – and what process capture can
Text-based detectors work on finished output and have no visibility into how it was produced. Two documents with identical statistical properties could have been produced by entirely different means: one typed by a student over an hour, the other pasted from a language model in a minute. From a text analysis perspective, they're indistinguishable.
Process-based detection fills this gap by capturing the writing session itself: the timestamps of typing events, the paste events and their sizes, the session duration. This is categorically different information, and it isn't vulnerable to the structural problems of perplexity-based analysis. A student who pasted generated text in two minutes and a student who composed gradually over fifty minutes produce very different process records, regardless of how similar the resulting prose looks.
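A minimal sketch of what such a process record might be reduced to is shown below. The event structure, field names, and summary metrics are all hypothetical, chosen only to mirror the signals mentioned above (timestamps, paste sizes, session duration); they do not describe any particular product's format.

```python
from dataclasses import dataclass

@dataclass
class Event:
    timestamp: float  # seconds since the session started
    kind: str         # "key" or "paste"
    chars: int        # characters added by this event

def summarise_session(events):
    """Reduce a list of editing events to a few process-level signals."""
    if not events:
        return {"duration_s": 0.0, "typed_chars": 0, "pasted_chars": 0,
                "largest_paste": 0, "paste_share": 0.0}
    typed = sum(e.chars for e in events if e.kind == "key")
    pastes = [e.chars for e in events if e.kind == "paste"]
    pasted = sum(pastes)
    total = typed + pasted
    times = [e.timestamp for e in events]
    return {
        "duration_s": max(times) - min(times),
        "typed_chars": typed,
        "pasted_chars": pasted,
        "largest_paste": max(pastes, default=0),
        "paste_share": pasted / total if total else 0.0,
    }

# Two hypothetical sessions that could end with similar-looking prose.
composed_session = [Event(t * 2.0, "key", 1) for t in range(1500)]  # about 50 minutes of typing
pasted_session = [Event(5.0, "paste", 1500), Event(110.0, "key", 1), Event(120.0, "key", 1)]

print(summarise_session(composed_session))
print(summarise_session(pasted_session))
```

The two summaries differ sharply in duration and paste share even if the finished documents are the same length, which is the kind of difference a text-only detector has no access to.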