9 June 2026 · 7 min read
How to check if text is AI-generated: what actually works

The question sounds straightforward but the answer isn't. Can you reliably tell whether a piece of text was written by AI? Not with the confidence you'd need for a formal accusation. But you can do better than guessing, and understanding what the available tools actually measure helps you use them appropriately rather than over-confidently.
Why this is harder than it looks
Large language models are trained on enormous amounts of human-written text. Their output is, by design, intended to resemble human writing. As models have improved, the resemblance has become closer - which means the distinguishing signals have become weaker. A tool that worked reasonably well against early model output may struggle significantly with more recent generations.
There's also an asymmetry in the error costs. Missing an AI-generated submission is bad; wrongly accusing a student who wrote their work honestly can be considerably worse. Any tool for checking AI-generated text needs to be evaluated not just for its detection rate but for its false positive rate - and in practice, these two figures trade off against each other in ways vendors don't always communicate clearly.
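To make the trade-off concrete, here is a minimal Python sketch. The scores and labels are invented for illustration and do not come from any real detector, and `rates_at_threshold` is a hypothetical helper name. It shows how moving the flagging threshold changes the detection rate and the false positive rate together.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (detection_rate, false_positive_rate) when flagging scores >= threshold.

    labels: 1 = text was AI-generated, 0 = text was human-written.
    """
    flagged = [s >= threshold for s in scores]
    ai_total = sum(labels)
    human_total = len(labels) - ai_total
    true_pos = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    false_pos = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    return true_pos / ai_total, false_pos / human_total


# Made-up detector scores: first five texts are AI-generated, last five are human.
scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.85, 0.55, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

for threshold in (0.5, 0.7, 0.9):
    detection, false_positive = rates_at_threshold(scores, labels, threshold)
    print(f"threshold={threshold}: detection={detection:.0%}, false positives={false_positive:.0%}")
```

On this toy data, a lenient threshold catches most AI texts but also flags honest writers, while a strict threshold flags no honest writers but misses more than half of the AI texts. Real detectors behave the same way in kind, though the actual numbers depend entirely on the tool and the population of writers.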
How text-based detection tools work
Most AI content detectors you'll encounter rely on a measure called perplexity: how predictable each word choice is to a language model. AI output tends toward higher predictability (that is, lower perplexity); human writing tends toward more variation. Some tools add a second measure called burstiness - the variation in predictability across different parts of the text. These metrics provide a statistical signal, but not a definitive one.
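The two measures can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular detector is implemented: the per-word probabilities are invented (a real tool would obtain them from a language model), and treating burstiness as the standard deviation of per-sentence perplexity is one simple choice among several.

```python
import math
import statistics


def perplexity(token_probs):
    """Perplexity of a text, given the probability a model assigned to each word.

    Lower values mean the text was more predictable to the model.
    """
    avg_log_prob = sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(-avg_log_prob)


def burstiness(sentences):
    """Assumed definition: population standard deviation of per-sentence perplexity."""
    return statistics.pstdev([perplexity(s) for s in sentences])


# Hypothetical per-word probabilities for two short texts, two sentences each.
uniform_text = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]   # evenly predictable throughout
varied_text = [[0.5, 0.5, 0.5], [0.1, 0.05, 0.2]]   # one predictable, one surprising

print(burstiness(uniform_text), burstiness(varied_text))
```

A text that is evenly predictable throughout scores zero burstiness here, while a text that mixes predictable and surprising sentences scores higher. Neither number says who wrote the text; it only says how the text looks to the scoring model.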
A separate approach involves AI watermarking, where the generating model embeds a detectable statistical signature in its output. This is more reliable when it works, but requires the generating model to support watermarking and is defeated by any post-processing that paraphrases the output. It's not yet widely deployed in the tools teachers encounter.
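As a rough illustration of what a "detectable statistical signature" can look like, the sketch below follows the general idea of one publicly described family of schemes, often called green-list watermarking. It is not the method of any specific model or product. The hashing rule, the 50% green share, and all function names are assumptions made for the example.

```python
import hashlib
import math


def is_green(prev_token, token, green_fraction=0.5):
    """Pseudo-randomly place `token` in the 'green' set, seeded by the previous token."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode("utf-8")).digest()
    # Map the first 4 bytes to a number in [0, 1) and compare with the green share.
    return int.from_bytes(digest[:4], "big") / 2**32 < green_fraction


def z_score(green_count, n, green_fraction=0.5):
    """How far the green count sits above chance, in standard deviations."""
    expected = green_fraction * n
    std_dev = math.sqrt(n * green_fraction * (1 - green_fraction))
    return (green_count - expected) / std_dev


def watermark_z_score(tokens, green_fraction=0.5):
    """Count green tokens over consecutive pairs and compare with chance."""
    if len(tokens) < 2:
        raise ValueError("need at least two tokens")
    pairs = list(zip(tokens, tokens[1:]))
    green_count = sum(is_green(prev, tok, green_fraction) for prev, tok in pairs)
    return z_score(green_count, len(pairs), green_fraction)


# Ordinary text should land near zero; a generator that deliberately
# favoured green tokens would push the score well above it.
print(watermark_z_score("the quick brown fox jumps over the lazy dog".split()))
```

The sketch also shows why paraphrasing is a problem for this kind of scheme: rewording changes the word pairs being hashed, so the share of green tokens drifts back toward chance and the score drops with it.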
What free tools are actually good for
Several free AI detection tools exist for individual use - GPTZero, various browser extensions, and free tiers of commercial products among them. These are useful for a quick first pass, but carry important caveats. Published accuracy claims typically reflect performance on controlled benchmark datasets, not on the real-world mixed-effort content teachers encounter. For non-native English writers, false positive rates are substantially higher than headline figures suggest.
For occasional, low-stakes checks, these tools are a reasonable starting point. For anything that could inform a formal judgement about a student, they are not sufficient evidence on their own. The output of any text-based detector should be treated as one input into a broader assessment, never as a verdict.
Process-based detection: the more reliable approach for education
For educational contexts specifically, the most reliable form of AI text detection doesn't read the text at all. It examines how the text was produced: the session timeline, paste events, typing cadence, and revision patterns that together constitute the writing process. This process fingerprint is language-neutral, harder to fake, and produces evidence that's considerably more defensible in formal proceedings.
Tools that capture this process data do so during the submission itself. The teacher sees a timeline - when typing happened, when paste events occurred and how large they were, when focus was lost - without ever seeing what the student actually typed. It's a different category of information from text analysis, and in educational contexts, a more useful one.
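A timeline of this kind can be summarised without touching the text itself. The sketch below assumes a hypothetical event format (`ProcessEvent` with a timestamp, an event kind, and a character count) and an arbitrary 200-character cut-off for a "large" paste; real tools will record and threshold these differently.

```python
from dataclasses import dataclass


@dataclass
class ProcessEvent:
    t_seconds: float   # time since the session started
    kind: str          # "type", "paste", or "blur" (window lost focus)
    chars: int = 0     # characters added; the content itself is never stored


def summarise(events, large_paste_chars=200):
    """Summarise a writing session from event metadata only."""
    typed = sum(e.chars for e in events if e.kind == "type")
    pasted = sum(e.chars for e in events if e.kind == "paste")
    total = typed + pasted
    return {
        "typed_chars": typed,
        "pasted_chars": pasted,
        "pasted_share": pasted / total if total else 0.0,
        "large_pastes": [
            (e.t_seconds, e.chars)
            for e in events
            if e.kind == "paste" and e.chars >= large_paste_chars
        ],
        "focus_losses": sum(1 for e in events if e.kind == "blur"),
    }


# Invented session: some typing, a focus loss, one big paste, a little more typing.
session = [
    ProcessEvent(30.0, "type", 200),
    ProcessEvent(95.0, "blur"),
    ProcessEvent(100.0, "paste", 900),
    ProcessEvent(160.0, "type", 100),
]
summary = summarise(session)
print(summary)
```

A summary like this is a prompt for a conversation rather than a finding: a large paste may be AI output, or it may be the student moving their own draft in from another document.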
Using the signals you have
Whatever approach you use, the appropriate way to use AI detection signals is as a starting point for enquiry, not a basis for automatic action. A high score from a text-based tool, or an anomalous process record, means: ask the student about their process. It does not mean: conclude that misconduct occurred.
This matters practically as well as ethically. The strength of any misconduct case depends on the quality of the evidence. A conversation in which the student cannot account for their own work, combined with documented process anomalies, is far stronger evidence than a detector score alone. Start with the signals, end with the conversation.