What I look for now when I read a student essay

The first time I caught a student using ChatGPT, I missed it. The essay was on Of Mice and Men, the student was an eighth grader who usually struggled to organize a paragraph, and the writing was suddenly clean and confident in a way that should have made me suspicious. Instead I gave it a B+ and wrote “great improvement” in the margin. A colleague flagged it for me a week later. She had seen the same essay, almost word for word, in one of her own classes.

I bring this up because I think a lot of teacher-facing writing about AI right now skips the part where we admit how easy it has been to get fooled. Three years in, I still get fooled sometimes. The tools have gotten much better. The students using them have gotten much better. And the easy stuff we used to teach each other to look for, the “delve” and the “moreover” and the suspiciously balanced three-part lists, mostly does not work anymore because students run their drafts through a second tool before they submit.

That second tool is what I want to spend some of this piece on, because I do not think enough teachers know about it yet.

What are AI humanizers?

There is now a whole category of software called humanizers. You give one a ChatGPT draft, and it rewrites the prose to sound more like a person, with the goal of getting it past detection software like Turnitin or GPTZero. Some of these tools are free. Most of them are aimed openly at students. One of the more polished ones, EssayTone, was built by ex-Stanford researchers and markets itself as preserving the student’s ideas and citations while removing the AI fingerprints from the writing. Whether or not you think students should be using these is sort of beside the point. They are, and the better ones do work, at least well enough to slip past automated detection most of the time.

So if the automated detection is unreliable and the easy tells are gone, what is left? Honestly, less than I would like, and what is left is hard to teach.

The thing I find myself relying on most is voice mismatch across assignments. If a student writes journal entries in class that are casual and a little messy, and then their take-home essay arrives sounding measured and a few years older than they are, that is information. It does not prove anything by itself. Kids do level up, especially over a long arc. But if the voice on the page is not the voice I hear when the student talks in discussion, or the voice in their handwritten exit tickets, I read the essay more carefully. I have caught more by comparing assignments to each other than by anything I learned from a detector.

The other thing I notice, when I notice anything at all, is a strange flatness around opinion. Fourteen-year-olds have opinions. They overcommit to one side of an argument and barely tolerate the other side. When I get an essay where the pro and con paragraphs are about the same length and end on the same kind of summary sentence, something is off. It is not proof. It is a flag.

But I want to be careful here, because I have also accused students who were not cheating. Once, memorably, a quiet kid I had underestimated all year handed in a genuinely good piece of writing, and my first instinct was to suspect her. That stung for both of us. So whatever I write here about tells and patterns, I would put them well below the confidence threshold needed to actually accuse a student. They are reasons to ask follow-up questions, not reasons to fail anyone.

Detection is getting harder

Which brings me to the part of this that I have come around on, which is that detection is the wrong center of gravity for the whole problem.

The realistic question is not “did a computer write this.” The realistic question is “did this student do the thinking.” Those are different questions. A student can use ChatGPT for a first draft, do real intellectual work revising it, and end up understanding the material. Another student can write every word themselves while looking at a summary on Reddit and learn nothing. The thing that matters for whether they are getting an education is whether the thinking happened, and detection software cannot tell me that.

What does help, in my experience, is moving more of the writing into places where I can see it. Not all of it. I am not asking teachers to give up on take-home work. But the major assessments I care about, the ones I am actually using to figure out whether a kid can argue and analyze, those work better when at least one drafting session happens in the room with me. Even one class period of pen-and-paper drafting per major essay gives me a baseline of the student’s actual voice, which is useful both as a teaching tool and as a reference point if something later looks off. The American Federation of Teachers has been recommending versions of this for a while now, and they are right.

The other thing that has worked for me is being specific in prompts in ways that AI handles badly. “Write about justice in To Kill a Mockingbird” is a gift to ChatGPT. “Write about Atticus’s courtroom speech in chapter twenty and connect it to a time someone in your family taught you something about fairness” is much harder to outsource. Not impossible. A student can lie about their family. But the lying takes effort, and at that point you are at least asking them to invent, which is a different skill than copying. Local, personal, specific. That is where AI gets weakest.

And then there is the part I underweighted for too long, which is just talking to students openly about all of this. I used to think the strategic move was to be a little vague about how I detected AI writing, to keep students off balance. I have come around. The students who think their teacher does not understand AI are the most likely to use it dishonestly. The students who know their teacher has used the tools, knows what humanizers are, has read what their school’s policy actually says, are the ones who tend to keep it inside the lines I set. Spend a class on it. Show them what a ChatGPT essay looks like. Show them what a humanized one looks like. Tell them which uses you are fine with and which you are not. The clarity helps both of you.

The future is going to be filled with AI writing

I do not have a clean ending for this. The honest version is that I am still figuring it out, the students are still figuring it out, and the tools keep moving. What I have given up on is the idea that there is a detection answer coming that will solve this for us. What I have not given up on is the parts of writing instruction I already believed in before any of this started: drafting in stages, talking through arguments out loud, reading student work next to other student work from the same kid, and not pretending the technology does not exist.

Worksheetzone has a roundup of AI tools for teachers that is worth a look, and the planning and graphic organizer worksheets on the main site are useful for the in-class drafting work I described above. Both have been part of how I have adapted in my own classroom over the last two years.

