Original Research
We Ran 861 Human-Written Sentences Through GPTZero. 13.8% Were Falsely Flagged as AI.
A controlled benchmark on four corpora of verified pre-LLM human writing: Wikipedia, PubMed, news, and Reddit ESL. GPTZero flagged 13.8% of the 861 sentences as AI-generated, even though every one was written by a human. News and journalism prose, the most-published category of human writing, drew the most false flags. Includes the full per-corpus breakdown, the top falsely flagged examples, and a downloadable raw CSV.
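For readers who want to recompute the headline figure from the raw CSV, here is a minimal sketch of a per-corpus false-positive tally. The column names `corpus` and `flagged_ai` are assumptions for illustration, not the actual schema of the published file, and the sample rows are made up rather than taken from the benchmark.

```python
import csv
import io
from collections import defaultdict


def false_positive_rates(rows):
    """Return {corpus: (flagged, total, rate)} for human-written sentences.

    Every row is assumed to be verified human writing, so any AI flag
    counts as a false positive.
    """
    counts = defaultdict(lambda: [0, 0])  # corpus -> [flagged, total]
    for row in rows:
        corpus = row["corpus"]            # hypothetical column name
        counts[corpus][1] += 1
        if row["flagged_ai"].strip().lower() == "true":  # hypothetical column name
            counts[corpus][0] += 1
    return {c: (f, n, f / n) for c, (f, n) in counts.items()}


# Tiny made-up sample, NOT the benchmark data.
SAMPLE_CSV = """corpus,flagged_ai
news,true
news,false
wikipedia,false
wikipedia,false
"""

rates = false_positive_rates(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for corpus, (flagged, total, rate) in sorted(rates.items()):
    print(f"{corpus}: {flagged}/{total} flagged ({rate:.1%})")
```

To run it on the real download, replace the `io.StringIO(SAMPLE_CSV)` source with an opened file handle and adjust the two column names to match the file's header row.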