Dash it all

Erin Kernohan-Berning

11/11/2025 · 4 min read

a close up of a metal grate on a table

"It is difficult indeed—it is distressing.—One does not know what to think." – Jane Austen, Pride and Prejudice, Chapter XVIII

In previous articles I’ve talked about spotting AI generated images and videos, but I haven’t talked much about spotting AI generated text. This is because, by and large, there isn’t any sure-fire way of doing this well or reliably. However, that doesn’t stop people from trying to find some way to easily discern human writing from machine writing – whether it’s correct or not.

Take the em dash. At some point this year – I’m not sure when – people began arguing that the em dash was a no-fail way of figuring out if something was written by ChatGPT or the like. But what is an em dash?

The em dash is a dash that is roughly as wide as the typeface is high. The em as a unit in typography goes back to physical typesetting, where metal spacers called quads were used to create spaces between words. An em quad was as wide as the height of the typeface, and an en quad was half that width. The en is where the slightly shorter but similarly used en dash gets its name (and it’s the one that I mostly use).

Em dashes are a versatile, much loved, and oft used – sometimes excessively used – piece of punctuation in writing. They can be used like a semicolon to create a pause, or like parentheses to create an aside, or like an ellipsis to make a sentence end prematurely – but the em dash also has a somewhat different effect than any of these. It’s punchier and less subtle. Where my eyes might drift by a semicolon, they take notice of an em dash.

Jane Austen was particularly fond of the em dash. If you go to Project Gutenberg, open up a copy of Pride and Prejudice, and search for ‘—’ you’ll find more than 600 of them throughout the three volumes of that work. Since Jane Austen’s novels were published between 1811 and 1817 (with the exception of some posthumous publication), we can be sure that her use of em dashes had nothing to do with ChatGPT.
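If you would rather count than scroll, the search above can be sketched in a few lines of Python. This is only a rough illustration: the function names are my own, and the option to also count double hyphens is an assumption, since some plain-text editions may stand in `--` for the em dash. Check the file you actually download before trusting the number.

```python
from pathlib import Path

EM_DASH = "\u2014"  # the em dash character, written as an escape


def count_em_dashes(text: str, include_double_hyphen: bool = False) -> int:
    """Count em dashes in a string.

    include_double_hyphen is an assumption for plain-text editions
    that may write the em dash as two hyphens.
    """
    count = text.count(EM_DASH)
    if include_double_hyphen:
        count += text.count("--")
    return count


def count_in_file(path: str) -> int:
    """Count em dashes in a UTF-8 text file saved locally (hypothetical path)."""
    return count_em_dashes(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # The Austen quote that opens this article contains two em dashes.
    sample = (
        "It is difficult indeed\u2014it is distressing."
        "\u2014One does not know what to think."
    )
    print(count_em_dashes(sample))  # prints 2
```

To try it on the novel itself, save the plain-text version from Project Gutenberg and pass its location to `count_in_file`. The total will include any em dashes in the site's header and licence text, so treat it as a ballpark figure.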

But are em dashes really over-represented in AI generated text? At this point, aside from some forum posts and internet opining, the evidence seems to be just anecdotal. The idea that em dashes are LLM-favoured punctuation, or the six-fingered hand of synthetic text, could be a product of confirmation bias – people are suddenly aware of em dashes because they saw a handful in some ChatGPT output and are now seeing them everywhere.

Nevertheless, in the absence of any reliable way of spotting AI generated text, calling out the use of em dashes has turned into internet shorthand for “I don’t believe this is real” – and writers who have been using em dashes all this time bear the brunt of those false accusations.

Unfortunately, there’s no easy “gotcha!” when it comes to synthetic text. AI detection software has been shown to be inconsistent and biased against people who are neurodivergent and those for whom English isn’t their first language. Common words, punctuation marks, and bulleted lists (another emerging “tell”) aren’t particularly useful either – they are merely features of language that we use, and that LLMs have been trained to parrot back to us.

To spot AI generated text, GetCyberSafe.ca advises looking for a dry tone and emotionless formal writing (though a lot of institutional writing has those features), and repetition of words (again, also a human trait). Fact checking can also help – AI models, even ones designed to search the internet, are still heavily influenced by their training data. Older training data means newer facts may be absent from their results. Other clues that something is AI generated include an AI prompt or preamble mistakenly left in the body of the text, misattributed or made-up references, and passages with odd word order.

When humans see written language, we generally have the understanding that there’s another human behind it. Now that we can’t be sure of that, we want there to be a way to know if there is really a human there. Generative AI has altered a fundamental assumption of online interactions – that there is someone—anyone—behind the words we see on the screen. And while that’s certainly a valid reason to take a closer look at what we are reading online, don’t let the search for machines blind you to the humans that are still out there – even if they are using em dashes.

Learn more

AI Detectors Don't Work. Here's What to Do Instead. (MIT) Last accessed 2025/11/11

AI-Detectors Biased Against Non-Native English Writers 2023. (Stanford University) Last accessed 2025/11/11

Evaluating the efficacy of AI content detection tools in differentiating between human and AI-generated text 2023. Elkhatat, A.M., Elsaid, K. & Almeer, S. (Int J Educ Integr 19, 17) Last accessed 2025/11/11

Why Don't AI Detectors Work? (Illinois State University) Last accessed 2025/11/11

I Tested 30+ AI Detectors. These 10 are Best to Identify Generated Text. 2025. Anangsha Alammyan (Medium) Last accessed 2025/11/11

AI vs academia: Experimental study on AI text detectors' accuracy in behavioral health academic writing 2025. Popkov AA, Barrett TS (Account Res) Last accessed 2025/11/11

The Problems with AI Detectors: False Positives and False Negatives 2025. (University of San Diego) Last accessed 2025/11/11

What is the "AI em dash?" 2025. Rachel S. Hunt (Substack) Last accessed 2025/11/11

Cannot get responses to not include dashes and em dashes 2024. (OpenAI Developer Community) Last accessed 2025/11/11

Why do AI models use so many em-dashes? 2025. Sean Goedecke (seangoedecke.com) Last accessed 2025/11/11

Pride and Prejudice 1813. Jane Austen (Project Gutenberg) Last accessed 2025/11/11

Dash (Wikipedia) Last accessed 2025/11/11

Recognize artificial intelligence (AI): 9 ways to spot AI content online 2024. (GetCyberSafe) Last accessed 2025/11/11
