LLM Quasi-Determinism: How AI Slop Reveals Itself

In a recent Substack post, lcamtuf (the security researcher known for AFL and other tools) tackles a recurring debate: whether you can distinguish human-written text from LLM output. His argument is grounded in a concrete observation about how current models behave in practice.

The Core Claim: Quasi-Determinism

LLMs are state-of-the-art statistical models of human language. In theory, their output should be indistinguishable from human text under any statistical test. But lcamtuf argues that the real distinguishing feature is quasi-determinism: give a hundred 'authors' a similar prompt — say, 'generate a reference book for children' — and the model will produce functionally identical output about 80% of the time.

He illustrates this with a collage of ~220 Amazon book covers from a search for '100000 whys'. The image shows clusters of nearly identical covers:

The top two rows all feature a roaring T-Rex on the left
Recurring motifs: red-and-white cartoon rocket, golden retriever, lion
Author names include an improbable number of 'Brights': Ethan, Nolan, Pamela, Daniel, Thomas, Andrew W., Mayan, Mary, Levi — all Bright

Why This Matters for Developers

For teams shipping AI-generated content or building on LLM APIs, the implication is that you can't rely on randomness to mask AI origins. The statistical signature isn't about individual word choices — it's about the model returning the same high-level response structure to similar prompts. If your workflow involves generating many variations from similar prompts, the output will cluster, making it easy to spot.
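One rough way to check whether a batch of generated outputs clusters is to measure how many pairs share most of their word sequences. The sketch below is not from lcamtuf's post: the function names, the 3-word shingle size, and the 0.5 threshold are all illustrative assumptions. It also only catches surface overlap, whereas the post's point is about repeated high-level structure, so treat it as a crude proxy.

```python
from itertools import combinations


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (word tuples) in lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts become a single shingle; empty texts yield no shingles.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    union = a | b
    if not union:
        return 0.0  # two empty texts: report no similarity rather than divide by zero
    return len(a & b) / len(union)


def near_duplicate_rate(texts: list, threshold: float = 0.5, k: int = 3) -> float:
    """Fraction of text pairs whose shingle overlap meets the threshold.

    The threshold is an arbitrary assumption; tune it on your own data.
    """
    sets = [shingles(t, k) for t in texts]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 0.0
    hits = sum(1 for a, b in pairs if jaccard(a, b) >= threshold)
    return hits / len(pairs)


if __name__ == "__main__":
    # Hypothetical outputs, made up for illustration only.
    samples = [
        "why is the sky blue a fun guide for curious kids",
        "why is the sky blue a fun guide for curious children",
        "notes on repairing a vintage lathe in a cold garage",
    ]
    print(f"near-duplicate pair rate: {near_duplicate_rate(samples):.2f}")
```

A high rate across outputs from similar prompts suggests the batch is converging on the same response. A low rate does not prove the opposite, since two texts can share a structure while using different words.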

lcamtuf notes: 'This is a fuzzy signal, so you shouldn't fire your intern when they say "it's not this — it's that". But in more casual settings, it's OK to trust your gut.'

Practical Takeaway

If you're using an LLM to automate blogging, be aware that your content may end up looking exactly like everyone else's. The post's P.S. is blunt: 'yes, the tech is amazing, but chances are, your publication could be renamed to "100,000 Whys".'

The post also links to more examples beyond this single title and notes that the original 'One Hundred Thousand Whys' is a 1929 Soviet children's book popular in China, which likely seeded the prompt term.

📖 Read the full source: HN LLM Tools
