Berkeley Study: All AI Revision Prompts Drift Prose Toward Formality, Even "Preserve Voice"

Tom van Nuenen at Berkeley ran 300 personal narratives through three frontier models (Claude-class, ChatGPT-class, Gemini-class) under three prompt conditions: generic "improve this," generic "rewrite this," and explicitly "revise this while preserving the original voice." He measured 13 stylometric markers in input and output: function words, contractions, first-person pronouns, vocabulary diversity, sentence length variance, punctuation patterns, and emotion words.
The result: every model in every condition drifted in the same direction. The output had fewer contractions, fewer first-person pronouns, greater vocabulary spread, longer words, and more elaborate punctuation. The shift moved prose from embedded narration toward distanced narration. The "preserve voice" prompt only reduced the magnitude of the drift, not its direction.
In plain language: every AI revision prompt makes prose more polite, more formal, more eager to please—even when the prompt says don't.
Implications for Tooling
The paper argues that voice instructions live at a layer the model's post-training distribution overrides within a paragraph or two. Anyone iterating on prompts, sample paste-ins, custom instructions, or character bibles for voiced output (writing, dialogue, marketing copy, persuasive essays) has been working on a problem with a structural ceiling.
It also offers the cleanest empirical explanation for the Claude 4.7 prose regression: 4.7's central voice is more deeply encoded than 4.6's, which is why it reads stylometric structure better (as seen in the Piper experiment) and resists deviation harder (the memo-voice complaints).
Constraint-Based Architecture
The author's recommendation: if you want voice preservation across long-form work, the architecture must live outside the prompt. Compiled style profiles should be applied as binding constraints on every generation — not as prompt parameters that can be overridden. A breakdown of why each major writing tool (Sudowrite, NovelCrafter, Claude/ChatGPT direct) hits the same ceiling, and what a constraint-based architecture looks like in practice, is available at the linked blog post below.
Paper: https://arxiv.org/abs/2604.22142
📖 Read the full source: r/ClaudeAI
👀 See Also

OpenClaw Agent Auto-Edits HEARTBEAT.md, Adds 10 Self-Assigned Tasks
In a default HEARTBEAT.md execution, an OpenClaw agent added 10 self-assigned tasks including system review, memory maintenance, and weather checks — raising token burn concerns.

The Hidden Cost of AI-Generated Code: Debugging Spaghetti
A Reddit post captures the reality of shipping AI-generated code fast — then spending weeks debugging bloated functions, null state bugs, and cryptic variable names.

OpenAI to deploy AI models on U.S. Department of War classified network
OpenAI has reached a deal to deploy its AI models on the U.S. Department of War's classified network, with implementation scheduled for 2026. The Reuters article generated 15 points and 6 comments on Hacker News.

Decoupled DiLoCo: Resilient Distributed Training Across Data Centers with Low Bandwidth
Google DeepMind's Decoupled DiLoCo trains LLMs across distant data centers using 2-5 Gbps WAN, with self-healing islands of compute that isolate hardware failures without degrading ML performance.