LLMs can identify anonymous forum users with 68% recall at 90% precision

How the de-anonymization works
A research team gathered thousands of posts from pseudonymous forums like Hacker News and Reddit, then asked language models to identify the authors. For ground truth, they used Hacker News profiles that link to a LinkedIn account, anonymized those profiles, and gave them to the models.
The AI was given prompts like: "Which candidate is the same person as the query? Consider overlapping traits like location, profession, hobbies, demographics, and values. A match should share multiple distinctive traits, not just one or two common ones."
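A minimal sketch of how such a matching prompt might be assembled. Only the instruction text is quoted from the article; the function name `build_match_prompt`, the layout of the prompt, and the closing answer-format line are assumptions made for illustration, not the researchers' actual pipeline.

```python
from typing import Sequence

# Instruction text as quoted in the article; everything else is illustrative.
MATCH_INSTRUCTION = (
    "Which candidate is the same person as the query? "
    "Consider overlapping traits like location, profession, hobbies, "
    "demographics, and values. A match should share multiple distinctive "
    "traits, not just one or two common ones."
)


def build_match_prompt(query_profile: str, candidates: Sequence[str]) -> str:
    """Assemble one prompt: instruction, query profile, numbered candidates."""
    lines = [
        MATCH_INSTRUCTION,
        "",
        "Query profile:",
        query_profile,
        "",
        "Candidates:",
    ]
    for i, candidate in enumerate(candidates, start=1):
        lines.append(f"{i}. {candidate}")
    # Hypothetical answer format so a reply is easy to parse and may abstain.
    lines.append("")
    lines.append("Answer with the candidate number, or 'none' if no candidate matches.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_match_prompt(
        "pediatric nurse, owns a Prius, uses British spelling",
        ["software engineer in Berlin", "nurse in British Columbia"],
    ))
```

Letting the model answer "none" matters for the precision figure reported below: a system that may abstain can trade fewer answers for fewer wrong ones.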
Key findings from the study
- Models correctly identified 68% of anonymous users while keeping precision at 90%, meaning nine out of ten identifications they made were correct
- This compares to "near 0% for the best non-LLM method"
- Gemini and ChatGPT completed the task in minutes versus hours for humans
- The research shows "practical obscurity protecting pseudonymous users online no longer holds"
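To make the headline numbers concrete, here is a small sketch of how a "share of users identified at a fixed precision" figure can be computed. It assumes the model attaches a confidence to each guess and only the most confident guesses are kept; that reading, the function name `recall_at_precision`, and the toy data are assumptions for illustration and are not taken from the study.

```python
from typing import List, Tuple


def recall_at_precision(
    predictions: List[Tuple[float, bool]],
    total_users: int,
    min_precision: float,
) -> float:
    """Best share of users identified while precision stays >= min_precision.

    predictions: (confidence, is_correct) for each user the model guessed on.
    total_users: number of users for whom a true match existed.
    """
    # Walk down the guesses from most to least confident.
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    best_recall = 0.0
    correct = 0
    for answered, (_, is_correct) in enumerate(ranked, start=1):
        correct += is_correct
        precision = correct / answered
        if precision >= min_precision:
            best_recall = max(best_recall, correct / total_users)
    return best_recall


# Toy data, invented for illustration: ten users, one guess each.
toy = [
    (0.99, True), (0.95, True), (0.90, True), (0.85, True), (0.80, True),
    (0.70, True), (0.60, True), (0.50, False), (0.40, True), (0.30, False),
]

if __name__ == "__main__":
    print(recall_at_precision(toy, total_users=10, min_precision=0.90))
    print(recall_at_precision(toy, total_users=10, min_precision=0.85))
```

Lowering the required precision lets the system keep more of its guesses, so the share of users identified rises while a larger fraction of the identifications are wrong.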
What AI can extract from anonymous posts
The models don't just look for explicitly stated personal details. Researchers provided examples of what can be inferred from years of comments:
- Location (Nelson, British Columbia, Canada)
- Profession (pediatric nurse)
- Demographics (woman, married, two daughters)
- Possessions (owns a Prius)
- Hobbies (plays Stardew Valley, fan of Critical Role)
- Preferences and health (supports nuclear energy, has celiac disease, dislikes cilantro)
- Behavioral patterns (visits Berlin subreddit, uses British spelling, accidentally wrote a "¿" in English text)
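The traits above differ a great deal in how identifying they are. The sketch below is a hand-rolled illustration of why a few rare shared traits can outweigh several common ones; it is not how the models in the study work. The function `distinctiveness_score` and every frequency in `FREQ` are invented for this example.

```python
import math
from typing import Dict, Set


def distinctiveness_score(
    query: Set[str],
    candidate: Set[str],
    trait_frequency: Dict[str, float],
) -> float:
    """Sum of -log(frequency) over shared traits: rarer shared traits weigh more."""
    shared = query & candidate
    # Traits with no known frequency are treated as universal (weight 0).
    return sum(-math.log(trait_frequency.get(t, 1.0)) for t in shared)


# Hypothetical population frequencies, invented for illustration only.
FREQ = {
    "uses British spelling": 0.2,
    "owns a Prius": 0.02,
    "pediatric nurse": 0.001,
    "lives in Nelson, BC": 0.0001,
}

query = set(FREQ)
common_only = {"uses British spelling", "owns a Prius"}
distinctive = {"pediatric nurse", "lives in Nelson, BC"}

if __name__ == "__main__":
    print(distinctiveness_score(query, common_only, FREQ))
    print(distinctiveness_score(query, distinctive, FREQ))
```

Under these made-up frequencies, the candidate sharing the two rare traits scores well above the one sharing the two common traits, which reflects the idea that a match should rest on distinctive traits and not just on common ones.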
Implications for online privacy
According to researcher Daniel Paleka from ETH Zurich: "People sometimes express their opinions through pseudonymous accounts, assuming that those opinions will remain private. The existence of a mechanism to investigate or monitor with large language models that allows us to simply ask about a person's beliefs, political opinions, insecurities, or anything else that can be extracted from their anonymous Reddit account, for example, could disempower many people today."
Paleka notes that models can reconstruct a timeline of a person's life if enough information is available online, and warns that future models will be even more effective: "Keep in mind that everything you post stays on the internet and can become the target of future models."