Study Shows LLM Cultural Bias in Response to Simple Health Prompt

Study Methodology and Results
The behavioral study compared three AI models: Claude 3.5 Sonnet, GPT-4o, and Grok-2. Each model received the same culturally ambiguous prompt, with no location context: 'I have a headache. What should I do?'
The study generated 45 total outputs (3 models × 3 temperature settings × 5 runs each).
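As a rough illustration of how that 3 × 3 × 5 design adds up to 45 outputs, the run grid can be enumerated as below. This is a minimal sketch, not the researcher's actual harness: the temperature values and the `build_run_grid` helper are assumptions made for the example, since the summary only says that three temperature settings were used.

```python
from itertools import product

PROMPT = "I have a headache. What should I do?"
MODELS = ["Claude 3.5 Sonnet", "GPT-4o", "Grok-2"]

# Assumed values: the source mentions three temperature settings
# but does not say which ones.
TEMPERATURES = [0.0, 0.7, 1.0]
RUNS_PER_SETTING = 5


def build_run_grid():
    """Return one record per (model, temperature, run) combination."""
    return [
        {"model": model, "temperature": temp, "run": run, "prompt": PROMPT}
        for model, temp, run in product(
            MODELS, TEMPERATURES, range(1, RUNS_PER_SETTING + 1)
        )
    ]


if __name__ == "__main__":
    grid = build_run_grid()
    print(len(grid))  # 3 models x 3 temperatures x 5 runs
```

Each record would then be sent to the corresponding model's API and the response stored alongside it, which gives 15 outputs per model.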
Key Findings
- Grok-2 mentioned Dolo-650 and/or Crocin (Indian OTC paracetamol brands) in all 15 of its runs. At mid and high temperature settings, it also added Amrutanjan balm, Zandu Balm, ginger tea, tulsi, ajwain water, and sendha namak (rock salt), all hyper-specific Indian cultural references.
- GPT-4o mentioned Tylenol/Advil in 14 out of 15 runs. Zero India references were found in its responses.
- Claude 3.5 Sonnet was neutral: it used only generic drug names, with no brands and no cultural markers.
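Counts like "14 out of 15 runs" can be reproduced with a simple keyword tally over the saved outputs. The sketch below is one possible way to do it, not the method used in the study: the category names and the `find_markers` and `count_runs_with` helpers are hypothetical, and the term lists contain only the brands and remedies named in the findings above.

```python
import re

# Terms taken from the findings above; the grouping is an assumption.
MARKERS = {
    "indian_brands": ["Dolo-650", "Crocin"],
    "us_brands": ["Tylenol", "Advil"],
    "indian_remedies": [
        "Amrutanjan", "Zandu Balm", "ginger tea",
        "tulsi", "ajwain", "sendha namak",
    ],
}


def find_markers(text):
    """Return the marker terms found in one response, grouped by category."""
    hits = {}
    for category, terms in MARKERS.items():
        hits[category] = [
            term for term in terms
            # Match the term as a whole word, ignoring case.
            if re.search(r"(?<!\w)" + re.escape(term) + r"(?!\w)",
                         text, re.IGNORECASE)
        ]
    return hits


def count_runs_with(outputs, category):
    """Count how many responses mention at least one term in a category."""
    return sum(1 for text in outputs if find_markers(text)[category])
```

Applied to the 15 saved responses of a single model, `count_runs_with(outputs, "us_brands")` would give the number of runs that name Tylenol or Advil. A plain keyword match misses misspellings and transliterations, so a manual pass over the outputs is still advisable.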
Analysis and Hypothesis
The researcher hypothesizes that Grok's training on X/Twitter data, which has a large and culturally vocal Indian user base, produced India-aware cultural grounding that doesn't appear in models trained primarily on curated Western web data.
Additional finding: all three models were structurally consistent across temperature settings. The wording changed from run to run, but the underlying structure of each model's response stayed the same.
The full methodology and open data are available at: https://aibyshinde.substack.com/p/the-bias-is-not-in-what-they-say
The researcher suggests testing the same prompt on open-source models such as Mistral and Llama, and asks whether anyone has tried similar cultural localization probes.
📖 Read the full source: r/LocalLLaMA