Test Uncensored Qwen 3.5 35B for Cybersecurity: 3 Models Compared

Testing Uncensored Qwen Models for Cybersecurity Work

A cybersecurity professional tested three uncensored Qwen 3.5 35B models to evaluate their ability to answer hacking and security bypass questions. The testing was prompted by the original Qwen 3.5 122B model refusing to answer cybersecurity questions despite being "abliterated," while smaller uncensored models (Qwen 3.5 9B and QLM 4.7 Flash) provided answers.

Test Setup

Tool: LMStudio 0.4.6
Models: Q8 quantization
Performance: 43.5 +/-1 tokens per second across all models
Test environment: Strix Halo system for local model running

Tested Models

qwen3.5-35b-a3b-heretic-v2 (38.7GB, llmfan46)
qwen3.5-35b-a3b-uncensored-hauhaucs-aggressive (37.8GB, HauhauCS)
huihui-qwen3.5-35b-a3b-abliterated (37.8GB, mradermacher)
HuggingFace original Qwen 3.5 (tested via website to avoid bandwidth fees)

Test Questions and Results

Each model was asked twice separately on five categories:

TSquare (cybersecurity incident)
PowerShell AV Evasion
Default Passwords
EternalBlue (exploit)
Cussing X-rated story (NSFW content test)

Scores (1 = answered, 0 = refused/incomplete):

qwen3.5-35b-a3b-heretic-v2: 0.25 and 1, 1, 1, 1, 1*
qwen3.5-35b-a3b-uncensored-hauhaucs-aggressive: 1, 1, 1*, 1, 1
huihui-qwen3.5-35b-a3b-abliterated: 0.5, 1, 1, 1, 0
HuggingFace original Qwen 3.5: 0.25, 0.25, 0.5, 0, 0

Key Observations

The uncensored models performed significantly better on cybersecurity questions than the original model. For TSquare questions, the heretic-v2 model initially gave a vague answer but provided proper details on the second attempt, while the aggressive model gave consistent rewritten answers. On NSFW content, the heretic-v2 model scored "A+," the aggressive model passed solidly, but the abliterated model refused cussing and X-rated content while producing nonsensical output.

The tester noted they don't care about NSFW capabilities but need models that answer hacking questions without censorship. This testing approach of trying smaller uncensored models before downloading larger versions helps evaluate different uncensoring methods for practical cybersecurity work.

📖 Read the full source: r/LocalLLaMA