GGUF Model Merging Script and Workflow for Qwen3.5-35B Variants

✍️ OpenClawRadar📅 Published: April 1, 2026🔗 Source
GGUF Model Merging Script and Workflow for Qwen3.5-35B Variants
Ad

A Reddit user has shared a Python script and workflow for merging GGUF model files with minimal loss, specifically targeting Qwen3.5-35B variants. The approach combines two existing models: HauhauCS's Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive and samuelcardillo's Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-GGUF.

Technical Details

The merged model is available as a Q4_0 quantized version at Hugging Face. According to the source, samuelcardillo's finetune outperforms Jackrong's version for Qwen 3.5 35B.

Merging Workflow

The Python script (available on Pastebin) was "vibecoded via Claude Opus 4.6" and supports:

  • Merging GGUF files on Google Colab Free Tier
  • Quantization via llama-quantize
  • Q4_K_M quantization for 35B models
  • Q8 quantization for 8B models

The author notes they can't create Q8_0 or F16 quantized versions due to disk space limitations on Google Colab Free tier, but suggests others can tweak the script via Claude Opus for those quantizations.

Ad

Optimal Settings

For best performance in LM Studio, use these parameters:

Temperature: 0.7
Top K Sampling: 20
Presence Penalty: 1.5
Top P Sampling: 0.8
Min P Sampling: 0
Seed: 3407 or 42

The system prompt (full version on Pastebin) should include this first line: "You are Qwen, created by Alibaba Cloud. You are a helpful assistant." The author notes the model underperforms without this line.

📖 Read the full source: r/LocalLLaMA

Ad

👀 See Also

Reddit user shares detailed prompt for exporting personal knowledge from AI assistants
Tools

Reddit user shares detailed prompt for exporting personal knowledge from AI assistants

A Reddit user has created a comprehensive prompt for extracting structured personal knowledge from AI assistants like Claude, addressing perceived limitations in Anthropic's ChatGPT import feature. The prompt generates three distinct JSON artifacts covering personal knowledge bases, intellectual frameworks, and knowledge graphs.

OpenClawRadar
Reverse-engineering UniFi inform protocol for multi-tenant routing
Tools

Reverse-engineering UniFi inform protocol for multi-tenant routing

The UniFi inform protocol sends device data to controllers via HTTP POST on port 8080 every 10 seconds. The first 40 bytes of each packet contain unencrypted device MAC addresses, enabling routing without decryption.

OpenClawRadar
Cloudflare's AI Platform: Unified Inference Layer for AI Agents
Tools

Cloudflare's AI Platform: Unified Inference Layer for AI Agents

Cloudflare's AI Platform provides a single API to access 70+ models across 12+ providers, including multimodal support for image, video, and speech models. It enables switching between models with one-line code changes and offers centralized cost monitoring with custom metadata.

OpenClawRadar
Rukuzu: Porting a 200,000 Line C++ Graph Database to Rust with Systematic Testing
Tools

Rukuzu: Porting a 200,000 Line C++ Graph Database to Rust with Systematic Testing

The Rukuzu project describes a workflow for porting the 200,000-line C++ kuzu embedded graph database to Rust, using a Claude Code custom command to maintain both versions simultaneously and verify correctness through 2,700+ tests.

OpenClawRadar