Supra-50M-Reasoning: Open-Source Tiny Model with Chain-of-Thought Thinking

SupraLabs released Supra-50M-Reasoning (ThinkSupra-50M), a tiny 50M-parameter model that produces a full chain-of-thought (CoT) before responding. It is the reasoning variant of Supra-50M-Instruct, fine-tuned from Supra-50M-Base with SFT in bfloat16 for 6 epochs on a synthetic dataset of 500 examples generated by Qwen3 1.7B. The model is experimental, prone to hallucination, and fully open.
Inference Format
Every response follows this structure:
<|begin_of_thought|> ... thinking ... <|end_of_thought|> <|begin_of_solution|> ... final answer ... <|end_of_solution|>
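Because the thinking and the final answer arrive in one string, it can help to split them before showing anything to a user. The following is a minimal sketch of one way to do that with plain string search; the names `ParsedResponse`, `_between`, and `parse_response` are hypothetical helpers, not part of the model's release, and the lenient handling of missing closing tags is an assumption made because a model this small may stop mid-output.

```python
from typing import NamedTuple, Optional

# Tag strings copied from the inference format described above.
THOUGHT_OPEN, THOUGHT_CLOSE = "<|begin_of_thought|>", "<|end_of_thought|>"
SOLUTION_OPEN, SOLUTION_CLOSE = "<|begin_of_solution|>", "<|end_of_solution|>"


class ParsedResponse(NamedTuple):
    """Hypothetical container for the two parts of a response."""
    thought: Optional[str]
    solution: Optional[str]


def _between(text: str, start: str, end: str) -> Optional[str]:
    """Return the stripped text between start and end, or None if start is absent."""
    begin = text.find(start)
    if begin == -1:
        return None
    begin += len(start)
    stop = text.find(end, begin)
    if stop == -1:
        # Closing tag missing (e.g. truncated output): take everything that is left.
        stop = len(text)
    return text[begin:stop].strip()


def parse_response(text: str) -> ParsedResponse:
    """Split generated text into its thought and solution sections."""
    return ParsedResponse(
        thought=_between(text, THOUGHT_OPEN, THOUGHT_CLOSE),
        solution=_between(text, SOLUTION_OPEN, SOLUTION_CLOSE),
    )
```

Calling `parse_response` on the generated text returns both sections; a `solution` of `None` signals that the model never reached the answer block, which is worth checking for given how experimental the model is.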
Quick Start
import torch
from transformers import pipeline, AutoTokenizer

MODEL_ID = "SupraLabs/Supra-50M-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, clean_up_tokenization_spaces=False)
pipe = pipeline(
    "text-generation",
    model=MODEL_ID,
    tokenizer=tokenizer,
    device_map="auto",
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
)

def build_prompt(instruction, input_text=""):
    if input_text.strip():
        return f"Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Input:\n{input_text}\n\n### Response:\n"
    return f"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:\n"

def generate(instruction, input_text=""):
    result = pipe(
        build_prompt(instruction, input_text),
        max_new_tokens=512,
        do_sample=True,
        temperature=0.3,
        top_k=50,
        top_p=0.9,
        repetition_penalty=1.15,
        pad_token_id=pipe.tokenizer.pad_token_id,
        eos_token_id=pipe.tokenizer.eos_token_id,
        return_full_text=False,
    )
    return result[0]['generated_text'].strip()
Sample Output
Prompt: "What is AI?"
Thinking: "Okay, the user is asking about AI. Let me start by recalling what AI is. AI is a subset of machine learning, specifically neural networks..."
Response: "AI is a subset of machine learning that focuses on enabling machines to learn from data... used in healthcare, finance and even in the field of robotics."
What's Next
SupraLabs plans larger models: Supra-124M (Base, Chat, Reasoning) and Supra-350M (Base, Chat, Reasoning, Coding).
Model on Hugging Face: Supra-50M-Reasoning
Dataset: SupraThink-Dataset-500x
📖 Read the full source: r/LocalLLaMA