Supra-50M-Reasoning: An Open-Source Ultra-Small Reasoning Model with CoT Support

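SupraLabs has released Supra-50M-Reasoning (ThinkSupra-50M), a small model with only 50M parameters that generates a complete chain of thought (CoT) before responding. It is the reasoning variant of Supra-50M-Instruct, trained with SFT for 6 epochs in bfloat16 on a 500-sample synthetic dataset generated by Qwen3 1.7B. It is experimental, prone to hallucination, and fully open.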

Reasoning Format

Every response follows this structure:

<|begin_of_thought|> ... thought ... <|end_of_thought|> <|begin_of_solution|> ... final answer ... <|end_of_solution|>
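Because the thought and the final answer sit between fixed tags, the two parts can be separated with a small parser. The following is a minimal sketch, not part of the SupraLabs release: `split_response` and `ParsedResponse` are hypothetical names, and it assumes the tags survive as literal text in the decoded output (if the tokenizer treats them as special tokens and strips them during decoding, this will not match).

```python
import re
from typing import NamedTuple, Optional

# Non-greedy matches so only the text between each tag pair is captured.
THOUGHT_RE = re.compile(r"<\|begin_of_thought\|>(.*?)<\|end_of_thought\|>", re.DOTALL)
SOLUTION_RE = re.compile(r"<\|begin_of_solution\|>(.*?)<\|end_of_solution\|>", re.DOTALL)


class ParsedResponse(NamedTuple):
    """Hypothetical container for the two parts of a response."""
    thought: Optional[str]
    solution: Optional[str]


def split_response(text: str) -> ParsedResponse:
    """Split raw model output into thought and solution.

    A part is None when its tag pair is missing, for example when
    generation was cut off before the closing tag was produced.
    """
    thought = THOUGHT_RE.search(text)
    solution = SOLUTION_RE.search(text)
    return ParsedResponse(
        thought=thought.group(1).strip() if thought else None,
        solution=solution.group(1).strip() if solution else None,
    )
```

Returning None for a missing part, rather than raising, seems a reasonable choice for a model described as experimental, since a truncated or malformed response is then easy to detect and retry.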

Quick Start

import torch
from transformers import pipeline, AutoTokenizer

MODEL_ID = "SupraLabs/Supra-50M-Reasoning"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, clean_up_tokenization_spaces=False)
# Use bfloat16 on GPU and fall back to float32 on CPU.
dtype = torch.bfloat16 if torch.cuda.is_available() else torch.float32
pipe = pipeline("text-generation", model=MODEL_ID, tokenizer=tokenizer, device_map="auto", torch_dtype=dtype)

def build_prompt(instruction, input_text=""):
    # Alpaca-style prompt; the Input section is included only when input_text is non-empty.
    if input_text.strip():
        return f"Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Input:\n{input_text}\n\n### Response:\n"
    return f"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:\n"

def generate(instruction, input_text=""):
    # Low-temperature sampling with a repetition penalty; only the newly generated text is returned.
    result = pipe(build_prompt(instruction, input_text), max_new_tokens=512, do_sample=True, temperature=0.3, top_k=50, top_p=0.9, repetition_penalty=1.15, pad_token_id=pipe.tokenizer.pad_token_id, eos_token_id=pipe.tokenizer.eos_token_id, return_full_text=False)
    return result[0]['generated_text'].strip()

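Sample Output

Prompt: "What is AI?"

Thought: "The user is asking about AI. Let me first recall what AI is. AI is a subset of machine learning, specifically neural networks..."

Response: "AI is a subset of machine learning that focuses on enabling machines to learn from data... It is used in healthcare, finance, and robotics."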

What's Next

SupraLabs plans larger models: Supra-124M (base, chat, reasoning) and Supra-350M (base, chat, reasoning, coding).

Model on Hugging Face: Supra-50M-Reasoning
Dataset: SupraThink-Dataset-500x

📖 Source: r/LocalLLaMA
