MOOSE-Star: A 7B Model and 108K-Paper Dataset for Scientific Hypothesis Discovery – ICML 2026

✍️ OpenClawRadar | 📅 Published: May 14, 2026

MOOSE-Star is out: a 7B-parameter model post-trained for scientific hypothesis discovery, released alongside the TOMATO-Star dataset of 108,717 NCBI papers. The work was accepted at ICML 2026. The models are fine-tuned from DeepSeek-R1-Distill-Qwen-7B and come in three variants: MS-IR-7B (inspiration retrieval), MS-HC-7B (hypothesis composition), and MS-7B (both tasks jointly).

Key Details

  • Dataset: TOMATO-Star – 108,717 papers from NCBI (biology, chemistry, medicine, medical imaging, psychology, cognitive science), each decomposed into a (background, hypothesis, inspirations) triple with real citations. Preprocessing took ~38,400 A800 GPU-hours.
  • Temporal split: training papers are from Sep 2025 or earlier; test papers are from Oct 2025, after the base model's knowledge cutoff.
  • Inspiration retrieval accuracy benchmarks:
    • Random Selection: 6.70%
    • DeepSeek-R1-Distill-Qwen-7B (base): 28.42%
    • Claude Sonnet 4.6: 45.02%
    • DeepSeek-R1: 45.11%
    • Gemini-3 Flash: 51.44%
    • GPT-5.4: 51.50%
    • MS-7B (7B, joint IR + HC): 54.34%
    • MS-IR-7B (7B, IR-only): 54.37%
    • Gemini-3 Pro: 54.89%
  • Model size & deployment: A standard DeepSeek-R1-Distill-Qwen-7B fine-tune, ~14GB at fp16, so it runs on a single 24GB GPU. Compatible with llama.cpp, vLLM, and SGLang.
  • Licenses: Apache-2.0 for code, CC-BY-4.0 for data.
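
To make the dataset layout and temporal split concrete, here is a minimal Python sketch. It is an illustration only: the `PaperRecord` class, its field names, and the `published` date field are assumptions, not the actual TOMATO-Star schema, and the example records are placeholders. Only the cutoffs (train through Sep 2025, test in Oct 2025) come from the post.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class PaperRecord:
    # Hypothetical field names; the real TOMATO-Star schema may differ.
    background: str
    hypothesis: str
    inspirations: List[str]
    published: date


# Cutoffs taken from the post: train <= Sep 2025, test = Oct 2025.
TRAIN_END = date(2025, 9, 30)
TEST_START = date(2025, 10, 1)
TEST_END = date(2025, 10, 31)


def temporal_split(
    records: List[PaperRecord],
) -> Tuple[List[PaperRecord], List[PaperRecord]]:
    """Split by publication date so no test paper predates the cutoff."""
    train = [r for r in records if r.published <= TRAIN_END]
    test = [r for r in records if TEST_START <= r.published <= TEST_END]
    return train, test


# Placeholder records, for illustration only.
records = [
    PaperRecord("bg A", "hyp A", ["insp 1"], date(2025, 3, 14)),
    PaperRecord("bg B", "hyp B", ["insp 2", "insp 3"], date(2025, 9, 30)),
    PaperRecord("bg C", "hyp C", ["insp 4"], date(2025, 10, 12)),
    PaperRecord("bg D", "hyp D", ["insp 5"], date(2025, 11, 2)),
]

train, test = temporal_split(records)
print(len(train), len(test))
```

Splitting on date rather than at random is what keeps test papers out of anything the base model could have seen during pretraining. Records dated after Oct 2025 fall into neither split in this sketch.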
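
The ~14GB figure follows from simple arithmetic: roughly 7 billion parameters at 2 bytes each in fp16. The sketch below assumes a nominal 7e9 parameter count (the exact count is not given in the post) and counts weights only, so the KV cache and activations must fit in whatever headroom remains on a 24GB card.

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes), weights only."""
    return n_params * bytes_per_param / 1e9


N_PARAMS = 7e9          # assumption: nominal "7B"; the exact count may differ
FP16_BYTES = 2          # fp16 stores each parameter in 2 bytes
GPU_MEMORY_GB = 24      # single 24GB GPU, as stated in the post

fp16_gb = weight_memory_gb(N_PARAMS, FP16_BYTES)
headroom_gb = GPU_MEMORY_GB - fp16_gb

print(f"fp16 weights: ~{fp16_gb:.0f} GB")
print(f"headroom for KV cache and activations: ~{headroom_gb:.0f} GB")
```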

Paper: arxiv.org/abs/2603.03756 | GitHub: github.com/ZonglinY/MOOSE-Star | Hugging Face collection: huggingface.co/collections/ZonglinY/moose-star-models-and-data

Stress-test it. Disclosure: posted by the MiroMind community team.

📖 Read the full source: r/LocalLLaMA
