DeepReinforce has released Ornith-1.0An open-source model family built for agentic coding. The lineup spans four sizes, from the 9B compact model to the 397B mixology-specialist flagship. Each Checkpoint is published under the MIT License on Hugging Face. The models are post-trained on top of the pre-trained Gemma 4 and QUEN 3.5.
Most coding agents attach a model to a fixed, human-designed harness. Instead, Ornith-1.0 learns to write on its own. The DeepReinforce research team reports state-of-the-art results among open models of comparable size.
TL;DR
- Ornith-1.0 ships in sizes 9B, 31B, 35B-MOE and 397B-MOE, built on Gemma 4 and Quen 3.5 under MIT.
- The model learns its own scaffolding during RL, jointly optimizing the harness and the solution.
- Ornith-1.0-397b beats Cloud Opus 4.7 on both headline benchmarks, but not Opus 4.8 or the larger GLM-5.2-744b.
- Three layers – fixed trust limits, deterministic monitors, frozen LLM judges – protection against reward hacking.
What is Ornith-1.0?
Ornith-1.0 is a set of logic models designed for coding agents. The variants are 9b dense, 31b dense, 35b MOE and 397b MOE. The 35B model is a mix of experts and activates approximately 3B parameters per token. FP8 and GGUF builds have also been published for faster local service.
Every model is a logic model. Answers start with a <think> Block before final reply. Serving traces enable an argument parser, so that the trace is returned in a different way. reasoning_content Field. Model agents also emit well-formed tool calls for loops.
Deploying is straightforward. The 9B model is about 19GB in bf16 and runs on a single 80GB GPU. The target serving dishes are VLLM, SGlang and Transformer. Each model exposes an OpenAI-compliant endpoint. So standard agent frameworks work without code changes.
interactive explainer