Small Models (3/6): Planner + Doer = Genius

A 27-million-parameter model beats o3-mini on ARC.
ARC is one of the hardest reasoning benchmarks: most LLMs score under 5 percent, while this tiny model scores 40 percent.
This is Part 3 of the Small Models, Big Brains series, exploring the Hierarchical Reasoning Model (HRM)—a brain-inspired architecture that separates planning from execution.
| Resource | Link |
|---|---|
| Paper | Hierarchical Reasoning Model |
| Original Code | sapientinc/HRM |
| Visualization | viz-hrm-ft |
| Video | Planner + Doer = Genius |
| Comments | Discord |
The ARC Challenge
The Abstraction and Reasoning Corpus (ARC) tests:
- Abstract reasoning
- Pattern matching
- Spatial logic
- Puzzles requiring real thinking
These aren’t problems you can memorize. Each puzzle is unique, requiring genuine understanding of the underlying pattern.
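To make the task format concrete, here is a minimal sketch of an ARC-style puzzle in Python. The `train`/`test` layout with `input`/`output` grids follows the public ARC JSON format. The toy puzzle itself, the candidate rules, and the `solve` helper are illustrative assumptions, not taken from the benchmark or from HRM.

```python
from typing import Callable, Dict, List

Grid = List[List[int]]  # each cell is a colour index 0-9

# A toy task (hypothetical): the hidden rule is "mirror left-to-right".
task = {
    "train": [
        {"input": [[1, 0, 0], [0, 2, 0]], "output": [[0, 0, 1], [0, 2, 0]]},
        {"input": [[3, 3, 0], [0, 0, 4]], "output": [[0, 3, 3], [4, 0, 0]]},
    ],
    "test": [{"input": [[5, 0, 0], [6, 0, 7]]}],
}

# A tiny, hand-written hypothesis space. Real ARC rules are far more varied.
CANDIDATES: Dict[str, Callable[[Grid], Grid]] = {
    "identity": lambda g: [row[:] for row in g],
    "flip_horizontal": lambda g: [row[::-1] for row in g],
    "flip_vertical": lambda g: [row[:] for row in g[::-1]],
    "transpose": lambda g: [list(col) for col in zip(*g)],
}

def solve(task: dict) -> tuple:
    """Return (rule name, predicted test grids) for the first rule that fits every demo."""
    for name, rule in CANDIDATES.items():
        if all(rule(pair["input"]) == pair["output"] for pair in task["train"]):
            return name, [rule(t["input"]) for t in task["test"]]
    raise ValueError("no candidate rule explains the demonstrations")

rule_name, predictions = solve(task)
print(rule_name, predictions)
```

A fixed list of four rules works for this toy case only. The point of ARC is that each puzzle needs a rule nobody wrote down in advance, so a lookup like this breaks on the first unfamiliar task.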
Why LLMs Struggle
| Challenge | LLM Limitation |
|---|---|
| Novel patterns | Can’t rely on training data |
| Spatial reasoning | Text-based thinking is linearized |
| Multi-step logic | Each step compounds errors |
| Abstraction | Pattern matching isn’t generalization |
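The "compounds errors" row can be made concrete with a little arithmetic. Assuming, purely for illustration, that each reasoning step succeeds independently with the same probability p, a chain of n steps succeeds with probability p^n. The 95% figure below is an assumed number, not a measurement of any model.

```python
def chain_success(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step in a chain is correct, assuming independent steps."""
    return per_step_accuracy ** steps

for n in (1, 5, 10, 20, 50):
    print(f"{n:>2} steps at 95% per step -> {chain_success(0.95, n):.1%} end-to-end")
```

Under that assumption, a 20-step chain already succeeds only about 36% of the time, which is why long multi-step puzzles punish small per-step mistakes.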
Meet HRM: The Hierarchical Reasoning Model
HRM uses just 27 million parameters but achieves remarkable results by mimicking how the brain thinks: plan first, then act.
Two-Module Architecture
┌─────────────────────────────────────┐
│               PLANNER               │
│  Thinks slow and abstract           │
│  Sets goals and strategies          │
└─────────────┬───────────────────────┘
              │ Goals
              ▼
┌─────────────────────────────────────┐
│                DOER                 │
│  Moves fast                         │
│  Takes concrete actions             │
└─────────────────────────────────────┘
| Module | Speed | Function |
|---|---|---|
| Planner | Slow | Abstract thinking, goal setting |
| Doer | Fast | Concrete actions, execution |
This mirrors dual-process theory from cognitive psychology: the Doer plays the role of System 1 (fast, intuitive) and the Planner plays the role of System 2 (slow, deliberate).
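One way to picture the slow/fast split is as two nested update loops: the Doer updates on every step, while the Planner updates once per block of Doer steps. The sketch below only illustrates that schedule. The scalar states, the toy update rules, and the names `planner_update`, `doer_update`, `cycles`, and `steps_per_cycle` are hypothetical stand-ins, not HRM's actual networks or hyperparameters.

```python
from dataclasses import dataclass

@dataclass
class Trace:
    planner_updates: int = 0
    doer_updates: int = 0

def planner_update(plan: float, doer_state: float) -> float:
    # Slow module: fold the Doer's latest result into the plan (toy rule).
    return 0.5 * plan + 0.5 * doer_state

def doer_update(doer_state: float, plan: float, x: float) -> float:
    # Fast module: small step conditioned on the current plan and the input (toy rule).
    return doer_state + 0.1 * (plan + x - doer_state)

def run(x: float, cycles: int = 4, steps_per_cycle: int = 8) -> tuple:
    plan, doer_state, trace = 0.0, 0.0, Trace()
    for _ in range(cycles):
        for _ in range(steps_per_cycle):         # fast inner loop
            doer_state = doer_update(doer_state, plan, x)
            trace.doer_updates += 1
        plan = planner_update(plan, doer_state)  # slow outer loop
        trace.planner_updates += 1
    return doer_state, trace

state, trace = run(x=1.0)
print(trace)
```

With these assumed settings the Doer runs 32 updates for the Planner's 4, which is the sense in which one module is "fast" and the other "slow".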
Results
| Benchmark | HRM (27M) | o3-mini | GPT-4 |
|---|---|---|---|
| ARC | 40% | <40% | <5% |
| Hard Mazes | 99% | - | ~0% |
| Complex Sudoku | 99% | - | - |
A 27M-parameter model outperforms models roughly 1000x larger on these reasoning tasks.
The Visualization Tool
The viz-hrm-ft repository provides a React app to visualize HRM’s reasoning process:
- Watch the Planner form strategies
- See the Doer execute actions
- Visualize the feedback loop between modules
- Simulate fine-tuning on BabyAI tasks
git clone https://github.com/softwarewrighter/viz-hrm-ft
cd viz-hrm-ft
npm install
npm start
Why Hierarchy Works
Traditional Flat Models
Input → [Single Network] → Output
Everything happens in one pass. Complex problems overwhelm the network.
Hierarchical Models
Input → [Planner] → Strategy
↓
Strategy → [Doer] → Action
↓
Action → [Environment] → Feedback
↓
Feedback → [Planner] → Refined Strategy
↓
...
The Planner doesn’t worry about details. The Doer doesn’t worry about strategy. Each module focuses on what it does best.
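The loop above can be sketched as a toy controller on an open grid. Everything here is an illustrative assumption: the `plan_subgoal` and `act` functions, the wall-free grid, and the waypoint rule stand in for learned modules and are not HRM's implementation.

```python
from typing import Tuple

Pos = Tuple[int, int]

def plan_subgoal(pos: Pos, goal: Pos, horizon: int = 3) -> Pos:
    """Planner: pick a waypoint at most `horizon` cells closer to the goal on each axis."""
    def clamp(delta: int) -> int:
        return max(-horizon, min(horizon, delta))
    return (pos[0] + clamp(goal[0] - pos[0]), pos[1] + clamp(goal[1] - pos[1]))

def act(pos: Pos, subgoal: Pos) -> Pos:
    """Doer: take one unit step toward the waypoint, x first and then y."""
    if pos[0] != subgoal[0]:
        return (pos[0] + (1 if subgoal[0] > pos[0] else -1), pos[1])
    if pos[1] != subgoal[1]:
        return (pos[0], pos[1] + (1 if subgoal[1] > pos[1] else -1))
    return pos

def run(start: Pos, goal: Pos, max_rounds: int = 20) -> Tuple[Pos, int]:
    pos, rounds = start, 0
    while pos != goal and rounds < max_rounds:
        subgoal = plan_subgoal(pos, goal)  # Planner turns feedback into a strategy
        while pos != subgoal:              # Doer executes until the waypoint is reached
            pos = act(pos, subgoal)        # the new position is the environment's feedback
        rounds += 1
    return pos, rounds

final_pos, planning_rounds = run(start=(0, 0), goal=(7, 4))
print(final_pos, planning_rounds)
```

The Planner never reasons about single steps and the Doer never looks at the final goal, so each function stays small. That division of labor is the one the diagram describes.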
Key Insights
- Separation of concerns scales. Splitting planning from execution lets each module specialize.
- Iteration enables refinement. The Planner-Doer loop allows course correction.
- Small can beat big. 27M parameters with good architecture beats 100B+ with brute force.
- Brain-inspired design works. Mimicking cognitive architecture yields better results.
Comparison with Part 1 (TRM)
| Aspect | TRM | HRM |
|---|---|---|
| Parameters | <1,000 | 27M |
| Architecture | Think-Act cycles | Planner-Doer hierarchy |
| Strength | Maze solving | Abstract reasoning |
| Key insight | Iteration | Hierarchical decomposition |
Both use recursive reasoning, but HRM adds hierarchical structure for more complex tasks.
Implementation Details
| Metric | Value |
|---|---|
| Primary Language | TypeScript |
| Source Files | 26 .ts/.tsx, 7 .js |
| Estimated Size | ~4 KLOC |
| Framework | React |
| Build System | npm / Create React App |
| Visualization | Canvas-based rendering |
Good for you if: You want to visualize neural reasoning processes, build interactive ML demos, or learn React with a real project.
Complexity: Low-Moderate. Standard React/TypeScript project. No ML training code—this is a visualization tool for understanding the HRM architecture. Easy to extend with new visualizations.
Key Takeaways
- Plan, then act. Separating strategy from execution mirrors effective human thinking.
- Hierarchy enables complexity. Multi-level reasoning handles problems flat networks can’t.
- Architecture > Scale for reasoning tasks.
- ARC remains unsolved by brute-force scaling; clever architectures are the path forward.
What’s Next
Part 4 explores Baby Dragon Hatchling (BDH)—a brain-inspired model with visible, interpretable activations.
Comments or questions? SW Lab Discord or YouTube @SoftwareWrighter.
