The best large language models score zero on hard mazes. A model with under 1,000 parameters scores 85 percent.

This is Part 1 of the Small Models, Big Brains series, exploring how tiny models with clever architectures outperform massive ones on specific tasks.

Why LLMs Fail at Mazes

Large language models generate one token at a time, and each token is committed the moment it is emitted. They cannot backtrack. One wrong move and the entire solution fails.

Maze solving requires:

  • Exploring dead ends
  • Backtracking when stuck
  • Maintaining spatial awareness
  • Planning multiple steps ahead

Autoregressive generation is fundamentally incompatible with these requirements.
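
For contrast, here is what a classical solver does. Below is a minimal recursive backtracking search in Rust; this is an illustrative sketch, not code from the train-trm repo, and the grid representation and names are assumptions for the example:

// Illustrative sketch: classical recursive backtracking, the thing a
// one-shot token stream cannot do. Not code from the train-trm repo.
// The maze is a grid of open (true) / wall (false) cells.
fn solve(
    maze: &[Vec<bool>],
    visited: &mut Vec<Vec<bool>>,
    pos: (usize, usize),
    goal: (usize, usize),
    path: &mut Vec<(usize, usize)>,
) -> bool {
    let (r, c) = pos;
    if !maze[r][c] || visited[r][c] {
        return false; // wall, or already explored
    }
    visited[r][c] = true;
    path.push(pos);
    if pos == goal {
        return true; // `path` now holds a full solution
    }
    let (rows, cols) = (maze.len() as isize, maze[0].len() as isize);
    for (dr, dc) in [(-1, 0), (1, 0), (0, -1), (0, 1)] {
        let (nr, nc) = (r as isize + dr, c as isize + dc);
        if nr >= 0 && nr < rows && nc >= 0 && nc < cols
            && solve(maze, visited, (nr as usize, nc as usize), goal, path)
        {
            return true;
        }
    }
    path.pop(); // dead end: undo the move and backtrack
    false
}

TRM gets the same explore-and-revise ability without explicit search: it repeatedly rewrites its answer instead.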

Meet TRM: The Tiny Recursive Model

The Tiny Recursive Model has under 1,000 parameters. Instead of getting bigger, it thinks in loops.

Input → Think → Act → Think → Act → ... → Output

A simple two-layer network that iterates until the solution emerges.

The Architecture

TRM alternates between two phases:

Phase   Purpose
-----   -------
Think   Update the internal latent state by processing the input, the current answer, and the previous state
Act     Update the answer based on the refined latent state

This process repeats for multiple cycles, progressively improving the output.

let config = TRMConfig {
    input_dim: 5,     // Input feature dimension
    output_dim: 5,    // Output feature dimension
    hidden_dim: 16,   // Hidden layer width
    latent_dim: 16,   // Latent "thinking" state dimension
    l_layers: 2,      // Network depth
    h_cycles: 3,      // Outer think-act cycles
    l_cycles: 4,      // Inner think cycles
};
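
Conceptually, the forward pass nests the two loops configured above: l_cycles think updates inside each of h_cycles think-act rounds. Here is a minimal sketch; the Trm struct and the think/act bodies are illustrative stand-ins, not the repo's actual API:

// Conceptual sketch of TRM's nested recursion. The placeholder bodies
// stand in for the real two-layer network; not the repo's actual API.
struct Trm; // real weights would live here

impl Trm {
    // Think: refine the latent state from the input, the current
    // answer, and the previous latent state.
    fn think(&self, _input: &[f64], _answer: &[f64], latent: &[f64]) -> Vec<f64> {
        latent.to_vec() // placeholder for the two-layer update
    }
    // Act: rewrite the answer from the refined latent state.
    fn act(&self, latent: &[f64]) -> Vec<f64> {
        latent[..5].to_vec() // placeholder projection to output_dim
    }
}

fn forward(model: &Trm, input: &[f64], h_cycles: usize, l_cycles: usize) -> Vec<f64> {
    let mut latent = vec![0.0; 16]; // latent_dim
    let mut answer = vec![0.0; 5];  // output_dim
    for _ in 0..h_cycles {          // outer think-act cycles
        for _ in 0..l_cycles {      // inner think cycles
            latent = model.think(input, &answer, &latent);
        }
        answer = model.act(&latent); // answer improves each round
    }
    answer
}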

The Secret: Deep Supervision

The key insight isn't just recursion. It's supervising every step, not just the final answer.

Traditional training:

Input → [black box] → Final Output → Loss

TRM training:

Input → Step 1 → Loss₁
      → Step 2 → Loss₂
      → Step 3 → Loss₃
      → ...
      → Final  → Loss_n

Every iteration gets feedback. The model learns to make progress at each step.
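
As a training-loop sketch, deep supervision just means accumulating a loss term inside the refinement loop rather than after it. The names here (refine, mse) are illustrative assumptions, not the repo's exact training code:

// Sketch of deep supervision: a loss term at every refinement step,
// not just the final one. `refine` stands in for one think-act cycle.
fn deep_supervision_loss(
    refine: impl Fn(&[f64]) -> Vec<f64>, // answer -> improved answer
    target: &[f64],
    steps: usize,
) -> f64 {
    let mut answer = vec![0.0; target.len()];
    let mut total = 0.0;
    for _ in 0..steps {
        answer = refine(&answer);      // Step k: produce a better answer
        total += mse(&answer, target); // Loss_k: feedback at this step
    }
    total // summing per-step losses trains every iteration to make progress
}

// Mean squared error between the current answer and the target.
fn mse(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>() / a.len() as f64
}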

Results

Model              Maze Accuracy
-----              -------------
GPT-4              ~0% on hard mazes
Claude             ~0% on hard mazes
TRM (976 params)   85%

Iteration beats scale.

Running the Code

The train-trm repo provides a complete Rust implementation:

# Clone and build
git clone https://github.com/softwarewrighter/train-trm
cd train-trm
./scripts/build.sh --release

# Train a model
./scripts/train.sh --epochs 1000 --lr 0.01

# Evaluate
./scripts/eval.sh

# Or launch the web UI
cargo install --locked trunk
./scripts/web-serve.sh

The web UI includes interactive maze visualization with solution paths and real-time training charts.

Implementation Details

Metric             Value
------             -----
Primary Language   Rust
Source Files       21 .rs files
Estimated Size     ~2.5 KLOC
Also Includes      HTML (web UI), shell scripts
Build System       Cargo + Trunk (WASM)
Dependencies       ndarray, serde, clap, wasm-bindgen

Good fit if: You want to learn Rust ML from scratch, experiment with recursive architectures, or need a web-based training visualization.

Complexity: Moderate. Clean Rust code with good documentation. The neural network is implemented from scratch (no PyTorch/TensorFlow), making it educational but requiring Rust familiarity.

Key Takeaways

  1. Parameter count isn’t everything. Architecture and training strategy matter more for certain tasks.

  2. Recursion enables backtracking. By iterating, TRM can explore and refine solutions.

  3. Deep supervision accelerates learning. Feedback at every step, not just the end.

  4. Task-specific models excel. TRM isn’t a general-purpose LLM—it’s optimized for maze-like reasoning.

What’s Next

Part 2 explores MobileLLM and running AI completely offline on your Android phone.

Resources