How AI Learns Part 6: Toward Continuous Learning

| Resource | Link |
|---|---|
| Related | Sleepy Coder Part 1 |
| Related | Sleepy Coder Part 2 |
The Continuous Learning Loop
The Core Tradeoff
| Goal | Description |
|---|---|
| Plasticity | Learn new things quickly |
| Stability | Retain old things reliably |
You cannot maximize both simultaneously. The art is in the balance.
Approaches to Continuous Learning
1. Replay-Based Methods
Keep (or synthesize) some old data. Periodically retrain on old + new.
How it works:
- Store representative examples from each task
- Mix old data into new training batches
- Periodically consolidate
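The steps above can be sketched as a minimal replay buffer. This is an illustrative sketch, not any specific library's API; the class and parameter names are assumptions:

```python
import random

class ReplayBuffer:
    """Store a capped set of representative examples per task."""
    def __init__(self, per_task_cap=100):
        self.per_task_cap = per_task_cap
        self.store = {}  # task_id -> stored examples
        self.seen = {}   # task_id -> total examples observed

    def add(self, task_id, example):
        """Reservoir sampling keeps a representative sample per task."""
        bucket = self.store.setdefault(task_id, [])
        self.seen[task_id] = self.seen.get(task_id, 0) + 1
        if len(bucket) < self.per_task_cap:
            bucket.append(example)
        else:
            j = random.randrange(self.seen[task_id])
            if j < self.per_task_cap:
                bucket[j] = example

    def mixed_batch(self, new_examples, replay_ratio=0.3):
        """Mix stored old examples into a batch of new ones."""
        n_old = int(len(new_examples) * replay_ratio)
        old_pool = [ex for bucket in self.store.values() for ex in bucket]
        replay = random.sample(old_pool, min(n_old, len(old_pool)))
        return new_examples + replay
```

The replay ratio is the main knob: higher values favor stability, lower values favor plasticity.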
Recent work: FOREVER adapts replay timing using “model-centric time” (based on optimizer update magnitude) rather than fixed training steps.
| Pros | Cons |
|---|---|
| Strong retention | Storage costs |
| Conceptually simple | Privacy concerns |
| Well-understood | Data governance complexity |
2. Replay-Free Regularization
Constrain weight updates to avoid interference, without storing old data.
Efficient Lifelong Learning Algorithm (ELLA): Regularizes updates using subspace de-correlation, reducing interference while still allowing transfer.
Share: Maintains a single evolving shared low-rank subspace, integrating new tasks without storing many per-task adapters.
| Pros | Cons |
|---|---|
| No replay needed | Still active research |
| Privacy-friendly | Evaluation complexity |
| Constant memory | Subtle failure modes |
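The classic instance of this family is the EWC penalty from the references: anchor each weight to its old value, scaled by an importance estimate (e.g. Fisher information). A minimal NumPy sketch, with the importance values supplied externally:

```python
import numpy as np

def ewc_penalty(weights, old_weights, importance, lam=1.0):
    """Quadratic penalty pulling weights toward their old values,
    weighted per-parameter by an importance estimate."""
    return 0.5 * lam * np.sum(importance * (weights - old_weights) ** 2)

def regularized_step(weights, grad_task, old_weights, importance,
                     lr=0.1, lam=1.0):
    """One gradient step on the task loss plus the penalty gradient."""
    grad_penalty = lam * importance * (weights - old_weights)
    return weights - lr * (grad_task + grad_penalty)
```

Important weights are pulled back toward their old values; unimportant weights remain free to move for the new task.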
3. Modular Adapters
Keep base model frozen. Train task-specific adapters. Merge or switch as needed.
Evolution:
- Low-Rank Adaptation (LoRA): Individual adapters per task
- Shared LoRA spaces: Adapters share subspace
- Adapter banks: Library of skills to compose
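The core mechanism is the same across these variants: a frozen base weight W is augmented by a low-rank update B·A, and switching tasks means swapping (A, B) pairs. A minimal sketch (shapes and initialization follow the standard LoRA recipe of zero-initializing one factor):

```python
import numpy as np

def lora_forward(x, W, A, B, scale=1.0):
    """y = W x + scale * B (A x): frozen base plus low-rank adapter."""
    return W @ x + scale * (B @ (A @ x))

rng = np.random.default_rng(0)
d, r = 8, 2                  # model dim, adapter rank (r << d)
W = rng.normal(size=(d, d))  # frozen base weight
A = rng.normal(size=(r, d))  # down-projection (trainable)
B = np.zeros((d, r))         # up-projection, zero-init (trainable)
x = rng.normal(size=d)

# With B zero-initialized, the adapter starts as a no-op:
y = lora_forward(x, W, A, B)
```

Because the adapter adds only 2·d·r parameters per task, a bank of them is cheap to store, version, and roll back.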
| Pros | Cons |
|---|---|
| Modular, versioned | Adapter proliferation |
| Low forgetting risk | Routing complexity |
| Easy rollback | Composition challenges |
4. Memory-First Learning
Store experiences in external memory. Only consolidate to weights what’s proven stable.
Pattern:
- New information → Memory (fast)
- Validated patterns → Adapters (slow)
- Fundamental capabilities → Weights (rare)
This separates the speed of learning from the permanence of changes.
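The pattern can be sketched as a promotion policy: everything lands in memory immediately, and items move to slower tiers only after repeated validation. The class, tier names, and thresholds below are illustrative:

```python
class TieredLearner:
    """Route knowledge to tiers by how well it has been validated."""
    ADAPTER_THRESHOLD = 3    # validations before adapter training (illustrative)
    WEIGHT_THRESHOLD = 10    # validations before weight consolidation

    def __init__(self):
        self.memory = {}       # fast tier: item -> validation count
        self.adapters = set()  # slow tier: trained into adapters
        self.weights = set()   # rare tier: consolidated into core weights

    def observe(self, item):
        """New information goes to memory immediately."""
        self.memory.setdefault(item, 0)

    def validate(self, item):
        """Record a successful use; promote once proven stable."""
        self.memory[item] = self.memory.get(item, 0) + 1
        count = self.memory[item]
        if count >= self.WEIGHT_THRESHOLD:
            self.weights.add(item)
        elif count >= self.ADAPTER_THRESHOLD:
            self.adapters.add(item)
```

Learning stays fast (one dictionary write) while permanent changes stay rare and gated.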
The Practical Loop
A working continuous learning system:
1. Run agent (with Recursive Language Model (RLM) context management)
2. Collect traces: prompts, tool calls, outcomes, failures
3. Score outcomes: tests, static analysis, user signals
4. Cluster recurring failure patterns
5. Train lightweight updates (LoRA/adapters)
6. Validate retention (did old skills degrade?)
7. Deploy modular update (with rollback capability)
This is not real-time learning. It’s periodic consolidation.
Human analogy: Sleep. Process experiences, consolidate important patterns, prune noise.
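The seven steps above can be sketched as one periodic pass. Every callable here is a placeholder for your own scoring, training, and evaluation infrastructure, and the clustering key is a deliberately crude stand-in:

```python
def cluster_failures(failures):
    """Group failing traces by a crude signature (illustrative)."""
    clusters = {}
    for trace in failures:
        clusters.setdefault(trace["error_kind"], []).append(trace)
    return clusters

def consolidation_cycle(traces, score, train_adapter, evaluate,
                        current_adapter):
    """One periodic (e.g. nightly) consolidation pass over traces."""
    failures = [t for t in traces if score(t) < 0.5]      # steps 2-3
    clusters = cluster_failures(failures)                 # step 4
    candidate = train_adapter(clusters)                   # step 5
    if evaluate(candidate) >= evaluate(current_adapter):  # step 6: retention
        return candidate                                  # step 7: deploy
    return current_adapter                                # else: roll back
```

The key property is that deployment is conditional: a candidate update that scores worse than the current adapter is simply discarded.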
Time Scales of Update
| Frequency | What Changes | Method |
|---|---|---|
| Every query | Nothing (inference only) | - |
| Per session | Memory | Retrieval-Augmented Generation (RAG)/Engram |
| Daily | Adapters (maybe) | Lightweight Parameter-Efficient Fine-Tuning (PEFT) |
| Weekly | Validated adapters | Reviewed updates |
| Monthly | Core weights | Major consolidation |
Most systems should:
- Update memory frequently
- Update adapters occasionally
- Update core weights rarely
Evaluation Is Critical
Continuous learning without continuous evaluation is dangerous.
Required:
- Retention tests (what got worse?)
- Forward transfer tests (what got better?)
- Regression detection
- Rollback capability
Without these, you’re flying blind.
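A minimal retention/regression gate compares per-skill evaluation scores before and after an update and blocks deployment on any regression. The tolerance and report shape are illustrative:

```python
def retention_report(before, after, tolerance=0.02):
    """Compare per-skill eval scores; flag changes beyond tolerance."""
    regressions = {k: (before[k], after.get(k, 0.0))
                   for k in before
                   if after.get(k, 0.0) < before[k] - tolerance}
    improvements = {k: (before[k], after.get(k, 0.0))
                    for k in before
                    if after.get(k, 0.0) > before[k] + tolerance}
    return {
        "regressions": regressions,    # retention test: what got worse?
        "improvements": improvements,  # forward transfer: what got better?
        "deploy": not regressions,     # roll back if anything regressed
    }
```

A skill missing from the new results counts as a score of zero, so silently dropped capabilities are caught as regressions rather than ignored.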
References
| Concept | Paper |
|---|---|
| ELLA | Subspace Learning for Lifelong ML (2024) |
| Share | Shared LoRA Subspaces (2025) |
| FOREVER | Model-Centric Replay (2024) |
| EWC | Overcoming Catastrophic Forgetting (Kirkpatrick et al. 2017) |
Coming Next
In Part 7, we’ll put it all together: designing a practical continuous learning agent with layered architecture, logging, feedback loops, and safety.
Learn often in memory. Consolidate carefully in weights.
Part 6 of the How AI Learns series. View all parts | Next: Part 7 →