Keywords from titles, sorted alphabetically. Click any keyword to read the post.

#
Five ML Concepts - #21
Five ML Concepts - #22
Five ML Concepts - #23
Five ML Concepts - #24
Five ML Concepts - #25
Five ML Concepts - #26
Five ML Concepts - #27
Five ML Concepts - #28
Five ML Concepts - #29
1
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ 16-bit
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ 1965
TBT (1/?): My First Program Was a Horse Race ~ 1972
TBT (2/?): Pipelines on OS/390 ~ 1996 Olympics
2
Lucy 20%: Upgrading My Home AI Cluster
Lucy 20%: Upgrading My Home AI Cluster ~ 24GB VRAM
Small Models (3/6): Planner + Doer = Genius ~ 27M parameters
3
Five ML Concepts - #1 ~ 30-second explainers
Five ML Concepts - #2 ~ 30-second explainers
Five ML Concepts - #3 ~ 30-second explainers
Five ML Concepts - #4 ~ 30-second explainers
Five ML Concepts - #5 ~ 30-second explainers
Five ML Concepts - #6 ~ 30-second explainers
Five ML Concepts - #7 ~ 30-second explainers
Five ML Concepts - #8 ~ 30-second explainers
Five ML Concepts - #9 ~ 30-second explainers
Five ML Concepts - #10 ~ 30-second explainers
Five ML Concepts - #11 ~ 30-second explainers
Five ML Concepts - #12 ~ 30-second explainers
Five ML Concepts - #13 ~ 30-second explainers
Five ML Concepts - #14 ~ 30-second explainers
Five ML Concepts - #15 ~ 30-second explainers
Five ML Concepts - #16 ~ 30-second explainers
Five ML Concepts - #17 ~ 30-second explainers
Five ML Concepts - #18 ~ 30-second explainers
Five ML Concepts - #19 ~ 30-second explainers
Five ML Concepts - #20 ~ 30-second explainers
Five ML Concepts - #21 ~ 30-second explainers
Five ML Concepts - #22 ~ 30-second explainers
Five ML Concepts - #23 ~ 30-second explainers
Five ML Concepts - #24 ~ 30-second explainers
Five ML Concepts - #25 ~ 30-second explainers
Five ML Concepts - #26 ~ 30-second explainers
Five ML Concepts - #27 ~ 30-second explainers
Five ML Concepts - #28 ~ 30-second explainers
Five ML Concepts - #29 ~ 30-second explainers
9
Small Models (1/6): 976 Parameters Beat Billions ~ 976 parameters
Small Models (1/6): 976 Parameters Beat Billions
A
Five ML Concepts - #16 ~ A/B testing
Small Models (3/6): Planner + Doer = Genius ~ abstract reasoning
Five ML Concepts - #4 ~ activation functions
Five ML Concepts - #4 ~ Adam optimizer
How AI Learns Part 6: Toward Continuous Learning ~ adapter evolution
How AI Learns Part 1: The Many Meanings of Learning ~ adapters
How AI Learns Part 3: Weight-Based Learning ~ adapters
Five ML Concepts - #25 ~ adversarial attacks
Five ML Concepts - #25 ~ adversarial examples
midi-cli-rs: Music Generation for AI Coding Agents
How AI Learns Part 7: Designing a Continuous Learning Agent
How AI Learns Part 7: Designing a Continuous Learning Agent ~ AI agent architecture
midi-cli-rs: Music Generation for AI Coding Agents ~ AI agents
Five ML Concepts - #26 ~ AI alignment
Lucy 20%: Upgrading My Home AI Cluster ~ AI hardware
How AI Learns Part 1: The Many Meanings of Learning ~ AI memory
Five ML Concepts - #23 ~ AI tool calling
music-pipe-rs: Unix Pipelines for MIDI Composition ~ algorithmic composition
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ algorithmic composition
Five ML Concepts - #24 ~ alignment
How AI Learns Part 3: Weight-Based Learning ~ alignment
midi-cli-rs: Music Generation for AI Coding Agents ~ ambient
Deepseek Papers (1/3): mHC - Training Stability at Any Depth
Small Models (3/6): Planner + Doer = Genius ~ ARC challenge
TBT (3/?): Vector Graphics Games ~ arcade games
music-pipe-rs: Web Demo and Multi-Instrument Arrangements
TBT (1/?): My First Program Was a Horse Race ~ array programming
JSON et al: A Deep Dive into Data Serialization Formats ~ ASN.1
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ assembly language
TBT (3/?): Vector Graphics Games ~ Asteroids
How AI Learns Part 2: Catastrophic Forgetting vs Context Rot ~ attention dilution
Five ML Concepts - #2 ~ attention mechanism
How AI Learns Part 5: Context Engineering & Recursive Reasoning ~ attention
Five ML Concepts - #14 ~ AUC
midi-cli-rs: Extending with Custom Mood Packs ~ audio
midi-cli-rs: Music Generation for AI Coding Agents ~ audio
music-pipe-rs: Unix Pipelines for MIDI Composition ~ audio
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ audio
Five ML Concepts - #19 ~ autoencoders
JSON et al: A Deep Dive into Data Serialization Formats ~ Avro
B
Small Models (4/6): This AI Has a Visible Brain ~ Baby Dragon Hatchling
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ Bach
Five ML Concepts - #1 ~ backpropagation
Neural-Net-RS: An Educational Neural Network Platform ~ backpropagation
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ Baroque
Five ML Concepts - #16 ~ batch normalization
Five ML Concepts - #12 ~ batch size
TBT (3/?): Vector Graphics Games ~ BattleZone
Small Models (4/6): This AI Has a Visible Brain ~ BDH
Small Models (1/6): 976 Parameters Beat Billions
Five ML Concepts - #17 ~ benchmark leakage
Five ML Concepts - #6 ~ BERT
Five ML Concepts - #8 ~ bias-variance tradeoff
Five ML Concepts - #6 ~ bidirectional encoder
Small Models (1/6): 976 Parameters Beat Billions
JSON et al: A Deep Dive into Data Serialization Formats ~ binary formats
Welcome to Software Wrighter Lab ~ blog
Five ML Concepts - #24 ~ blue/green deployment
Small Models (4/6): This AI Has a Visible Brain
C
How AI Learns Part 4: Memory-Based Learning ~ Cache-Augmented Generation
Five ML Concepts - #26 ~ caching strategies
How AI Learns Part 4: Memory-Based Learning ~ CAG
Five ML Concepts - #13 ~ calibration
Five ML Concepts - #23 ~ canary deployment
Cat Finder: Personal Software via Vibe Coding ~ cat finder
Five ML Concepts - #27 ~ catastrophic forgetting
Five ML Concepts - #15 ~ catastrophic forgetting
How AI Learns Part 2: Catastrophic Forgetting vs Context Rot ~ catastrophic forgetting
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails ~ catastrophic forgetting
Towards Continuous LLM Learning (2): Routing Prevents Forgetting ~ catastrophic forgetting
How AI Learns Part 2: Catastrophic Forgetting vs Context Rot
Cat Finder: Personal Software via Vibe Coding
JSON et al: A Deep Dive into Data Serialization Formats ~ CBOR
Five ML Concepts - #11 ~ chain of thought
Five ML Concepts - #13 ~ checkpointing
midi-cli-rs: Extending with Custom Mood Packs ~ chillout
Five ML Concepts - #28 ~ Chinchilla scaling laws
Five ML Concepts - #29 ~ class representation geometry
Cat Finder: Personal Software via Vibe Coding ~ Claude Code
Neural-Net-RS: An Educational Neural Network Platform ~ Claude Code
MCP: Teaching Claude to Play (and Trash Talk)
Cat Finder: Personal Software via Vibe Coding ~ CLI tool
midi-cli-rs: Extending with Custom Mood Packs ~ CLI
midi-cli-rs: Music Generation for AI Coding Agents ~ CLI
music-pipe-rs: Unix Pipelines for MIDI Composition ~ CLI
Lucy 20%: Upgrading My Home AI Cluster
TBT (2/?): Pipelines on OS/390 ~ CMS Pipelines
Five ML Concepts - #10 ~ CNN
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails
Welcome to Software Wrighter Lab ~ coding agents
Cat Finder: Personal Software via Vibe Coding
midi-cli-rs: Music Generation for AI Coding Agents
Towards Continuous LLM Learning (2): Routing Prevents Forgetting ~ coefficient training
Five ML Concepts - #14 ~ cold start problem
music-pipe-rs: Unix Pipelines for MIDI Composition ~ composable tools
music-pipe-rs: Unix Pipelines for MIDI Composition
Five ML Concepts - #28 ~ compute optimality
Cat Finder: Personal Software via Vibe Coding ~ computer vision
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing
Five ML Concepts - #17 ~ concept drift
Five ML Concepts - #1
Five ML Concepts - #2
Five ML Concepts - #3
Five ML Concepts - #4
Five ML Concepts - #5
Five ML Concepts - #6
Five ML Concepts - #7
Five ML Concepts - #8
Five ML Concepts - #9
Five ML Concepts - #10
Five ML Concepts - #11
Five ML Concepts - #12
Five ML Concepts - #13
Five ML Concepts - #14
Five ML Concepts - #15
Five ML Concepts - #16
Five ML Concepts - #17
Five ML Concepts - #18
Five ML Concepts - #19
Five ML Concepts - #20
Five ML Concepts - #21
Five ML Concepts - #22
Five ML Concepts - #23
Five ML Concepts - #24
Five ML Concepts - #25
Five ML Concepts - #26
Five ML Concepts - #27
Five ML Concepts - #28
Five ML Concepts - #29
Five ML Concepts - #28 ~ conditional computation
Deepseek Papers (2/3): Engram - Conditional Memory for Transformers ~ conditional memory
Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation ~ conditional memory
Deepseek Papers (2/3): Engram - Conditional Memory for Transformers
Five ML Concepts - #25 ~ confidence calibration
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ console panel
How AI Learns Part 6: Toward Continuous Learning ~ consolidation
Five ML Concepts - #26 ~ constitutional AI
How AI Learns Part 5: Context Engineering & Recursive Reasoning ~ context engineering
Deepseek Papers (2/3): Engram - Conditional Memory for Transformers ~ context extension
How AI Learns Part 5: Context Engineering & Recursive Reasoning ~ context management
How AI Learns Part 2: Catastrophic Forgetting vs Context Rot ~ context rot
Five ML Concepts - #7 ~ context window
RLM: Recursive Language Models for Massive Context ~ context window
How AI Learns Part 2: Catastrophic Forgetting vs Context Rot
How AI Learns Part 5: Context Engineering & Recursive Reasoning
RLM: Recursive Language Models for Massive Context
Five ML Concepts - #27 ~ continual learning
Multi-Hop Reasoning (2/2): The Distribution Trap ~ continual learning
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails ~ continual learning
Towards Continuous LLM Learning (2): Routing Prevents Forgetting ~ continual learning
How AI Learns Part 6: Toward Continuous Learning ~ continuous learning
How AI Learns Part 7: Designing a Continuous Learning Agent ~ continuous learning
How AI Learns Part 6: Toward Continuous Learning
How AI Learns Part 7: Designing a Continuous Learning Agent
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails
Towards Continuous LLM Learning (2): Routing Prevents Forgetting
Five ML Concepts - #10 ~ convolutional neural network
Five ML Concepts - #19 ~ correlation vs causation
Five ML Concepts - #23 ~ cosine annealing
Five ML Concepts - #18 ~ cost vs quality tradeoffs
Five ML Concepts - #11 ~ CoT
Many-Eyes Learning: Intrinsic Rewards and Diversity ~ count-based novelty
Five ML Concepts - #19 ~ covariate shift
Five ML Concepts - #7 ~ cross-validation
TBT (3/?): Vector Graphics Games ~ CRT
Many-Eyes Learning: Intrinsic Rewards and Diversity ~ curiosity-driven exploration
Five ML Concepts - #19 ~ curriculum learning
Five ML Concepts - #15 ~ curse of dimensionality
midi-cli-rs: Extending with Custom Mood Packs ~ custom moods
midi-cli-rs: Extending with Custom Mood Packs
D
Five ML Concepts - #26 ~ data augmentation
Five ML Concepts - #17 ~ data drift
Five ML Concepts - #24 ~ data leakage
Five ML Concepts - #22 ~ data scaling
JSON et al: A Deep Dive into Data Serialization Formats ~ data serialization
TBT (2/?): Pipelines on OS/390 ~ dataflow
JSON et al: A Deep Dive into Data Serialization Formats
music-pipe-rs: Unix Pipelines for MIDI Composition ~ DAW
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ DAW
Deepseek Papers (1/3): mHC - Training Stability at Any Depth ~ deep networks
Deepseek Papers (1/3): mHC - Training Stability at Any Depth
Deepseek Papers (2/3): Engram - Conditional Memory for Transformers
Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation
JSON et al: A Deep Dive into Data Serialization Formats
Five ML Concepts - #29 ~ delayed generalization
music-pipe-rs: Web Demo and Multi-Instrument Arrangements
How AI Learns Part 7: Designing a Continuous Learning Agent ~ deployment
Deepseek Papers (1/3): mHC - Training Stability at Any Depth
How AI Learns Part 7: Designing a Continuous Learning Agent
Five ML Concepts - #8 ~ diffusion models
Five ML Concepts - #26 ~ dimensionality reduction
Five ML Concepts - #2 ~ direct preference optimization
How AI Learns Part 3: Weight-Based Learning ~ Direct Preference Optimization
Welcome to Software Wrighter Lab ~ Discord
Five ML Concepts - #10 ~ distillation
How AI Learns Part 3: Weight-Based Learning ~ distillation
Multi-Hop Reasoning (2/2): The Distribution Trap ~ distribution matching
Five ML Concepts - #11 ~ distribution shift
Multi-Hop Reasoning (2/2): The Distribution Trap
Many-Eyes Learning: Intrinsic Rewards and Diversity
JSON et al: A Deep Dive into Data Serialization Formats
Small Models (3/6): Planner + Doer = Genius
Five ML Concepts - #25 ~ double descent
Five ML Concepts - #2 ~ DPO
How AI Learns Part 3: Weight-Based Learning ~ DPO
Solving Sparse Rewards with Many Eyes ~ DQN
Five ML Concepts - #15 ~ drift detection
Five ML Concepts - #6 ~ dropout
Five ML Concepts - #9 ~ dropout
Five ML Concepts - #28 ~ dynamic routing
DyTopo: Dynamic Topology for Multi-Agent AI ~ dynamic topology
DyTopo: Dynamic Topology for Multi-Agent AI
DyTopo: Dynamic Topology for Multi-Agent AI ~ DyTopo
DyTopo: Dynamic Topology for Multi-Agent AI
E
Five ML Concepts - #13 ~ early stopping
TBT (4/?): ToonTalk - Teaching Robots to Program ~ educational programming
Neural-Net-RS: An Educational Neural Network Platform
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ educational
Small Models (1/6): 976 Parameters Beat Billions ~ efficiency
Small Models (6/6): Which Small AI Fits YOUR Laptop? ~ efficient frontier
How AI Learns Part 6: Toward Continuous Learning ~ Efficient Lifelong Learning Algorithm
Small Models (5/6): Max AI Per Watt ~ efficient LLM
Small Models (6/6): Which Small AI Fits YOUR Laptop? ~ efficient LLM
Five ML Concepts - #28 ~ efficient scaling
Five ML Concepts - #27 ~ elastic weight consolidation
midi-cli-rs: Extending with Custom Mood Packs ~ electronic
Small Models (2/6): AI in Your Pocket ~ Eliza
How AI Learns Part 6: Toward Continuous Learning ~ ELLA
Five ML Concepts - #1 ~ embeddings
Five ML Concepts - #23 ~ emergent behavior
Five ML Concepts - #23 ~ emergent capabilities
Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing
Five ML Concepts - #10 ~ encoder-decoder
How AI Learns Part 5: Context Engineering & Recursive Reasoning
In-Context Learning Revisited: From Mystery to Engineering
Deepseek Papers (2/3): Engram - Conditional Memory for Transformers
Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation ~ engram
Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation
How AI Learns Part 4: Memory-Based Learning ~ Engram
Five ML Concepts - #18 ~ ensembling
Five ML Concepts - #18 ~ epoch
Many-Eyes Learning: Intrinsic Rewards and Diversity ~ epsilon decay
music-pipe-rs: Unix Pipelines for MIDI Composition ~ Euclidean rhythm
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ Euclidean rhythm
How AI Learns Part 7: Designing a Continuous Learning Agent ~ evaluation
Five ML Concepts - #27 ~ EWC
Five ML Concepts - #27 ~ experience replay
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing
Many-Eyes Learning: Intrinsic Rewards and Diversity ~ exploration strategies
Solving Sparse Rewards with Many Eyes ~ exploration strategies
midi-cli-rs: Extending with Custom Mood Packs
midi-cli-rs: Extending with Custom Mood Packs ~ extensibility
Five ML Concepts - #27 ~ external memory
How AI Learns Part 4: Memory-Based Learning ~ external memory
Solving Sparse Rewards with Many Eyes
F
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails
Five ML Concepts - #19 ~ failure analysis
Five ML Concepts - #25 ~ feature learning
Five ML Concepts - #29 ~ feedback loops
How AI Learns Part 7: Designing a Continuous Learning Agent ~ feedback loops
Five ML Concepts - #10 ~ few-shot learning
In-Context Learning Revisited: From Mystery to Engineering ~ few-shot learning
Cat Finder: Personal Software via Vibe Coding
Five ML Concepts - #3 ~ fine-tuning
How AI Learns Part 1: The Many Meanings of Learning ~ fine-tuning
How AI Learns Part 3: Weight-Based Learning ~ fine-tuning
Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs ~ fine-tuning
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails ~ fine-tuning
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails
TBT (1/?): My First Program Was a Horse Race
Five ML Concepts - #27 ~ Fisher information
Small Models (6/6): Which Small AI Fits YOUR Laptop?
Five ML Concepts - #1
Five ML Concepts - #2
Five ML Concepts - #3
Five ML Concepts - #4
Five ML Concepts - #5
Five ML Concepts - #6
Five ML Concepts - #7
Five ML Concepts - #8
Five ML Concepts - #9
Five ML Concepts - #10
Five ML Concepts - #11
Five ML Concepts - #12
Five ML Concepts - #13
Five ML Concepts - #14
Five ML Concepts - #15
Five ML Concepts - #16
Five ML Concepts - #17
Five ML Concepts - #18
Five ML Concepts - #19
Five ML Concepts - #20
Five ML Concepts - #21
Five ML Concepts - #22
Five ML Concepts - #23
Five ML Concepts - #24
Five ML Concepts - #25
Five ML Concepts - #26
Five ML Concepts - #27
Five ML Concepts - #28
Five ML Concepts - #29
Five ML Concepts - #9 ~ Flash Attention
Five ML Concepts - #23 ~ flat minima
Five ML Concepts - #29 ~ flat minima
midi-cli-rs: Music Generation for AI Coding Agents ~ FluidSynth
Lucy 20%: Upgrading My Home AI Cluster ~ FLUX schnell
How AI Learns Part 2: Catastrophic Forgetting vs Context Rot
Towards Continuous LLM Learning (2): Routing Prevents Forgetting
JSON et al: A Deep Dive into Data Serialization Formats
G
MCP: Teaching Claude to Play (and Trash Talk) ~ game server
TBT (3/?): Vector Graphics Games
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ GarageBand
Five ML Concepts - #21 ~ gated recurrent unit
Five ML Concepts - #11 ~ gated recurrent unit
Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation ~ gating mechanism
Small Models (6/6): Which Small AI Fits YOUR Laptop? ~ Gemma-2B
Five ML Concepts - #16 ~ generalization
midi-cli-rs: Music Generation for AI Coding Agents
Five ML Concepts - #24 ~ generative models
midi-cli-rs: Extending with Custom Mood Packs ~ generative music
music-pipe-rs: Unix Pipelines for MIDI Composition ~ generative music
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ generative music
Five ML Concepts - #7 ~ generative pre-trained transformer
Small Models (3/6): Planner + Doer = Genius
Small Models (2/6): AI in Your Pocket ~ GGUF
Welcome to Software Wrighter Lab ~ GitHub
TBT (1/?): My First Program Was a Horse Race ~ GNU APL
Five ML Concepts - #26 ~ Goodhart's law
In-Context Learning Revisited: From Mystery to Engineering ~ GPT-3
Five ML Concepts - #7 ~ GPT
Five ML Concepts - #7 ~ GQA
Five ML Concepts - #14 ~ gradient clipping
Five ML Concepts - #2 ~ gradient descent
In-Context Learning Revisited: From Mystery to Engineering ~ gradient descent
Five ML Concepts - #20 ~ gradient noise
TBT (3/?): Vector Graphics Games
Five ML Concepts - #29 ~ grokking
Five ML Concepts - #7 ~ grouped query attention
Five ML Concepts - #21 ~ GRU
Five ML Concepts - #11 ~ GRU
H
Five ML Concepts - #1 ~ hallucination
Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation ~ hash-based memory
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ Hollerith
Lucy 20%: Upgrading My Home AI Cluster ~ home AI cluster
Lucy 20%: Upgrading My Home AI Cluster ~ homelab
Lucy 20%: Upgrading My Home AI Cluster
TBT (1/?): My First Program Was a Horse Race
How AI Learns Part 1: The Many Meanings of Learning
How AI Learns Part 2: Catastrophic Forgetting vs Context Rot
How AI Learns Part 3: Weight-Based Learning
How AI Learns Part 4: Memory-Based Learning
How AI Learns Part 5: Context Engineering & Recursive Reasoning
How AI Learns Part 6: Toward Continuous Learning
How AI Learns Part 7: Designing a Continuous Learning Agent
Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation ~ HuggingFace integration
Five ML Concepts - #20 ~ human-in-the-loop
music-pipe-rs: Unix Pipelines for MIDI Composition ~ humanize
I
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ IBM 029
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ IBM 1130
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ IBM 1442
TBT (3/?): Vector Graphics Games ~ IBM 2250
TBT (1/?): My First Program Was a Horse Race ~ IBM 2741
TBT (2/?): Pipelines on OS/390 ~ IBM S/390
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing
Five ML Concepts - #5 ~ ICL
How AI Learns Part 1: The Many Meanings of Learning ~ ICL
How AI Learns Part 5: Context Engineering & Recursive Reasoning ~ ICL
In-Context Learning Revisited: From Mystery to Engineering ~ ICL
Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation
Five ML Concepts - #5 ~ in-context learning
How AI Learns Part 1: The Many Meanings of Learning ~ in-context learning
How AI Learns Part 5: Context Engineering & Recursive Reasoning ~ in-context learning
In-Context Learning Revisited: From Mystery to Engineering ~ in-context learning
In-Context Learning Revisited: From Mystery to Engineering
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ indicator lights
In-Context Learning Revisited: From Mystery to Engineering ~ induction heads
Five ML Concepts - #12 ~ inductive bias
Five ML Concepts - #26 ~ inference latency
Five ML Concepts - #28 ~ inference parallelism
Five ML Concepts - #9 ~ inference
Solving Sparse Rewards with Many Eyes ~ information bottleneck
Deepseek Papers (1/3): mHC - Training Stability at Any Depth ~ initialization
Five ML Concepts - #25 ~ interpolation threshold
Five ML Concepts - #20 ~ interpretability
Small Models (4/6): This AI Has a Visible Brain ~ interpretable AI
JSON et al: A Deep Dive into Data Serialization Formats
Many-Eyes Learning: Intrinsic Rewards and Diversity ~ intrinsic rewards
Many-Eyes Learning: Intrinsic Rewards and Diversity
J
Five ML Concepts - #21 ~ jailbreaks
midi-cli-rs: Music Generation for AI Coding Agents ~ jazz
Small Models (2/6): AI in Your Pocket ~ Jetpack Compose
Small Models (2/6): AI in Your Pocket ~ JNI
JSON et al: A Deep Dive into Data Serialization Formats ~ JSONB
JSON et al: A Deep Dive into Data Serialization Formats ~ JSONL
MCP: Teaching Claude to Play (and Trash Talk) ~ JSON-RPC
JSON et al: A Deep Dive into Data Serialization Formats ~ JSON
JSON et al: A Deep Dive into Data Serialization Formats
K
TBT (4/?): ToonTalk - Teaching Robots to Program ~ Ken Kahn
TBT (1/?): My First Program Was a Horse Race ~ Kenneth Iverson
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ keypunch
Five ML Concepts - #8 ~ key-value cache
Five ML Concepts - #7 ~ k-fold
Five ML Concepts - #27 ~ knowledge editing
Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs ~ knowledge graphs
Five ML Concepts - #8 ~ KV cache
L
Five ML Concepts - #17 ~ L2 regularization
Five ML Concepts - #6 ~ L2 weight decay
Five ML Concepts - #25 ~ label smoothing
Welcome to Software Wrighter Lab
RLM: Recursive Language Models for Massive Context
Small Models (6/6): Which Small AI Fits YOUR Laptop?
RLM: Recursive Language Models for Massive Context ~ large context
Five ML Concepts - #12 ~ latency
Five ML Concepts - #5 ~ latent space
How AI Learns Part 7: Designing a Continuous Learning Agent ~ layered architecture
Five ML Concepts - #23 ~ learning rate schedules
Five ML Concepts - #24 ~ learning rate warmup
Five ML Concepts - #2 ~ learning rate
How AI Learns Part 1: The Many Meanings of Learning
How AI Learns Part 3: Weight-Based Learning
How AI Learns Part 4: Memory-Based Learning
How AI Learns Part 6: Toward Continuous Learning
How AI Learns Part 7: Designing a Continuous Learning Agent
In-Context Learning Revisited: From Mystery to Engineering
Many-Eyes Learning: Intrinsic Rewards and Diversity
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails
Towards Continuous LLM Learning (2): Routing Prevents Forgetting
How AI Learns Part 1: The Many Meanings of Learning
How AI Learns Part 2: Catastrophic Forgetting vs Context Rot
How AI Learns Part 3: Weight-Based Learning
How AI Learns Part 4: Memory-Based Learning
How AI Learns Part 5: Context Engineering & Recursive Reasoning
How AI Learns Part 6: Toward Continuous Learning
How AI Learns Part 7: Designing a Continuous Learning Agent
How AI Learns Part 6: Toward Continuous Learning ~ lifelong learning
Small Models (5/6): Max AI Per Watt ~ Llama-3.2-1B
How AI Learns Part 2: Catastrophic Forgetting vs Context Rot ~ LLM failure modes
How AI Learns Part 1: The Many Meanings of Learning ~ LLM learning
RLM: Recursive Language Models for Massive Context ~ LLM tools
Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails
Towards Continuous LLM Learning (2): Routing Prevents Forgetting
Lucy 20%: Upgrading My Home AI Cluster ~ local AI
Cat Finder: Personal Software via Vibe Coding ~ local ML
How AI Learns Part 7: Designing a Continuous Learning Agent ~ logging
Deepseek Papers (2/3): Engram - Conditional Memory for Transformers ~ long context
Five ML Concepts - #22 ~ Long Short-Term Memory
Five ML Concepts - #11 ~ long short-term memory
Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation ~ long-term recall
Towards Continuous LLM Learning (2): Routing Prevents Forgetting ~ LoRA routing
Five ML Concepts - #3 ~ LoRA
How AI Learns Part 1: The Many Meanings of Learning ~ LoRA
How AI Learns Part 3: Weight-Based Learning ~ LoRA
How AI Learns Part 7: Designing a Continuous Learning Agent ~ LoRA
Small Models (5/6): Max AI Per Watt ~ LoRA
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails ~ LoRA
Five ML Concepts - #3 ~ loss function
Five ML Concepts - #14 ~ loss landscapes
Five ML Concepts - #23 ~ loss surface sharpness
Five ML Concepts - #28 ~ lottery ticket hypothesis
Five ML Concepts - #3 ~ low-rank adaptation
How AI Learns Part 1: The Many Meanings of Learning ~ Low-Rank Adaptation
How AI Learns Part 3: Weight-Based Learning ~ Low-Rank Adaptation
How AI Learns Part 7: Designing a Continuous Learning Agent ~ Low-Rank Adaptation
Small Models (5/6): Max AI Per Watt ~ Low-Rank Adaptation
Five ML Concepts - #22 ~ LSTM
Five ML Concepts - #11 ~ LSTM
Lucy 20%: Upgrading My Home AI Cluster ~ Lucy AI
Lucy 20%: Upgrading My Home AI Cluster
TBT (3/?): Vector Graphics Games ~ Lunar Lander
M
Five ML Concepts - #1 ~ machine learning concepts
Five ML Concepts - #2 ~ machine learning concepts
Five ML Concepts - #3 ~ machine learning concepts
Five ML Concepts - #4 ~ machine learning concepts
Five ML Concepts - #5 ~ machine learning concepts
Five ML Concepts - #6 ~ machine learning concepts
Five ML Concepts - #7 ~ machine learning concepts
Five ML Concepts - #8 ~ machine learning concepts
Five ML Concepts - #9 ~ machine learning concepts
Five ML Concepts - #10 ~ machine learning concepts
Five ML Concepts - #11 ~ machine learning concepts
Five ML Concepts - #12 ~ machine learning concepts
Five ML Concepts - #13 ~ machine learning concepts
Five ML Concepts - #14 ~ machine learning concepts
Five ML Concepts - #15 ~ machine learning concepts
Five ML Concepts - #16 ~ machine learning concepts
Five ML Concepts - #17 ~ machine learning concepts
Five ML Concepts - #18 ~ machine learning concepts
Five ML Concepts - #19 ~ machine learning concepts
Five ML Concepts - #20 ~ machine learning concepts
Five ML Concepts - #21 ~ machine learning concepts
Five ML Concepts - #22 ~ machine learning concepts
Five ML Concepts - #23 ~ machine learning concepts
Five ML Concepts - #24 ~ machine learning concepts
Five ML Concepts - #25 ~ machine learning concepts
Five ML Concepts - #26 ~ machine learning concepts
Five ML Concepts - #27 ~ machine learning concepts
Five ML Concepts - #28 ~ machine learning concepts
Five ML Concepts - #29 ~ machine learning concepts
Neural-Net-RS: An Educational Neural Network Platform ~ machine learning education
TBT (2/?): Pipelines on OS/390 ~ mainframe
Five ML Concepts - #1 ~ Mamba SSM
Five ML Concepts - #26 ~ manifold hypothesis
Many-Eyes Learning: Intrinsic Rewards and Diversity ~ many-eyes learning
Many-Eyes Learning: Intrinsic Rewards and Diversity
How AI Learns Part 1: The Many Meanings of Learning
Solving Sparse Rewards with Many Eyes
RLM: Recursive Language Models for Massive Context
Small Models (5/6): Max AI Per Watt
Small Models (1/6): 976 Parameters Beat Billions ~ maze solving
MCP: Teaching Claude to Play (and Trash Talk)
How AI Learns Part 1: The Many Meanings of Learning
Five ML Concepts - #29 ~ mechanistic interpretability
Deepseek Papers (2/3): Engram - Conditional Memory for Transformers ~ memory retrieval
Five ML Concepts - #27 ~ memory-augmented networks
How AI Learns Part 4: Memory-Based Learning
Deepseek Papers (2/3): Engram - Conditional Memory for Transformers
JSON et al: A Deep Dive into Data Serialization Formats ~ MessagePack
In-Context Learning Revisited: From Mystery to Engineering ~ meta-learning
Deepseek Papers (1/3): mHC - Training Stability at Any Depth
midi-cli-rs: Extending with Custom Mood Packs
midi-cli-rs: Music Generation for AI Coding Agents
midi-cli-rs: Extending with Custom Mood Packs ~ MIDI
midi-cli-rs: Music Generation for AI Coding Agents ~ MIDI
music-pipe-rs: Unix Pipelines for MIDI Composition ~ MIDI
music-pipe-rs: Unix Pipelines for MIDI Composition
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ MIDI
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ minicomputer
Five ML Concepts - #25 ~ miscalibration
RLM: Recursive Language Models for Massive Context ~ MIT
Five ML Concepts - #8 ~ mixed precision training
Five ML Concepts - #27 ~ mixture of experts
Five ML Concepts - #28 ~ mixture of experts
Five ML Concepts - #11 ~ mixture of experts
Five ML Concepts - #18 ~ ML fragility
Five ML Concepts - #8 ~ MLA
Five ML Concepts - #21 ~ MLOps
Five ML Concepts - #22 ~ MLOps
Five ML Concepts - #23 ~ MLOps
Five ML Concepts - #24 ~ MLOps
Small Models (6/6): Which Small AI Fits YOUR Laptop? ~ MMLU benchmark
Small Models (2/6): AI in Your Pocket ~ MobileLLM
Five ML Concepts - #24 ~ mode collapse
Five ML Concepts - #27 ~ model editing
How AI Learns Part 4: Memory-Based Learning ~ model editing
Five ML Concepts - #22 ~ model steerability
RLM: Recursive Language Models for Massive Context
Small Models (1/6): 976 Parameters Beat Billions
Small Models (2/6): AI in Your Pocket
Small Models (3/6): Planner + Doer = Genius
Small Models (4/6): This AI Has a Visible Brain
Small Models (5/6): Max AI Per Watt
Small Models (6/6): Which Small AI Fits YOUR Laptop?
Five ML Concepts - #11 ~ MoE
Five ML Concepts - #15 ~ monitoring
midi-cli-rs: Extending with Custom Mood Packs ~ mood packs
midi-cli-rs: Music Generation for AI Coding Agents ~ mood presets
midi-cli-rs: Extending with Custom Mood Packs
Five ML Concepts - #22 ~ more data beats better models
music-pipe-rs: Unix Pipelines for MIDI Composition ~ motif
Solving Sparse Rewards with Many Eyes ~ multi-agent exploration
DyTopo: Dynamic Topology for Multi-Agent AI ~ multi-agent systems
DyTopo: Dynamic Topology for Multi-Agent AI
Five ML Concepts - #8 ~ multi-head latent attention
Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs
Multi-Hop Reasoning (2/2): The Distribution Trap
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ multi-instrument
music-pipe-rs: Web Demo and Multi-Instrument Arrangements
midi-cli-rs: Extending with Custom Mood Packs ~ music generation
midi-cli-rs: Music Generation for AI Coding Agents ~ music generation
music-pipe-rs: Unix Pipelines for MIDI Composition ~ music generation
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ music-pipe-rs
music-pipe-rs: Unix Pipelines for MIDI Composition
music-pipe-rs: Web Demo and Multi-Instrument Arrangements
midi-cli-rs: Extending with Custom Mood Packs ~ music
midi-cli-rs: Music Generation for AI Coding Agents ~ music
midi-cli-rs: Music Generation for AI Coding Agents
music-pipe-rs: Unix Pipelines for MIDI Composition ~ music
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ music
TBT (2/?): Pipelines on OS/390 ~ MVS
In-Context Learning Revisited: From Mystery to Engineering
N
JSON et al: A Deep Dive into Data Serialization Formats ~ NDJSON
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails ~ negative result
Neural-Net-RS: An Educational Neural Network Platform
Five ML Concepts - #29 ~ neural collapse
Small Models (4/6): This AI Has a Visible Brain ~ neural interpretability
Five ML Concepts - #29 ~ neural network circuits
Five ML Concepts - #28 ~ neural network pruning
Neural-Net-RS: An Educational Neural Network Platform ~ neural network
Neural-Net-RS: An Educational Neural Network Platform
Neural-Net-RS: An Educational Neural Network Platform
Deepseek Papers (1/3): mHC - Training Stability at Any Depth ~ normalization
O
Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation ~ O(1) lookup
Small Models (3/6): Planner + Doer = Genius ~ o3-mini
Cat Finder: Personal Software via Vibe Coding ~ object detection
Small Models (2/6): AI in Your Pocket ~ offline AI
Small Models (5/6): Max AI Per Watt ~ one billion parameters
Cat Finder: Personal Software via Vibe Coding ~ ONNX Runtime
Five ML Concepts - #12 ~ OOD
Many-Eyes Learning: Intrinsic Rewards and Diversity ~ optimistic initialization
Five ML Concepts - #26 ~ optimization metrics
Five ML Concepts - #16 ~ optimization
TBT (2/?): Pipelines on OS/390 ~ OS/390
Five ML Concepts - #12 ~ out-of-distribution
Five ML Concepts - #16 ~ overconfidence
Five ML Concepts - #3 ~ overfitting
P
midi-cli-rs: Extending with Custom Mood Packs
Deepseek Papers (1/3): mHC - Training Stability at Any Depth
Deepseek Papers (2/3): Engram - Conditional Memory for Transformers
Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation
Five ML Concepts - #27 ~ parameter routing
How AI Learns Part 3: Weight-Based Learning ~ Parameter-Efficient Fine-Tuning
How AI Learns Part 6: Toward Continuous Learning ~ Parameter-Efficient Fine-Tuning
How AI Learns Part 7: Designing a Continuous Learning Agent ~ Parameter-Efficient Fine-Tuning
Small Models (1/6): 976 Parameters Beat Billions
JSON et al: A Deep Dive into Data Serialization Formats ~ Parquet
How AI Learns Part 3: Weight-Based Learning ~ PEFT
How AI Learns Part 6: Toward Continuous Learning ~ PEFT
How AI Learns Part 7: Designing a Continuous Learning Agent ~ PEFT
Five ML Concepts - #5 ~ perceptron
Five ML Concepts - #15 ~ perplexity
Cat Finder: Personal Software via Vibe Coding ~ personal software
Neural-Net-RS: An Educational Neural Network Platform ~ personal software
Cat Finder: Personal Software via Vibe Coding
Small Models (5/6): Max AI Per Watt
Small Models (6/6): Which Small AI Fits YOUR Laptop? ~ Phi-2
TBT (3/?): Vector Graphics Games ~ phosphor
Five ML Concepts - #28 ~ pipeline parallelism
TBT (2/?): Pipelines on OS/390
music-pipe-rs: Unix Pipelines for MIDI Composition
Small Models (3/6): Planner + Doer = Genius ~ planner-doer architecture
Small Models (3/6): Planner + Doer = Genius
Five ML Concepts - #21 ~ planning vs prediction
Neural-Net-RS: An Educational Neural Network Platform
MCP: Teaching Claude to Play (and Trash Talk)
midi-cli-rs: Extending with Custom Mood Packs ~ plugins
Small Models (2/6): AI in Your Pocket
TBT (3/?): Vector Graphics Games ~ Pong
Five ML Concepts - #6 ~ positional encoding
Five ML Concepts - #12 ~ precision
Five ML Concepts - #18 ~ preference learning
Five ML Concepts - #5 ~ pre-training
How AI Learns Part 1: The Many Meanings of Learning ~ pretraining
How AI Learns Part 3: Weight-Based Learning ~ pretraining
Towards Continuous LLM Learning (2): Routing Prevents Forgetting
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ printer
Cat Finder: Personal Software via Vibe Coding ~ privacy-first
Five ML Concepts - #21 ~ production rollbacks
TBT (4/?): ToonTalk - Teaching Robots to Program ~ programming by demonstration
TBT (1/?): My First Program Was a Horse Race
TBT (4/?): ToonTalk - Teaching Robots to Program
Five ML Concepts - #6 ~ prompt engineering
Five ML Concepts - #21 ~ prompt injection
Five ML Concepts - #6 ~ prompting
JSON et al: A Deep Dive into Data Serialization Formats ~ Protobuf
JSON et al: A Deep Dive into Data Serialization Formats ~ Protocol Buffers
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ punch cards
Small Models (5/6): Max AI Per Watt ~ Pythia
Q
Many-Eyes Learning: Intrinsic Rewards and Diversity ~ Q-learning
Five ML Concepts - #9 ~ quantization
R
TBT (1/?): My First Program Was a Horse Race
Five ML Concepts - #10 ~ RAG
How AI Learns Part 1: The Many Meanings of Learning ~ RAG
How AI Learns Part 4: Memory-Based Learning ~ RAG
How AI Learns Part 7: Designing a Continuous Learning Agent ~ RAG
How AI Learns Part 5: Context Engineering & Recursive Reasoning
Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs
Multi-Hop Reasoning (2/2): The Distribution Trap
Five ML Concepts - #12 ~ recall
TBT (2/?): Pipelines on OS/390 ~ record-at-a-time
Five ML Concepts - #11 ~ recurrent neural network
Small Models (1/6): 976 Parameters Beat Billions ~ recursive depth
How AI Learns Part 5: Context Engineering & Recursive Reasoning ~ recursive language models
How AI Learns Part 6: Toward Continuous Learning ~ Recursive Language Models
How AI Learns Part 7: Designing a Continuous Learning Agent ~ Recursive Language Models
RLM: Recursive Language Models for Massive Context ~ recursive language models
How AI Learns Part 5: Context Engineering & Recursive Reasoning
RLM: Recursive Language Models for Massive Context
Five ML Concepts - #6 ~ regularization
Five ML Concepts - #9 ~ reinforcement learning from human feedback
How AI Learns Part 3: Weight-Based Learning ~ Reinforcement Learning from Human Feedback
Five ML Concepts - #22 ~ Rejection Sampling Fine-Tuning
Multi-Hop Reasoning (2/2): The Distribution Trap ~ rejection sampling
Five ML Concepts - #4 ~ ReLU
Five ML Concepts - #27 ~ replay buffers
How AI Learns Part 6: Toward Continuous Learning ~ replay
Five ML Concepts - #25 ~ representation learning
Five ML Concepts - #10 ~ retrieval-augmented generation
How AI Learns Part 1: The Many Meanings of Learning ~ Retrieval-Augmented Generation
How AI Learns Part 4: Memory-Based Learning ~ Retrieval-Augmented Generation
How AI Learns Part 7: Designing a Continuous Learning Agent ~ Retrieval-Augmented Generation
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ retro computing
Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation
In-Context Learning Revisited: From Mystery to Engineering
Five ML Concepts - #24 ~ reward hacking
Many-Eyes Learning: Intrinsic Rewards and Diversity
Solving Sparse Rewards with Many Eyes
Five ML Concepts - #9 ~ RLHF
How AI Learns Part 3: Weight-Based Learning ~ RLHF
How AI Learns Part 5: Context Engineering & Recursive Reasoning ~ RLM
How AI Learns Part 6: Toward Continuous Learning ~ RLM
How AI Learns Part 7: Designing a Continuous Learning Agent ~ RLM
RLM: Recursive Language Models for Massive Context ~ RLM
RLM: Recursive Language Models for Massive Context
Five ML Concepts - #11 ~ RNN
TBT (4/?): ToonTalk - Teaching Robots to Program
Five ML Concepts - #14 ~ ROC
Five ML Concepts - #6 ~ RoPE
Five ML Concepts - #6 ~ rotary positional embeddings
How AI Learns Part 2: Catastrophic Forgetting vs Context Rot
Towards Continuous LLM Learning (2): Routing Prevents Forgetting
Five ML Concepts - #22 ~ RSFT
Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs ~ RSFT
Multi-Hop Reasoning (2/2): The Distribution Trap ~ RSFT
Lucy 20%: Upgrading My Home AI Cluster ~ RTX 3090
Cat Finder: Personal Software via Vibe Coding ~ Rust
DyTopo: Dynamic Topology for Multi-Agent AI ~ Rust
Neural-Net-RS: An Educational Neural Network Platform ~ Rust
RLM: Recursive Language Models for Massive Context ~ Rust
TBT (2/?): Pipelines on OS/390 ~ Rust
TBT (3/?): Vector Graphics Games ~ Rust
TBT (4/?): ToonTalk - Teaching Robots to Program ~ Rust
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ Rust
midi-cli-rs: Extending with Custom Mood Packs ~ Rust
midi-cli-rs: Music Generation for AI Coding Agents ~ Rust
music-pipe-rs: Unix Pipelines for MIDI Composition ~ Rust
S
How AI Learns Part 7: Designing a Continuous Learning Agent ~ safety
Five ML Concepts - #29 ~ SAM
Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs ~ scaffolded training
music-pipe-rs: Unix Pipelines for MIDI Composition ~ scale
Five ML Concepts - #17 ~ scaling laws
Many-Eyes Learning: Intrinsic Rewards and Diversity ~ scout diversity
Solving Sparse Rewards with Many Eyes ~ scout-based learning
TBT (1/?): My First Program Was a Horse Race ~ Selectric typeball
Five ML Concepts - #7 ~ self-attention
Five ML Concepts - #29 ~ self-training instability
DyTopo: Dynamic Topology for Multi-Agent AI ~ semantic routing
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ seq command
JSON et al: A Deep Dive into Data Serialization Formats
How AI Learns Part 3: Weight-Based Learning ~ SFT
Five ML Concepts - #17 ~ shadow deployment
Small Models (1/6): 976 Parameters Beat Billions ~ Shakespeare training
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails ~ Share algorithm
Towards Continuous LLM Learning (2): Routing Prevents Forgetting ~ Share algorithm
Many-Eyes Learning: Intrinsic Rewards and Diversity ~ shared Q-table
How AI Learns Part 6: Toward Continuous Learning ~ Share
Five ML Concepts - #29 ~ sharpness-aware minimization
Five ML Concepts - #13 ~ shortcut learning
Five ML Concepts - #4 ~ sigmoid
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails ~ Singular Value Decomposition
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails
Small Models (4/6): This AI Has a Visible Brain ~ small language models
Small Models (6/6): Which Small AI Fits YOUR Laptop? ~ small language models
Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs
Small Models (6/6): Which Small AI Fits YOUR Laptop?
Small Models (1/6): 976 Parameters Beat Billions
Small Models (2/6): AI in Your Pocket
Small Models (3/6): Planner + Doer = Genius
Small Models (4/6): This AI Has a Visible Brain
Small Models (5/6): Max AI Per Watt
Small Models (6/6): Which Small AI Fits YOUR Laptop?
Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs ~ SmolLM-135M
Multi-Hop Reasoning (2/2): The Distribution Trap ~ SmolLM-360M
Small Models (6/6): Which Small AI Fits YOUR Laptop? ~ SmolLM
Five ML Concepts - #25 ~ soft labels
Five ML Concepts - #11 ~ softmax
Cat Finder: Personal Software via Vibe Coding
Welcome to Software Wrighter Lab
Solving Sparse Rewards with Many Eyes
midi-cli-rs: Music Generation for AI Coding Agents ~ SoundFont
midi-cli-rs: Extending with Custom Mood Packs ~ sound
midi-cli-rs: Music Generation for AI Coding Agents ~ sound
music-pipe-rs: Unix Pipelines for MIDI Composition ~ sound
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ sound
Small Models (4/6): This AI Has a Visible Brain ~ sparse activations
Five ML Concepts - #28 ~ sparse activation
Deepseek Papers (2/3): Engram - Conditional Memory for Transformers ~ sparse attention
Small Models (4/6): This AI Has a Visible Brain ~ sparse coding
DyTopo: Dynamic Topology for Multi-Agent AI ~ sparse graphs
Solving Sparse Rewards with Many Eyes ~ sparse rewards
Solving Sparse Rewards with Many Eyes
Five ML Concepts - #5 ~ speculative decoding
Small Models (5/6): Max AI Per Watt ~ speculative decoding
Five ML Concepts - #14 ~ spurious correlations
How AI Learns Part 2: Catastrophic Forgetting vs Context Rot ~ stability plasticity tradeoff
Deepseek Papers (1/3): mHC - Training Stability at Any Depth
Small Models (5/6): Max AI Per Watt ~ StableLM
How AI Learns Part 6: Toward Continuous Learning ~ subspace regularization
Five ML Concepts - #4 ~ superposition
How AI Learns Part 3: Weight-Based Learning ~ Supervised Fine-Tuning
midi-cli-rs: Music Generation for AI Coding Agents ~ suspense
Lucy 20%: Upgrading My Home AI Cluster ~ SVD
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails ~ SVD
Towards Continuous LLM Learning (2): Routing Prevents Forgetting ~ SVD
Small Models (6/6): Which Small AI Fits YOUR Laptop? ~ synthetic training data
midi-cli-rs: Extending with Custom Mood Packs ~ synthwave
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ system emulator
Five ML Concepts - #22 ~ system reliability
Welcome to Software Wrighter Lab ~ systems programming
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing
T
MCP: Teaching Claude to Play (and Trash Talk)
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ TBT
TBT (1/?): My First Program Was a Horse Race
TBT (2/?): Pipelines on OS/390
TBT (3/?): Vector Graphics Games
TBT (4/?): ToonTalk - Teaching Robots to Program
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing
MCP: Teaching Claude to Play (and Trash Talk)
TBT (4/?): ToonTalk - Teaching Robots to Program
Five ML Concepts - #2 ~ temperature sampling
TBT (3/?): Vector Graphics Games ~ Tempest
Five ML Concepts - #28 ~ tensor parallelism
Lucy 20%: Upgrading My Home AI Cluster ~ text-to-image
Lucy 20%: Upgrading My Home AI Cluster ~ text-to-video
Small Models (2/6): AI in Your Pocket ~ therapist chatbot
Five ML Concepts - #12 ~ throughput
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ Throwback Thursday
MCP: Teaching Claude to Play (and Trash Talk) ~ tic-tac-toe
Small Models (5/6): Max AI Per Watt ~ TinyLlama
Five ML Concepts - #3 ~ tokenization
JSON et al: A Deep Dive into Data Serialization Formats ~ TOML
midi-cli-rs: Extending with Custom Mood Packs ~ TOML
Five ML Concepts - #23 ~ tool use
How AI Learns Part 5: Context Engineering & Recursive Reasoning ~ tool use
TBT (4/?): ToonTalk - Teaching Robots to Program ~ ToonTalk
TBT (4/?): ToonTalk - Teaching Robots to Program
JSON et al: A Deep Dive into Data Serialization Formats ~ TOON
DyTopo: Dynamic Topology for Multi-Agent AI
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails
Towards Continuous LLM Learning (2): Routing Prevents Forgetting
How AI Learns Part 6: Toward Continuous Learning
Five ML Concepts - #16 ~ train validation test split
Five ML Concepts - #24 ~ training contamination
Deepseek Papers (1/3): mHC - Training Stability at Any Depth ~ training stability
Five ML Concepts - #26 ~ training transformations
Deepseek Papers (1/3): mHC - Training Stability at Any Depth
Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs
Five ML Concepts - #4 ~ transfer learning
Five ML Concepts - #1 ~ transformer architecture
Deepseek Papers (2/3): Engram - Conditional Memory for Transformers
In-Context Learning Revisited: From Mystery to Engineering ~ transformers
Multi-Hop Reasoning (2/2): The Distribution Trap
MCP: Teaching Claude to Play (and Trash Talk) ~ trash talk
MCP: Teaching Claude to Play (and Trash Talk)
TBT (2/?): Pipelines on OS/390 ~ TSO Pipelines
TBT (4/?): ToonTalk - Teaching Robots to Program ~ tt-rs
U
Five ML Concepts - #20 ~ uncertainty estimation
Five ML Concepts - #13 ~ universal approximation theorem
music-pipe-rs: Unix Pipelines for MIDI Composition ~ Unix pipes
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ Unix pipes
music-pipe-rs: Unix Pipelines for MIDI Composition
Lucy 20%: Upgrading My Home AI Cluster
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails ~ UWSH
V
Five ML Concepts - #20 ~ VAE
Deepseek Papers (1/3): mHC - Training Stability at Any Depth ~ vanishing gradients
Five ML Concepts - #20 ~ variational autoencoders
How AI Learns Part 4: Memory-Based Learning ~ vector database
TBT (3/?): Vector Graphics Games ~ vector graphics
TBT (3/?): Vector Graphics Games
Cat Finder: Personal Software via Vibe Coding
Cat Finder: Personal Software via Vibe Coding ~ vibe coding
Neural-Net-RS: An Educational Neural Network Platform ~ vibe coding
TBT (2/?): Pipelines on OS/390 ~ vibe coding
midi-cli-rs: Music Generation for AI Coding Agents ~ vibe coding
Cat Finder: Personal Software via Vibe Coding
Small Models (4/6): This AI Has a Visible Brain
Five ML Concepts - #4 ~ vision-language models
TBT (4/?): ToonTalk - Teaching Robots to Program ~ visual programming
Five ML Concepts - #4 ~ VLM
Lucy 20%: Upgrading My Home AI Cluster ~ voice cloning
Lucy 20%: Upgrading My Home AI Cluster ~ VoxCPM
W
Lucy 20%: Upgrading My Home AI Cluster ~ Wan 2.2
Five ML Concepts - #24 ~ warmup
Neural-Net-RS: An Educational Neural Network Platform ~ WASM
RLM: Recursive Language Models for Massive Context ~ WASM
Small Models (5/6): Max AI Per Watt
midi-cli-rs: Music Generation for AI Coding Agents ~ WAV
music-pipe-rs: Web Demo and Multi-Instrument Arrangements ~ web demo
Neural-Net-RS: An Educational Neural Network Platform ~ WebAssembly
RLM: Recursive Language Models for Massive Context ~ WebAssembly
TBT (3/?): Vector Graphics Games ~ WebAssembly
TBT (4/?): ToonTalk - Teaching Robots to Program ~ WebAssembly
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ WebAssembly
music-pipe-rs: Web Demo and Multi-Instrument Arrangements
Five ML Concepts - #17 ~ weight decay
Five ML Concepts - #15 ~ weight initialization
How AI Learns Part 3: Weight-Based Learning
Welcome to Software Wrighter Lab
TBT (3/?): Vector Graphics Games ~ wgpu
Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs
Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails
Small Models (6/6): Which Small AI Fits YOUR Laptop?
Welcome to Software Wrighter Lab
X
Lucy 20%: Upgrading My Home AI Cluster ~ X99 motherboard
Neural-Net-RS: An Educational Neural Network Platform ~ XOR problem
Y
JSON et al: A Deep Dive into Data Serialization Formats ~ YAML
TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing ~ Yew
Cat Finder: Personal Software via Vibe Coding ~ YOLOv8
Welcome to Software Wrighter Lab ~ YouTube

GitHub repositories referenced in posts, sorted alphabetically by URL.

Repository | Blog Post
https://github.com/softwarewrighter/bdh Small Models (4/6): This AI Has a Visible Brain
https://github.com/softwarewrighter/billion-llm Small Models (5/6): Max AI Per Watt
https://github.com/softwarewrighter/dytopo-rs DyTopo: Dynamic Topology for Multi-Agent AI
https://github.com/softwarewrighter/efficient-llm Small Models (6/6): Which Small AI Fits YOUR Laptop?
https://github.com/softwarewrighter/engram-poc Deepseek Papers (2/3): Engram - Conditional Memory for Transformers
https://github.com/softwarewrighter/engram-poc Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation
https://github.com/softwarewrighter/mHC-poc Deepseek Papers (1/3): mHC - Training Stability at Any Depth
https://github.com/softwarewrighter/many-eyes-learning Many-Eyes Learning: Intrinsic Rewards and Diversity
https://github.com/softwarewrighter/many-eyes-learning Solving Sparse Rewards with Many Eyes
https://github.com/softwarewrighter/midi-cli-rs midi-cli-rs: Extending with Custom Mood Packs
https://github.com/softwarewrighter/midi-cli-rs midi-cli-rs: Music Generation for AI Coding Agents
https://github.com/softwarewrighter/multi-hop-reasoning Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs
https://github.com/softwarewrighter/multi-hop-reasoning Multi-Hop Reasoning (2/2): The Distribution Trap
https://github.com/softwarewrighter/music-pipe-rs music-pipe-rs: Unix Pipelines for MIDI Composition
https://github.com/softwarewrighter/music-pipe-rs music-pipe-rs: Web Demo and Multi-Instrument Arrangements
https://github.com/softwarewrighter/pocket-llm Small Models (2/6): AI in Your Pocket
https://github.com/softwarewrighter/rlm-project RLM: Recursive Language Models for Massive Context
https://github.com/softwarewrighter/sleepy-coder Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails
https://github.com/softwarewrighter/sleepy-coder Towards Continuous LLM Learning (2): Routing Prevents Forgetting
https://github.com/softwarewrighter/train-trm Small Models (1/6): 976 Parameters Beat Billions
https://github.com/softwarewrighter/vectorcade-games TBT (3/?): Vector Graphics Games
https://github.com/softwarewrighter/viz-hrm-ft Small Models (3/6): Planner + Doer = Genius
https://github.com/sw-comp-history/apl-horse-race TBT (1/?): My First Program Was a Horse Race
https://github.com/sw-comp-history/ibm-1130-rs TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing
https://github.com/sw-comp-history/pipelines-rs TBT (2/?): Pipelines on OS/390
https://github.com/sw-fun/tt-rs TBT (4/?): ToonTalk - Teaching Robots to Program
https://github.com/sw-game-dev/game-mcp-poc MCP: Teaching Claude to Play (and Trash Talk)
https://github.com/sw-ml-study/cat-finder Cat Finder: Personal Software via Vibe Coding
https://github.com/sw-ml-study/neural-net-rs Neural-Net-RS: An Educational Neural Network Platform
https://github.com/weagan/Engram Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation

Videos referenced in posts, sorted alphabetically by video title.

Video | Blog Post
90s Pipelines Rust/WASM homage #TBT TBT (2/?): Pipelines on OS/390
976 parameters is more than billions?! Small Models (1/6): 976 Parameters Beat Billions
AI in Your Pocket Small Models (2/6): AI in Your Pocket
Arcade Wireframes: A Vector Story #TBT TBT (3/?): Vector Graphics Games
Five ML Concepts - #1 Five ML Concepts - #1
Five ML Concepts - #21 Five ML Concepts - #21
Five ML Concepts - #22 Five ML Concepts - #22
Five ML Concepts - #23 Five ML Concepts - #23
Five ML Concepts - #24 Five ML Concepts - #24
Five ML Concepts - #25 Five ML Concepts - #25
Five ML Concepts - #26 Five ML Concepts - #26
Five ML Concepts - #27 Five ML Concepts - #27
Five ML Concepts - #28 Five ML Concepts - #28
Five ML Concepts - #29 Five ML Concepts - #29
Five ML Concepts - #2 Five ML Concepts - #2
Five ML Concepts - #3 Five ML Concepts - #3
Five ML Concepts - #4 Five ML Concepts - #4
Five ML Concepts - #5 Five ML Concepts - #5
Five ML Concepts - #6 Five ML Concepts - #6
Five ML Concepts - #7 Five ML Concepts - #7
Five ML Concepts - #8 Five ML Concepts - #8
Five ML Concepts - #9 Five ML Concepts - #9
Five ML Concepts - #10 Five ML Concepts - #10
Five ML Concepts - #11 Five ML Concepts - #11
Five ML Concepts - #12 Five ML Concepts - #12
Five ML Concepts - #13 Five ML Concepts - #13
Five ML Concepts - #14 Five ML Concepts - #14
Five ML Concepts - #15 Five ML Concepts - #15
Five ML Concepts - #16 Five ML Concepts - #16
Five ML Concepts - #17 Five ML Concepts - #17
Five ML Concepts - #18 Five ML Concepts - #18
Five ML Concepts - #19 Five ML Concepts - #19
Five ML Concepts - #20 Five ML Concepts - #20
Given enough eyeballs... Solving Sparse Rewards with Many Eyes
Greek Code, No Lowercase #TBT TBT (1/?): My First Program Was a Horse Race
IBM 1130 System Emulator TBT (5/?): IBM 1130 System Emulator - Experience 1960s Computing
ICL Revisited: From Mystery to Engineering In-Context Learning Revisited: From Mystery to Engineering
JSON or Something Better? JSON et al: A Deep Dive into Data Serialization Formats
LLM Learns While You Sleep Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails
LLM with Training Wheels Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs
Local Cat Detection in Rust Cat Finder: Personal Software via Vibe Coding
Lucy 20% - AI Cluster Upgrade Lucy 20%: Upgrading My Home AI Cluster
Many-Eyes Learning: Watch AI Scouts Explore Many-Eyes Learning: Intrinsic Rewards and Diversity
Max AI Per Watt Small Models (5/6): Max AI Per Watt
midi-cli-rs: Extending with Custom Mood Packs midi-cli-rs: Extending with Custom Mood Packs
Music tool for AI Agents, Built in Rust midi-cli-rs: Music Generation for AI Coding Agents
music-pipe-rs: Web Demo and Multi-Instrument Arrangements music-pipe-rs: Web Demo and Multi-Instrument Arrangements
Planner + Doer = Genius Small Models (3/6): Planner + Doer = Genius
Recursive Language Model implemented, evaluated, explained RLM: Recursive Language Models for Massive Context
Smarter Agent Communication DyTopo: Dynamic Topology for Multi-Agent AI
This AI Has a Visible Brain Small Models (4/6): This AI Has a Visible Brain
Trash Talkin' Tic Tac Toe MCP: Teaching Claude to Play (and Trash Talk)
Vibe Coding a 90s Classic: ToonTalk in Rust #TBT TBT (4/?): ToonTalk - Teaching Robots to Program
Watch a Neural Network Learn Neural-Net-RS: An Educational Neural Network Platform
Which Small AI Fits YOUR Laptop? Small Models (6/6): Which Small AI Fits YOUR Laptop?
Why Your LLM Memory Needs a Gate Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation

Research papers referenced in posts, sorted alphabetically by title.

Paper | Blog Post
A Baseline for Detecting Misclassified and Out-of-Distribution Examples Five ML Concepts - #12
A Comprehensive Survey of Continual Learning How AI Learns Part 2: Catastrophic Forgetting vs Context Rot
A Mathematical Framework for Transformer Circuits Five ML Concepts - #29
A Survey of Loss Functions for Deep Neural Networks Five ML Concepts - #3
A Survey of Quantization Methods for Efficient Neural Network Inference Five ML Concepts - #9
A survey on Image Data Augmentation for Deep Learning Five ML Concepts - #26
A Survey on Transfer Learning Five ML Concepts - #3
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour Five ML Concepts - #24
Adam: A Method for Stochastic Optimization Five ML Concepts - #4
Addressing Cold Start in Recommender Systems Five ML Concepts - #14
An Introduction to Variational Autoencoders Five ML Concepts - #26
An overview of gradient descent optimization algorithms Five ML Concepts - #2
Attention Is All You Need Five ML Concepts - #1
Attention Is All You Need Five ML Concepts - #6
Attention Is All You Need Five ML Concepts - #7
Auto-Encoding Variational Bayes Five ML Concepts - #20
Auto-Encoding Variational Bayes Five ML Concepts - #5
Batch Normalization: Accelerating Deep Network Training Five ML Concepts - #16
BERT: Pre-training of Deep Bidirectional Transformers Five ML Concepts - #5
BERT: Pre-training of Deep Bidirectional Transformers Five ML Concepts - #6
Between accurate prediction and poor decision making: the AI/ML gap Five ML Concepts - #21
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Five ML Concepts - #11
Chain-of-Thought Prompting How AI Learns Part 5: Context Engineering & Recursive Reasoning
Concrete Problems in AI Safety Five ML Concepts - #24
Constitutional AI: Harmlessness from AI Feedback Five ML Concepts - #26
Controllable Generation from Pre-trained Language Models Five ML Concepts - #22
Curiosity-driven Exploration by Self-Supervised Prediction Many-Eyes Learning: Intrinsic Rewards and Diversity
Curriculum Learning Five ML Concepts - #19
Cyclical Learning Rates Five ML Concepts - #2
Decoupled Weight Decay Regularization Five ML Concepts - #17
Deep Double Descent: Where Bigger Models and More Data Can Hurt Five ML Concepts - #25
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model Five ML Concepts - #8
Denoising Diffusion Probabilistic Models Five ML Concepts - #8
Direct Preference Optimization Five ML Concepts - #2
Direct Preference Optimization How AI Learns Part 1: The Many Meanings of Learning
Direct Preference Optimization How AI Learns Part 3: Weight-Based Learning
Distilling the Knowledge in a Neural Network Five ML Concepts - #10
Distilling the Knowledge in a Neural Network How AI Learns Part 3: Weight-Based Learning
Distribution Shift Five ML Concepts - #18
DyTopo: Dynamic Topology Routing for Multi-Agent Reasoning DyTopo: Dynamic Topology for Multi-Agent AI
Editing Factual Knowledge in Language Models How AI Learns Part 4: Memory-Based Learning
Editing Large Language Models: Problems, Methods, and Opportunities Five ML Concepts - #27
Efficient Large-Scale Language Model Training on GPU Clusters Five ML Concepts - #12
Efficient Transformers: A Survey Five ML Concepts - #18
ELLA: Subspace Learning for Lifelong Machine Learning How AI Learns Part 2: Catastrophic Forgetting vs Context Rot
ELLA: Subspace Learning for Lifelong Machine Learning How AI Learns Part 6: Toward Continuous Learning
Emergent Abilities of Large Language Models Five ML Concepts - #23
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling Five ML Concepts - #21
Engram: Conditional Memory via Scalable Lookup How AI Learns Part 1: The Many Meanings of Learning
Engram: Conditional Memory via Scalable Lookup How AI Learns Part 4: Memory-Based Learning
Engram: Conditional Memory via Scalable Lookup How AI Learns Part 7: Designing a Continuous Learning Agent
Engram: Conditional Memory via Scalable Lookup Deepseek Papers (2/3): Engram - Conditional Memory for Transformers
Engram: Conditional Memory via Scalable Lookup Deepseek Papers (3/3): Engram Revisited - From Emulation to Implementation
Experience Replay for Continual Learning Five ML Concepts - #27
Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift Five ML Concepts - #15
Fast Inference from Transformers via Speculative Decoding Five ML Concepts - #5
Fast Transformer Decoding Five ML Concepts - #8
FlashAttention: Fast and Memory-Efficient Exact Attention Five ML Concepts - #9
FOREVER: Model-Centric Replay How AI Learns Part 6: Toward Continuous Learning
Generative Adversarial Nets Five ML Concepts - #24
Goodhart's Law and Machine Learning: A Structural Perspective Five ML Concepts - #26
GQA: Training Generalized Multi-Query Transformer Models Five ML Concepts - #7
Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets Five ML Concepts - #29
Hierarchical Reasoning Model → Small Models (3/6): Planner + Doer = Genius
ImageNet Classification with Deep Convolutional Neural Networks → Five ML Concepts - #10
Improving Interactive In-Context Learning from Natural Language Feedback → In-Context Learning Revisited: From Mystery to Engineering
Intriguing properties of neural networks → Five ML Concepts - #25
IRPO: Intrinsic Reward Policy Optimization → Many-Eyes Learning: Intrinsic Rewards and Diversity
IRPO → Solving Sparse Rewards with Many Eyes
Jailbroken: How Does LLM Safety Training Fail? → Five ML Concepts - #21
KG-Guided RAG (arXiv) → Multi-Hop Reasoning (1/2): Training Wheels for Small LLMs
KG-Guided RAG (arXiv) → Multi-Hop Reasoning (2/2): The Distribution Trap
Language Models are Few-Shot Learners → Five ML Concepts - #10
Language Models are Few-Shot Learners → Five ML Concepts - #5
Language Models are Few-Shot Learners → Five ML Concepts - #6
Language Models are Few-Shot Learners → In-Context Learning Revisited: From Mystery to Engineering
Leakage in Data Mining: Formulation, Detection, and Avoidance → Five ML Concepts - #24
Learning to summarize from human feedback → Five ML Concepts - #18
Learning Transferable Visual Models (CLIP) → Five ML Concepts - #4
Long Short-Term Memory → Five ML Concepts - #22
LoRA: Low-Rank Adaptation of Large Language Models → Five ML Concepts - #3
LoRA: Low-Rank Adaptation of Large Language Models → How AI Learns Part 1: The Many Meanings of Learning
LoRA: Low-Rank Adaptation of Large Language Models → How AI Learns Part 3: Weight-Based Learning
LoRA: Low-Rank Adaptation of Large Language Models → How AI Learns Part 7: Designing a Continuous Learning Agent
LoRA → Small Models (5/6): Max AI Per Watt
Mamba: Linear-Time Sequence Modeling → Five ML Concepts - #1
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism → Five ML Concepts - #28
mHC: Manifold-Constrained Hyper-Connections → Deepseek Papers (1/3): mHC - Training Stability at Any Depth
Mixed Precision Training → Five ML Concepts - #8
MobileLLM (ICML 2024) → Small Models (2/6): AI in Your Pocket
Neural Machine Translation by Jointly Learning to Align and Translate → Five ML Concepts - #2
Neural Machine Translation of Rare Words with Subword Units → Five ML Concepts - #3
Neural Turing Machines → Five ML Concepts - #27
On Calibration of Modern Neural Networks → Five ML Concepts - #13
On Calibration of Modern Neural Networks → Five ML Concepts - #16
On Calibration of Modern Neural Networks → Five ML Concepts - #25
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima → Five ML Concepts - #23
On Large-Batch Training for Deep Learning → Five ML Concepts - #12
On the Difficulty of Training Recurrent Neural Networks → Five ML Concepts - #14
On the Properties of Neural Machine Translation → Five ML Concepts - #2
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer → Five ML Concepts - #11
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer → Five ML Concepts - #27
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer → Five ML Concepts - #28
Overcoming catastrophic forgetting in neural networks (EWC) → Five ML Concepts - #27
Overcoming Catastrophic Forgetting in Neural Networks → Five ML Concepts - #15
Overcoming Catastrophic Forgetting in Neural Networks → How AI Learns Part 2: Catastrophic Forgetting vs Context Rot
Overcoming Catastrophic Forgetting in Neural Networks → How AI Learns Part 6: Toward Continuous Learning
Parameter-Efficient Transfer Learning for NLP → How AI Learns Part 3: Weight-Based Learning
Pathway (Sparse Coding) → Small Models (4/6): This AI Has a Visible Brain
Prevalence of Neural Collapse during the terminal phase of deep learning training → Five ML Concepts - #29
Prompt Injection attack against LLM-integrated Applications → Five ML Concepts - #21
Reagent: Reasoning Reward Models for Agents → Many-Eyes Learning: Intrinsic Rewards and Diversity
Reagent → Solving Sparse Rewards with Many Eyes
REALM: Retrieval-Augmented Language Model Pre-Training → How AI Learns Part 4: Memory-Based Learning
Recursive Language Models → How AI Learns Part 2: Catastrophic Forgetting vs Context Rot
Recursive Language Models → How AI Learns Part 5: Context Engineering & Recursive Reasoning
Recursive Language Models → How AI Learns Part 7: Designing a Continuous Learning Agent
Recursive Language Models → RLM: Recursive Language Models for Massive Context
Relational Inductive Biases, Deep Learning, and Graph Networks → Five ML Concepts - #12
Representation Learning: A Review and New Perspectives → Five ML Concepts - #25
Rethinking the Inception Architecture for Computer Vision → Five ML Concepts - #17
Rethinking the Inception Architecture for Computer Vision → Five ML Concepts - #25
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks → Five ML Concepts - #10
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks → How AI Learns Part 1: The Many Meanings of Learning
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks → How AI Learns Part 4: Memory-Based Learning
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks → How AI Learns Part 7: Designing a Continuous Learning Agent
RoFormer: Enhanced Transformer with Rotary Position Embedding → Five ML Concepts - #6
Scaling Laws for Neural Language Models → Five ML Concepts - #17
Scaling Relationship on Learning Mathematical Reasoning with LLMs → Five ML Concepts - #22
Sequence to Sequence Learning with Neural Networks → Five ML Concepts - #10
SGDR: Stochastic Gradient Descent with Warm Restarts → Five ML Concepts - #23
Share: Shared LoRA Subspaces for Continual Learning → How AI Learns Part 2: Catastrophic Forgetting vs Context Rot
Share: Shared LoRA Subspaces for Continual Learning → How AI Learns Part 6: Toward Continuous Learning
Share: Shared LoRA Subspaces for Continual Learning → How AI Learns Part 7: Designing a Continuous Learning Agent
Share: Shared LoRA Subspaces for Continual Learning → Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails
Share: Shared LoRA Subspaces for Continual Learning → Towards Continuous LLM Learning (2): Routing Prevents Forgetting
Sharpness-Aware Minimization for Efficiently Improving Generalization → Five ML Concepts - #29
Shortcut Learning in Deep Neural Networks → Five ML Concepts - #13
Speculative Decoding Paper → Small Models (5/6): Max AI Per Watt
Stochastic Gradient Descent as Approximate Bayesian Inference → Five ML Concepts - #20
Survey of Hallucination in NLG → Five ML Concepts - #1
Test-Time Training for Language Models → How AI Learns Part 5: Context Engineering & Recursive Reasoning
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks → Five ML Concepts - #28
The Universal Weight Subspace Hypothesis → Towards Continuous LLM Learning (1): Sleepy Coder - When Fine-Tuning Fails
The Universal Weight Subspace Hypothesis → Towards Continuous LLM Learning (2): Routing Prevents Forgetting
The Unreasonable Effectiveness of Data → Five ML Concepts - #22
Tiny Recursive Model → Small Models (1/6): 976 Parameters Beat Billions
Toolformer: Language Models Can Teach Themselves to Use Tools → Five ML Concepts - #23
Towards A Rigorous Science of Interpretable Machine Learning → Five ML Concepts - #20
Training Compute-Optimal Large Language Models (Chinchilla) → Five ML Concepts - #28
Training Deep Nets with Sublinear Memory Cost → Five ML Concepts - #13
Training Language Models to Follow Instructions with Human Feedback → How AI Learns Part 3: Weight-Based Learning
Training language models to follow instructions with human feedback → Five ML Concepts - #9
Transformers Learn In-Context by Gradient Descent → In-Context Learning Revisited: From Mystery to Engineering
Understanding Deep Learning Requires Rethinking Generalization → Five ML Concepts - #16
Understanding the Difficulty of Training Deep Feedforward Neural Networks → Five ML Concepts - #15
Understanding the Difficulty of Training Deep Feedforward Neural Networks → Neural-Net-RS: An Educational Neural Network Platform
Visualizing the Loss Landscape of Neural Nets → Five ML Concepts - #14
What Can Transformers Learn In-Context? → How AI Learns Part 1: The Many Meanings of Learning
What Can Transformers Learn In-Context? → How AI Learns Part 5: Context Engineering & Recursive Reasoning
What Explains In-Context Learning in Transformers? → In-Context Learning Revisited: From Mystery to Engineering
What Uncertainties Do We Need in Bayesian Deep Learning? → Five ML Concepts - #20
Word2Vec → Five ML Concepts - #1