Five ML Concepts - #26
424 words • 3 min read

5 machine learning concepts. Under 30 seconds each.
| Resource | Link |
|---|---|
| Papers | Links in References section |
| Video | Five ML Concepts #26 |
References
| Concept | Reference |
|---|---|
| Data Augmentation | A survey on Image Data Augmentation (Shorten & Khoshgoftaar 2019) |
| Caching Strategies | Systems engineering practice (no canonical paper) |
| Constitutional AI | Constitutional AI: Harmlessness from AI Feedback (Bai et al. 2022) |
| Goodhart’s Law | Goodhart’s Law and Machine Learning (Sevilla et al. 2022) |
| Manifold Hypothesis | An Introduction to Variational Autoencoders (Kingma & Welling 2019) |
Today’s Five
1. Data Augmentation
Creating additional training examples using label-preserving transformations. Rotate, flip, crop, or color-shift images without changing what they represent.
Effectively increases dataset size and improves generalization.
Like practicing piano pieces at different tempos to build flexibility.
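A minimal sketch of label-preserving transforms, using a small numpy array as a stand-in image (the `augment` helper and its transform choices are illustrative, not from any particular library):

```python
import numpy as np

def augment(image, rng):
    """Apply random label-preserving transforms to an H x W image array."""
    if rng.random() < 0.5:
        image = np.fliplr(image)   # horizontal flip: a face is still a face
    k = rng.integers(0, 4)
    image = np.rot90(image, k)     # random 90-degree rotation
    return image

rng = np.random.default_rng(0)
img = np.arange(16).reshape(4, 4)
# Four extra training "views" of the same example, same label each time.
augmented = [augment(img, rng) for _ in range(4)]
```

Each view keeps the original pixels (just rearranged), so the label is unchanged while the model sees more variety.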
2. Caching Strategies
Storing previous computation results to reduce repeated work and latency. Cache embeddings, KV states, or frequently requested outputs.
Essential for production inference at scale.
Like keeping frequently used books on your desk instead of the library.
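A toy sketch of output caching with Python's built-in `functools.lru_cache` (the `embed` function is a hypothetical stand-in for an expensive embedding call):

```python
from functools import lru_cache

call_count = 0  # tracks how often the expensive work actually runs

@lru_cache(maxsize=1024)
def embed(text: str) -> tuple:
    """Stand-in for an expensive embedding computation; results are memoized."""
    global call_count
    call_count += 1
    return tuple(ord(c) % 7 for c in text)  # toy "embedding"

embed("hello")
embed("world")
embed("hello")  # repeated input: served from cache, no recomputation
```

Two distinct inputs mean two real computations; the repeated request is a cache hit, which is exactly the latency win in production inference.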
3. Constitutional AI
Training models to follow explicit written principles alongside other alignment methods. The constitution provides clear rules for behavior.
Models critique and revise their own outputs against these principles.
Like giving someone written house rules instead of vague instructions.
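A hypothetical sketch of the critique-and-revise loop; in the real method the model itself performs both steps, so the `critique` and `revise` stubs below are crude placeholders:

```python
CONSTITUTION = [
    "Do not reveal credentials or personal data.",
    "Refuse requests for harmful instructions.",
]

def critique(draft: str, principle: str) -> str:
    # Stub: a real system would ask the model to judge the draft
    # against the written principle.
    return "violates" if "password" in draft.lower() else "ok"

def revise(draft: str, principle: str) -> str:
    # Stub: a real system would ask the model to rewrite the draft.
    return "I can't share credentials, but here is how to reset them."

def constitutional_pass(draft: str) -> str:
    """Check a draft against each principle; revise when one is violated."""
    for principle in CONSTITUTION:
        if critique(draft, principle) != "ok":
            draft = revise(draft, principle)
    return draft

out = constitutional_pass("The admin password is hunter2.")
```

The structure is the point: explicit written rules, a critique step, and a revision step, rather than behavior shaped only by implicit feedback.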
4. Goodhart’s Law
When a measure becomes a target, it can stop being a good measure. Optimizing for a proxy metric can diverge from the true objective.
A core challenge in reward modeling and evaluation design.
Like studying only for the test instead of learning the subject.
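The studying analogy can be made concrete with a toy model (the functions and the effort budget are invented for illustration): the true objective rewards only real learning, while the proxy also rewards test tricks.

```python
def true_objective(effort_on_subject, effort_on_tricks):
    """What we actually care about: understanding the subject."""
    return effort_on_subject

def proxy_metric(effort_on_subject, effort_on_tricks):
    """The measured score: test tricks count double here."""
    return effort_on_subject + 2 * effort_on_tricks

# An optimizer with a fixed budget of 10 effort units maximizes the proxy...
best = max(
    ((s, 10 - s) for s in range(11)),
    key=lambda split: proxy_metric(*split),
)
# ...by pouring everything into tricks, so the proxy looks great
# while the true objective collapses to zero.
```

Once the proxy became the optimization target, it stopped tracking the thing it was meant to measure.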
5. Manifold Hypothesis
The idea that real-world data lies on lower-dimensional structures within high-dimensional space. Images of faces don’t fill all possible pixel combinations.
This structure is what representation learning exploits.
Like faces varying along a few key features instead of every pixel independently.
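A quick numerical sketch of the idea (the sizes and the linear generative model are invented for illustration): build 50-dimensional data from only 3 latent factors, then check its effective dimensionality with a singular value decomposition.

```python
import numpy as np

rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 3))   # a few underlying "key features"
mixing = rng.normal(size=(3, 50))    # map factors into 50-dim "pixel" space
data = latent @ mixing               # 100 points living in R^50

centered = data - data.mean(axis=0)
singular_values = np.linalg.svd(centered, compute_uv=False)
# Only as many non-negligible singular values as there are latent factors:
# the data spans a 3-dimensional subspace of the 50-dimensional space.
rank = int(np.sum(singular_values > 1e-8))
```

Real data is curved and noisy rather than exactly linear, but the same principle holds: far fewer effective directions of variation than raw dimensions, which is the structure representation learning exploits.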
Quick Reference
| Concept | One-liner |
|---|---|
| Data Augmentation | Expanding training data with transformations |
| Caching Strategies | Reducing latency by reusing computation |
| Constitutional AI | Training models to follow explicit principles |
| Goodhart’s Law | Optimizing metrics distorts objectives |
| Manifold Hypothesis | Data lies on lower-dimensional structures |
Short, accurate ML explainers. Follow for more.
Part 26 of the Five ML Concepts series. View all parts | Next: Part 27 →
