# Five ML Concepts - #12
488 words • 3 min read • Abstract

5 machine learning concepts. Under 30 seconds each.

| Resource | Link |
|---|---|
| Papers | Links in References section |
| Video | Five ML Concepts #12 |

## References
| Concept | Reference |
|---|---|
| Precision/Recall | The Truth of the F-Measure (Sasaki 2007) |
| OOD Detection | A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks (Hendrycks & Gimpel 2017) |
| Batch Size | On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima (Keskar et al. 2017) |
| Inductive Bias | Relational Inductive Biases, Deep Learning, and Graph Networks (Battaglia et al. 2018) |
| Latency/Throughput | Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM (Narayanan et al. 2021) |

## Today’s Five
### 1. Precision vs Recall
Precision measures how often positive predictions are correct. Recall measures how many actual positives are successfully found. Improving one often reduces the other.
The tradeoff depends on your application: spam filters favor precision (false alarms annoy users), while medical screening favors recall (missed cases are costly).
Like a search party: you can find everyone but raise false alarms, or be very certain and miss some people.
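In code, both metrics (and their harmonic mean, the F-measure) fall straight out of the confusion-matrix counts. A minimal sketch with made-up spam-filter numbers:

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0  # correct positive predictions
    recall = tp / (tp + fn) if tp + fn else 0.0     # actual positives found
    return precision, recall

def f1(precision, recall):
    """Harmonic mean of precision and recall (the F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# A spam filter flags 10 emails, 8 of which are truly spam (TP=8, FP=2),
# but it lets 4 spam emails through (FN=4):
p, r = precision_recall(tp=8, fp=2, fn=4)  # p = 0.8, r ≈ 0.667
```

Note the asymmetry: adding false positives hurts only precision, while adding false negatives hurts only recall.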
### 2. OOD Inputs (Out-of-Distribution)
Data that differs significantly from what the model saw during training. Models may fail silently or produce confident but wrong answers.
Detecting OOD inputs is an active research area for safer AI deployment.
Like asking a chef trained only in Italian food to make sushi.
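Hendrycks & Gimpel's baseline scores each input by its maximum softmax probability (MSP): a peaked, confident prediction suggests in-distribution data, while a near-uniform one may be OOD. A minimal sketch (the 0.7 threshold is a hypothetical value you would tune on held-out data):

```python
import math

def softmax(logits):
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def msp_score(logits):
    """Maximum softmax probability: higher suggests in-distribution."""
    return max(softmax(logits))

peaked = [9.0, 1.0, 0.5]   # confident prediction -> high MSP
flat = [2.1, 2.0, 1.9]     # near-uniform logits -> low MSP
THRESHOLD = 0.7            # hypothetical; tune on validation data
is_ood = msp_score(flat) < THRESHOLD
```

MSP is only a baseline: models can be confidently wrong on OOD inputs, which is exactly why this remains an active research area.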
### 3. Batch Size
The number of training examples processed before updating model weights. Larger batches can be more efficient computationally, but may generalize worse.
Finding the right batch size involves balancing speed, memory, and model quality.
Like grading tests one at a time or waiting to grade a full stack.
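The speed side of the tradeoff is easy to see in code: batch size directly sets how many weight updates you get per pass over the data. A toy sketch (no actual model, just the batching loop):

```python
import random

def minibatches(examples, batch_size):
    """Yield shuffled mini-batches; weights would update once per batch."""
    examples = examples[:]  # copy so the caller's list is untouched
    random.shuffle(examples)
    for i in range(0, len(examples), batch_size):
        yield examples[i:i + batch_size]

data = list(range(1_000))
# Larger batches average more gradients per step but give fewer updates:
updates_small = sum(1 for _ in minibatches(data, 32))   # 32 updates/epoch
updates_large = sum(1 for _ in minibatches(data, 256))  # 4 updates/epoch
```

Fewer, better-averaged updates are cheaper per epoch, but as Keskar et al. observe, very large batches can land in sharper minima that generalize worse.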
### 4. Inductive Bias
The assumptions built into a model that guide how it learns from data. Without inductive bias, models cannot generalize beyond training examples.
CNNs assume spatial locality; transformers assume tokens can attend to any position.
Like expecting nearby houses to have similar prices before looking at the data.
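One way to see what the spatial-locality assumption buys is a back-of-the-envelope parameter count (hypothetical 32x32 image, single 3x3 filter):

```python
# Mapping a 32x32 single-channel image to a same-size feature map:
pixels = 32 * 32

# A dense layer makes no spatial assumption: every input pixel gets its
# own weight to every output pixel.
dense_params = pixels * pixels  # 1,048,576 weights

# A convolution assumes spatial locality and weight sharing: one small
# filter slides over the whole image.
conv_params = 3 * 3             # 9 weights (bias terms ignored)
```

The convolution learns roughly 100,000x fewer parameters precisely because it assumes nearby pixels matter and far-apart ones mostly don't. That assumption is the inductive bias.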
### 5. Latency vs Throughput
Latency is how long a single request takes. Throughput is how many requests can be handled per second. Optimizing one often comes at the expense of the other.
Batching improves throughput but increases latency for individual requests.
Like a restaurant serving one table quickly or many tables at once.
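The batching tradeoff can be sketched with a toy cost model (all numbers hypothetical): a fixed per-batch overhead plus per-request compute, where every request in a batch waits for the whole batch to finish.

```python
def serving_stats(batch_size, per_item_ms=1.0, overhead_ms=5.0):
    """Toy cost model: fixed per-batch overhead plus per-request compute."""
    latency_ms = overhead_ms + batch_size * per_item_ms  # each request waits for the full batch
    throughput_rps = batch_size / (latency_ms / 1000.0)  # requests per second
    return latency_ms, throughput_rps

lat_1, tput_1 = serving_stats(batch_size=1)     # 6.0 ms, ~167 req/s
lat_32, tput_32 = serving_stats(batch_size=32)  # 37.0 ms, ~865 req/s
```

In this sketch, batching 32 requests yields roughly 5x the throughput at roughly 6x the per-request latency, which is the tradeoff in miniature.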

## Quick Reference
| Concept | One-liner |
|---|---|
| Precision vs Recall | Correct positives vs finding all positives |
| OOD Inputs | Data unlike training distribution |
| Batch Size | Examples per weight update |
| Inductive Bias | Built-in learning assumptions |
| Latency vs Throughput | Speed per request vs total capacity |
Short, accurate ML explainers. Follow for more.
Part 12 of the Five ML Concepts series. View all parts | Next: Part 13 →
