The two essential ideas behind every trained model: the weights (what the numbers are) and the harness (what shapes them and keeps the whole process from flying apart).
A neural network is, at its core, a large pile of floating-point numbers: the weights. They live on the connections between nodes. When input flows in, each weight scales the signal passing through its connection. The pattern of those numbers is what the model has learned.
Think of each weight as a dial. Turn it up and that signal path becomes loud. Turn it toward zero and the path goes quiet. A negative weight flips the signal.
Forward pass: a signal (colored bead) travels from each input, multiplied at every connection, accumulating at each node before reaching the output.
Below is a tiny 3-input, 1-output network. Drag any input weight and watch the prediction shift in real time. Adjusting weights by hand like this is what gradient descent does automatically, across millions of weights at once.
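If you prefer code to dials, here is a minimal sketch of what the widget computes on every drag (plain Python; the weight and input values are made up for illustration):

```python
# Forward pass of the 3-input, 1-output network above.
# These weight values are arbitrary stand-ins for the dials you drag.
w = [0.8, -0.3, 1.2]   # one weight per input connection
x = [0.5, 1.0, 0.25]   # an example input signal

# Each input is scaled by its weight, and the output node sums the results.
prediction = sum(wi * xi for wi, xi in zip(w, x))
print(prediction)      # 0.8*0.5 + (-0.3)*1.0 + 1.2*0.25 = 0.4
```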
Insight: flip w₁ to a large negative value while w₃ is strongly positive, and the two paths pull the output in opposite directions: whenever both inputs are active, the signals cancel and the prediction washes out toward zero. This is the same tension a trained model resolves by finding weights that minimise error across thousands of examples.
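Here is a sketch of the automatic version: gradient descent on the same tiny model, with an invented training set and a squared-error loss. Each step nudges every dial a little downhill on the error:

```python
# Gradient descent on the tiny linear model, in pure Python.
# The data is invented; it happens to follow y = x1 - x2 + 0.5*x3.
examples = [
    ([1.0, 0.0, 2.0], 2.0),
    ([0.0, 1.0, 0.0], -1.0),
    ([1.0, 1.0, 1.0], 0.5),
    ([2.0, 0.5, 0.0], 1.5),
]

w = [0.0, 0.0, 0.0]   # start with all dials silent
lr = 0.05             # learning rate: how far to turn each dial per step

for step in range(500):
    # Accumulate the gradient of mean squared error over all examples.
    grad = [0.0, 0.0, 0.0]
    for x, y in examples:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = pred - y
        for i in range(3):
            grad[i] += 2 * err * x[i] / len(examples)
    # Turn each dial a little against its gradient.
    w = [wi - lr * gi for wi, gi in zip(w, grad)]

print(w)  # converges toward [1.0, -1.0, 0.5]
```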
A harness is the scaffolding that controls the training process — the data pipelines, loss functions, optimisers, and evaluation loops that steer a network toward useful weight values.
Without a harness, you have random weights and noise. With a well-designed harness, those same weights converge to something that can recognise cats, translate sentences, or write code.
A production training harness is more than a loop. It includes the tools that keep training stable, reproducible, and measurable.
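As a rough sketch (pure Python, with invented names rather than any framework's API), that scaffolding around the bare loop might look like this: a fixed seed for reproducibility, an optimiser step, a periodic evaluation, and a checkpoint written to disk:

```python
import json
import random

# A skeletal harness around the gradient-descent loop above. Everything
# here is illustrative scaffolding, not any particular framework's API.

def run_harness(examples, steps=500, lr=0.05, seed=0, eval_every=100):
    random.seed(seed)  # reproducibility: fix the randomness up front
    w = [random.uniform(-0.1, 0.1) for _ in range(3)]  # random initial weights

    for step in range(1, steps + 1):
        grad = [0.0, 0.0, 0.0]
        for x, y in examples:  # data pipeline (here, trivially, a list)
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for i in range(3):
                grad[i] += 2 * err * x[i] / len(examples)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]  # optimiser step

        if step % eval_every == 0:  # evaluation loop: measure the loss
            loss = sum(
                (sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2
                for x, y in examples
            ) / len(examples)
            print(f"step {step}: loss {loss:.6f}")
            with open("checkpoint.json", "w") as f:
                json.dump({"step": step, "weights": w}, f)  # checkpoint the weights

    return w

if __name__ == "__main__":
    # Same invented dataset as the sketch above.
    examples = [
        ([1.0, 0.0, 2.0], 2.0),
        ([0.0, 1.0, 0.0], -1.0),
        ([1.0, 1.0, 1.0], 0.5),
        ([2.0, 0.5, 0.0], 1.5),
    ]
    run_harness(examples)
```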
The key insight: the harness is transient — it exists only during training. What it produces, the weights, is permanent. When you download a model or call an API, you're using nothing but the final weight values that the harness shaped.
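A closing sketch of that asymmetry, reusing the hypothetical checkpoint.json written by the harness above: inference needs none of the harness, only the weights and the forward pass:

```python
import json

# After training, the harness is gone; the checkpoint file is all that remains.
with open("checkpoint.json") as f:
    weights = json.load(f)["weights"]

def predict(x):
    # No data pipeline, no loss function, no optimiser:
    # just the final weights and the forward pass.
    return sum(wi * xi for wi, xi in zip(weights, x))

print(predict([1.0, 0.0, 2.0]))  # close to 2.0 once training has converged
```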