Visual explainer · Neural networks

Weights & Harness

The two essential ideas behind every trained model — what the numbers are, how they're shaped, and what keeps the whole process from flying apart.

The numbers that hold knowledge

A neural network is, at its core, a large pile of floating-point numbers — the weights. They live on the connections between nodes. When input flows in, each weight scales the signal passing through its connection. The pattern of those numbers is what the model has learned.

Think of each weight as a dial. Turn it up and that signal path becomes loud. Turn it toward zero and the path goes quiet. A negative weight flips the signal.
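The dial analogy fits in a few lines of Python. This is a sketch of one connection; the signal and weight values are hypothetical.

```python
# One connection of the network; the signal and weight values are
# hypothetical.
def connection(signal, weight):
    """One edge: the weight scales whatever signal passes through."""
    return weight * signal

print(connection(0.6, 2.0))    # dial turned up: the path is loud
print(connection(0.6, 0.01))   # dial near zero: the path goes quiet
print(connection(0.6, -1.0))   # negative weight: the signal flips
```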

Network signal flow
[Interactive diagram spanning four layers: input, hidden 1, hidden 2, output]

Forward pass: a signal (coloured bead) travels from each input, multiplied at every connection, accumulating at each node before reaching the output.
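The accumulation the diagram animates can be sketched as plain Python. The two hidden layers and every weight value below are made up for illustration.

```python
# Signals accumulate layer by layer, as in the diagram. The layer
# sizes and all weight values here are hypothetical.
def layer(inputs, weights):
    """weights[j][i] connects input i to node j; each node sums its
    weighted inputs (no activation, to keep the flow visible)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

x = [1.0, 0.5]                              # input layer
h1 = layer(x, [[0.4, -0.2], [0.7, 0.1]])    # hidden 1
h2 = layer(h1, [[0.5, 0.5], [-0.3, 0.8]])   # hidden 2
y = layer(h2, [[1.0, -1.0]])                # output
print(y)
```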

Tune the weights, change the answer

Below is a tiny 3-input, 1-output network. Drag any input weight and watch the prediction shift in real time. This is what gradient descent does — but automatically, guided by the error gradient, across millions of weights at once.

[Interactive demo · starting state]
Input weights: w₁ = +1.20, w₂ = −0.50, w₃ = +0.80 · bias b = +0.10
net = σ(w₁·x₁ + w₂·x₂ + w₃·x₃ + b)
Prediction: 73% class A (positive), versus class B

Insight: flip w₁ to a large negative value while w₃ is strongly positive — the model becomes uncertain. This is the same tension a trained model resolves by finding weights that minimise error across thousands of examples.
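A minimal sketch of this network and the w₁ flip. The input values x are an assumption (the demo's 73% depends on inputs it doesn't show), so the exact probabilities below are illustrative.

```python
import math

# The tiny network above: net = σ(w₁·x₁ + w₂·x₂ + w₃·x₃ + b).
# The inputs x are assumed; the widget does not show them.
def predict(w, x, b):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes z into (0, 1)

x = [0.5, 0.2, 0.9]                          # hypothetical inputs
p = predict([1.2, -0.5, 0.8], x, 0.1)        # slider values: leans class A
p_flip = predict([-1.2, -0.5, 0.8], x, 0.1)  # w₁ flipped negative
print(round(p, 2), round(p_flip, 2))         # p_flip lands near 0.5: uncertain
```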

The system that shapes the weights

A harness is the scaffolding that controls the training process — the data pipelines, loss functions, optimisers, and evaluation loops that steer a network toward useful weight values.

Without a harness, you have random weights and noise. With a well-designed harness, those same weights converge to something that can recognise cats, translate sentences, or write code.

Training loop — step through the cycle

01 Forward pass: input flows through the network; weights multiply each signal.
02 Loss: how wrong was the output? Loss measures the gap from truth.
03 Backward pass: gradients flow back; each weight learns its share of the blame.
04 Update: the optimiser nudges every weight slightly in the right direction.
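The four steps can be run on the smallest possible model: one weight, one training example, squared-error loss. The data values and learning rate are illustrative.

```python
# One weight, one example, squared-error loss; all values are
# illustrative.
w = 0.0                  # starting weight
x, target = 2.0, 1.0     # a single training example
lr = 0.1                 # step size for the update

for step in range(20):
    y = w * x                     # 01 forward pass
    loss = (y - target) ** 2      # 02 loss: squared gap from truth
    grad = 2 * (y - target) * x   # 03 backward pass: d(loss)/dw
    w -= lr * grad                # 04 update: nudge w downhill

print(w)   # converges toward 0.5, where w·x matches the target
```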
[Interactive: epoch/step counter, a loss-over-steps chart, and a 4×4 hidden-layer weight matrix coloured from negative to positive]

What the harness contains

A production training harness is more than a loop. It includes the tools that keep training stable, reproducible, and measurable.

DATA PIPELINE
Feeds the network
Shuffles, batches, augments. Keeps GPUs fed and prevents the network from memorising the order examples arrive.
LOSS FUNCTION
Defines "wrong"
Cross-entropy for classification, MSE for regression, a learned reward for RLHF alignment. The loss condenses "wrong" into one number; its gradient is the needle pointing at how to improve.
OPTIMISER
Moves the weights
Adam, SGD, Adafactor. Uses gradients (and often momentum) to decide how large each weight update should be.
CHECKPOINTING
Saves the state
Snapshots of weights at intervals. If training diverges, you roll back. The checkpoint is the model — it ships weights, not code.
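A toy harness can wire all four components together. Every piece below is a stand-in for what a real framework provides: shuffling and batching for the data pipeline, MSE for the loss, plain SGD for the optimiser, and a list of weight snapshots for checkpointing.

```python
import random

# Toy harness around the smallest model: one weight, prediction y = w·x.
# All functions and values are stand-ins, not a real framework's API.

def batches(data, size):
    random.shuffle(data)                  # data pipeline: shuffle + batch
    for i in range(0, len(data), size):
        yield data[i:i + size]

data = [(x / 10, 0.5 * x / 10) for x in range(10)]  # truth: y = 0.5·x
w, lr = 0.0, 0.5
checkpoints = []                          # checkpointing: weight snapshots

for epoch in range(30):
    for batch in batches(data, size=2):
        # gradient of mean squared error over the batch
        grad = sum(2 * (w * x - t) * x for x, t in batch) / len(batch)
        w -= lr * grad                    # optimiser: vanilla SGD step
    checkpoints.append(w)                 # snapshot after each epoch

print(checkpoints[-1])   # the harness's product: a trained weight near 0.5
```

When training diverges in a real system, you restore the last good entry of `checkpoints`; here it is just a list, but the role is the same.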

The key insight: the harness is transient — it exists only during training. What it produces, the weights, is permanent. When you download a model or call an API, you're using nothing but the final weight values that the harness shaped.