Static Quantization (PTQ)

Measure the activations once, fix the scales, and skip the per-batch guesswork.

Key Insight

Static quantization, a form of post-training quantization (PTQ), converts weights to int8 ahead of time and fixes the int8 scales for activations before serving. To choose those activation scales, it first runs a few representative sample batches through the model, a step called calibration.
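To make calibration concrete, here is a minimal sketch in plain Python. It assumes a simple min/max observer and affine int8 quantization. The names MinMaxObserver, qparams, quantize, and dequantize are hypothetical helpers written for this illustration, not any library's API, and the sample batches are made-up numbers.

```python
from dataclasses import dataclass


@dataclass
class MinMaxObserver:
    """Hypothetical helper: tracks the min/max activation seen in calibration."""
    lo: float = float("inf")
    hi: float = float("-inf")

    def observe(self, batch):
        # Widen the running range to cover this calibration batch.
        self.lo = min(self.lo, min(batch))
        self.hi = max(self.hi, max(batch))

    def qparams(self, qmin=-128, qmax=127):
        # Include 0.0 in the range so that zero maps exactly to an integer.
        lo, hi = min(self.lo, 0.0), max(self.hi, 0.0)
        scale = (hi - lo) / (qmax - qmin)
        if scale == 0.0:
            scale = 1.0  # degenerate case: every observed value was 0
        zero_point = round(qmin - lo / scale)
        return scale, max(qmin, min(qmax, zero_point))


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    # Map floats to int8 using the fixed scale; out-of-range values are clamped.
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    # Approximate inverse of quantize().
    return [(q - zero_point) * scale for q in q_values]


# Calibration: run a few sample batches through the observer once.
calibration_batches = [[0.0, 0.5, 1.2], [0.1, 2.55, 0.9], [0.3, 1.7, 0.0]]
observer = MinMaxObserver()
for batch in calibration_batches:
    observer.observe(batch)

# The scale and zero point are now fixed and reused for every later batch.
scale, zero_point = observer.qparams()
serving_batch = [0.0, 1.0, 2.55, 10.0]  # 10.0 lies outside the calibrated range
q = quantize(serving_batch, scale, zero_point)
```

Note that the value 10.0 is clamped to the largest int8 code because it falls outside the range seen during calibration, which is why the sample batches should resemble real serving data.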

Why This Matters

Because the scales are fixed before serving, static quantization avoids the per-batch overhead that dynamic quantization pays to compute activation scales at run time, so it is usually faster, especially for CNNs. The cost is the extra calibration step and a little more accuracy tuning.
