Skip to main content

Right-Sizing Experiment


"Don't pay for a 70B answer when an 8B one passes the same test."


Key Insight

Right-sizing means picking the smallest model that still does the job well. This project runs the same workload on a 7B, 13B, and 70B model, measures quality with a real eval and the cost per million tokens of each, and recommends a tier.

Why This Matters

Teams routinely over-serve, paying for a giant model when a smaller fine-tuned one would clear the same bar at a fraction of the cost. Measuring quality and cost side by side turns model choice from a guess into a decision backed by numbers.