Cost Report
"Every token has a price tag — a serving system is only a business once you know it."
Key Insight
This project produces a defensible cost per million tokens for a serving stack: the GPU hourly price divided by the number of tokens the stack produces per hour, scaled to one million tokens. It also identifies the three biggest line items driving that number.
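As a rough illustration of that arithmetic, here is a minimal Python sketch. The function name `cost_per_million_tokens`, its parameters, and the example figures ($2.00 per GPU-hour, 1,000 tokens per second) are assumptions chosen for illustration, not measurements from any real deployment. The only step beyond the division itself is the scaling to one million tokens.

```python
def cost_per_million_tokens(
    gpu_hourly_usd: float,
    tokens_per_second: float,
    num_gpus: int = 1,
) -> float:
    """Return the USD cost to produce one million tokens.

    gpu_hourly_usd    -- price of one GPU for one hour (hypothetical input)
    tokens_per_second -- sustained throughput of the whole stack
    num_gpus          -- GPUs needed to sustain that throughput
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    tokens_per_hour = tokens_per_second * 3600
    hourly_cost = gpu_hourly_usd * num_gpus
    # Cost per token, scaled up to a million tokens.
    return hourly_cost / tokens_per_hour * 1_000_000


# Illustrative numbers only: one GPU at $2.00/hour sustaining 1,000 tokens/s.
example = cost_per_million_tokens(gpu_hourly_usd=2.00, tokens_per_second=1000)
print(f"${example:.4f} per million tokens")
```

With these assumed inputs the stack produces 3.6 million tokens per hour, so the result is a little under $0.56 per million tokens. Only GPU rental is counted here; other line items would have to be added to the hourly cost before dividing.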
Why This Matters
Cost per million tokens is the common unit for deciding whether a serving system makes economic sense. Being able to compute and defend it lets you compare engines and hardware on equal terms and commit to a price with confidence.