Modality Ratio Sweep
Key Insight
This project treats the per-modality loss curve as a measuring instrument. You deliberately turn one knob, the sampling ratio (how often each modality's examples appear in the data mix), and watch each modality's loss fall or flatline on its own separate curve. Sweeping that knob from one extreme to the other makes the cause and effect directly visible: starve a modality and its curve stalls; feed it more and the curve drops. The diagnostic exposes a trap that an averaged loss hides. A single blended "multimodal loss" can look perfectly healthy while a modality the model is silently ignoring stays stuck near its starting value, because the modalities that are learning pull the average down and mask the one that is not.
Where the Phase 7 Modality Balancing project applies the remedy (oversampling the rare modality or up-weighting its loss), this project is the measurement that tells you whether the remedy is needed and by how much. Modality balancing here is purely a data-pipeline knob: the sampling rate (or loss weight) you pick so that every per-modality curve descends at a comparable pace.
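The masking effect is easy to see with a little arithmetic. The sketch below is illustrative only: the loss values are made up, and the names blended_loss, stalled_modalities, and the min_rel_drop threshold are hypothetical helpers, not part of this project's code. It assumes the blended loss is the sampling-ratio-weighted average of the per-modality losses.

```python
from typing import Dict, List


def blended_loss(per_modality: Dict[str, float], ratios: Dict[str, float]) -> float:
    """Sampling-ratio-weighted average of the per-modality losses."""
    total = sum(ratios.values())
    return sum(per_modality[m] * ratios[m] for m in per_modality) / total


def stalled_modalities(curves: Dict[str, List[float]],
                       min_rel_drop: float = 0.1) -> List[str]:
    """Names of modalities whose loss fell by less than min_rel_drop
    (as a fraction of the starting value) over the recorded steps."""
    stalled = []
    for name, curve in curves.items():
        rel_drop = (curve[0] - curve[-1]) / curve[0]
        if rel_drop < min_rel_drop:
            stalled.append(name)
    return stalled


# Made-up numbers: text dominates the mix and learns; audio is starved.
ratios = {"text": 0.9, "audio": 0.1}
curves = {
    "text": [4.0, 2.5, 1.0],     # drops steadily
    "audio": [4.0, 3.95, 3.9],   # barely moves
}

# Blended loss at each recorded step.
num_steps = len(curves["text"])
blended = [
    blended_loss({m: c[i] for m, c in curves.items()}, ratios)
    for i in range(num_steps)
]

print("blended loss per step:", blended)
print("stalled modalities:", stalled_modalities(curves))
```

With these numbers the blended curve falls by roughly two thirds, yet the per-modality check still flags audio as stalled. That gap is what the ratio sweep is meant to reveal.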