Textual Inversion

Key Insight

Textual Inversion takes the opposite tack from DreamBooth: it changes nothing in the diffusion model itself. Instead it learns a single new word embedding, a fresh row added to the text encoder's embedding matrix, that points at your subject. Because only that one vector is trained, the result is a few kilobytes, the smallest personalization artifact there is. The catch is capacity: a single vector can capture a recognizable "vibe" but cannot match the fidelity of LoRA or DreamBooth, because the weights stay frozen and the new vector can only steer the model toward what it already knows how to draw.
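To make the "one trainable row, everything else frozen" idea concrete, here is a toy sketch in plain Python. It is not a real diffusion setup: `frozen_model` is a hypothetical stand-in (a fixed linear map) for the frozen text encoder and denoiser, `target` is an invented training signal, and the sizes `DIM` and `VOCAB` are illustrative assumptions. The point is only to show which state changes during training and how small the saved artifact is.

```python
import random
import struct

random.seed(0)

DIM = 8    # toy embedding width (assumption; real text encoders are much wider)
VOCAB = 5  # toy vocabulary size (assumption)

# Frozen pieces: existing embedding rows and a stand-in "model" (fixed linear map).
embeddings = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(VOCAB)]
frozen_weights = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]


def frozen_model(vec):
    """Hypothetical stand-in for the frozen networks: y = W @ vec."""
    return [sum(w * v for w, v in zip(row, vec)) for row in frozen_weights]


# The placeholder token gets one fresh row; this row is the only trainable state.
new_token_id = len(embeddings)
embeddings.append([0.0] * DIM)

# Invented target: an output the frozen model is able to produce for some input.
target = frozen_model([random.gauss(0, 1) for _ in range(DIM)])


def loss_and_grad(vec):
    """Squared error against the target and its gradient w.r.t. the input vector."""
    err = [o - t for o, t in zip(frozen_model(vec), target)]
    loss = sum(e * e for e in err)
    # d(loss)/d(vec) = 2 * W^T @ err; the weights themselves get no update.
    grad = [2 * sum(frozen_weights[i][j] * err[i] for i in range(DIM))
            for j in range(DIM)]
    return loss, grad


snapshot_frozen = [row[:] for row in embeddings[:VOCAB]]
initial_loss, _ = loss_and_grad(embeddings[new_token_id])

lr = 0.005
for _ in range(2000):
    _, grad = loss_and_grad(embeddings[new_token_id])
    embeddings[new_token_id] = [
        v - lr * g for v, g in zip(embeddings[new_token_id], grad)
    ]

final_loss, _ = loss_and_grad(embeddings[new_token_id])

# The saved artifact is just the learned vector, packed here as float32.
artifact = struct.pack(f"{DIM}f", *embeddings[new_token_id])
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}, artifact {len(artifact)} bytes")

# Same arithmetic at a realistic width, assuming a 768-dimensional float32 vector.
print(768 * 4, "bytes for a 768-dim float32 embedding")
```

In this toy the target is reachable by construction, so the loss can fall toward zero. In a real model the capacity limit described above applies: the vector can only select among outputs the frozen weights can already produce.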