Camera Trajectory

Key Insight

To move from "something moves in the clip" to "the camera pans left and zooms in", a video model has to be told the camera's path explicitly. This project adds camera control to an image-to-video model by feeding it Plücker-coordinate camera embeddings: a compact six-number description of the ray each pixel looks along, computed per frame from the desired camera trajectory. It then verifies that the model honors requests to pan and zoom. Encoding the camera as per-pixel rays rather than as raw position numbers is what lets the model generalize to trajectories it never saw in training, because every pixel then carries a direct geometric hint about where its content should appear to come from.
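As a rough illustration of what "six numbers per pixel" means, the sketch below computes a Plücker embedding for each frame of a short trajectory with NumPy. It is not the project's code. The function name `plucker_embedding`, the pinhole intrinsics `K`, the camera-to-world pose convention, the (moment, direction) channel ordering, and the leftward move modeled as a translation along -x are all assumptions chosen for the example.

```python
import numpy as np

def plucker_embedding(K, c2w, height, width):
    """Return an (H, W, 6) array of per-pixel Plücker ray coordinates.

    K   : (3, 3) pinhole intrinsics (assumed convention).
    c2w : (4, 4) camera-to-world pose (assumed convention).
    Channel order (moment, direction) is an illustrative choice.
    """
    # Pixel-center coordinates in homogeneous image space, shape (H, W, 3).
    u, v = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)

    # Back-project pixels to ray directions in the camera frame.
    dirs_cam = pix @ np.linalg.inv(K).T

    # Rotate directions into the world frame and normalize to unit length.
    R, t = c2w[:3, :3], c2w[:3, 3]
    dirs = dirs_cam @ R.T
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

    # Every ray starts at the camera center; the moment is origin x direction.
    origin = np.broadcast_to(t, dirs.shape)
    moment = np.cross(origin, dirs)

    return np.concatenate([moment, dirs], axis=-1)

# Hypothetical 4-frame trajectory: the camera slides left by 0.1 units per frame.
K = np.array([[50.0, 0.0, 32.0],
              [0.0, 50.0, 24.0],
              [0.0, 0.0, 1.0]])
poses = []
for i in range(4):
    pose = np.eye(4)
    pose[0, 3] = -0.1 * i  # translation along -x (assumed "left")
    poses.append(pose)

# One embedding per frame, stacked to shape (frames, H, W, 6).
embeddings = np.stack([plucker_embedding(K, p, 48, 64) for p in poses])
print(embeddings.shape)
```

In this sketch the direction channels stay the same across frames, because the camera only translates, while the moment channels change with the camera position. A zoom could be expressed by varying the focal length in `K` per frame, which would change the direction channels instead.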