TIGER-Lab/VisPhyBench-Data
An execution-based framework that evaluates physical reasoning by generating executable simulator code from visual inputs.
Note: Generated videos for the VisPhyBench sub split, organized by engine and model.
Compare ground-truth and generated videos sample by sample
Note: Interactive sample-level comparator for ground-truth (GT) vs. generated videos across engines and models.