Advancing Narrative Long Video Generation via Training-Free Identity-Aware Memory
Jinzhuo Liu1, Jiangning Zhang1†, Wencan Jiang1, Yabiao Wang2, Dingkang Liang3, Zhucun Xue1, Ran Yi4, Yong Liu1
1Zhejiang University,
2Tencent Youtu Lab,
3Huazhong University of Science and Technology,
4Shanghai Jiao Tong University
†Corresponding author
Updates
- [2026.05.15]: We release the GitHub repo, the project page, the quantized model checkpoints, NarraStream-Bench, and the paper.
Introduction
TL;DR: IAMFlow uses explicit identity-aware memory to keep identities consistent across evolving narrative prompts, achieving faster and stronger long video generation on NarraStream-Bench.
Highlights
- We introduce IAMFlow, a training-free identity-aware memory framework that explicitly organizes historical information around persistent entities and attributes, enabling reliable identity preservation across evolving prompt transitions.
- We design a systematic inference acceleration pipeline to make the framework computationally practical, combining asynchronous visual verification, adaptive prompt transition, and model quantization to preserve long-term consistency without sacrificing generation speed.
- We introduce NarraStream-Bench, a modern benchmark suite for assessing long-term consistency in narrative streaming video generation. Extensive experiments and ablation studies demonstrate that IAMFlow achieves superior performance across various metrics while enabling more efficient inference.
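To make the idea of a memory organized around persistent entities and attributes more concrete, here is a minimal toy sketch. It is not the IAMFlow implementation: the class names (`EntityRecord`, `IdentityMemory`), the name-matching rule, and the way remembered attributes are appended to a prompt are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class EntityRecord:
    """Hypothetical record for one persistent entity (for example a character)."""
    name: str
    attributes: dict[str, str] = field(default_factory=dict)
    last_seen_segment: int = -1


class IdentityMemory:
    """Toy memory keyed by entity name; illustrative only, not IAMFlow's code."""

    def __init__(self) -> None:
        self._records: dict[str, EntityRecord] = {}

    def update(self, segment: int, name: str, **attributes: str) -> None:
        # Create the entity on first mention, then merge in new attributes.
        record = self._records.setdefault(name, EntityRecord(name))
        record.attributes.update(attributes)
        record.last_seen_segment = segment

    def recall(self, prompt: str) -> list[EntityRecord]:
        # Naive matching: an entity is relevant if its name occurs in the prompt.
        lowered = prompt.lower()
        return [r for r in self._records.values() if r.name.lower() in lowered]

    def augment(self, prompt: str) -> str:
        # Append remembered attributes so a later prompt restates the identity.
        notes = [
            f"{r.name}: " + ", ".join(f"{k}={v}" for k, v in sorted(r.attributes.items()))
            for r in self.recall(prompt)
            if r.attributes
        ]
        return prompt if not notes else f"{prompt} [{'; '.join(notes)}]"


memory = IdentityMemory()
memory.update(0, "Anna", hair="red", coat="green")
print(memory.augment("Anna walks into a cafe"))
```

The point of the sketch is only the data layout: history is stored per entity rather than per frame, so a later prompt that mentions the same entity can be paired with what was recorded about it earlier.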
Installation
1. Install Requirements
git clone git@github.com:Eddie0521/IAMFlow.git
cd IAMFlow
conda create -n iamflow python=3.12 -y
conda activate iamflow
# Install PyTorch first according to your CUDA environment.
python -m pip install torch==2.9.1 torchvision==0.24.1
python -m pip install -r requirements.txt
python -m pip install flash-attn --no-build-isolation
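After the install steps, a quick sanity check can confirm that the key packages are visible in the active environment. The helper below is a small sketch, not part of the repository; the function name `check_environment` is hypothetical, and it only tests whether each package can be located, not whether its version or CUDA build is correct.

```python
import importlib.util
import sys


def check_environment(packages=("torch", "torchvision", "flash_attn")):
    """Report which packages can be located, without importing them."""
    report = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for key, value in check_environment().items():
    print(f"{key}: {value}")
```

If any entry prints `False`, rerun the matching install command inside the `iamflow` conda environment.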
2. Download Checkpoints
Download the models with the hf CLI:
pip install "huggingface_hub[cli]"
hf download Wan-AI/Wan2.1-T2V-1.3B --local-dir pretrained/Wan2.1-T2V-1.3B
hf download Eddie0521/IAMFlow --local-dir pretrained/iamflow_models
hf download Qwen/Qwen3-VL-2B-Instruct --local-dir pretrained/Qwen3-VL-2B-Instruct
hf download Qwen/Qwen3-4B-Instruct-2507 --local-dir pretrained/Qwen3-4B-Instruct-2507
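The same downloads can also be scripted from Python with `huggingface_hub.snapshot_download`. The sketch below assumes the repo IDs and target directories from the commands above; the names `CHECKPOINTS` and `download_all` are illustrative and do not come from the repository.

```python
# Repo IDs and target directories copied from the hf commands above.
CHECKPOINTS = {
    "Wan-AI/Wan2.1-T2V-1.3B": "pretrained/Wan2.1-T2V-1.3B",
    "Eddie0521/IAMFlow": "pretrained/iamflow_models",
    "Qwen/Qwen3-VL-2B-Instruct": "pretrained/Qwen3-VL-2B-Instruct",
    "Qwen/Qwen3-4B-Instruct-2507": "pretrained/Qwen3-4B-Instruct-2507",
}


def download_all(checkpoints=CHECKPOINTS):
    """Fetch every listed repository into its local directory."""
    # Imported lazily so the mapping can be inspected without the package.
    from huggingface_hub import snapshot_download

    for repo_id, local_dir in checkpoints.items():
        snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_all()` from the repository root fetches all four checkpoints in sequence.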
Inference
The default setup uses two GPUs: the DiT, text encoder, and LLM run on one GPU, and the VAE and VLM run on the other.
bash ./scripts/run_iamflow.sh
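The two-GPU split described above can be pictured as a simple component-to-device map. This is a sketch under assumptions: the function name `plan_devices`, the component keys, and the choice of `cuda:0` versus `cuda:1` for each group are hypothetical, and the actual placement is handled by the repository's own script.

```python
def plan_devices(num_gpus):
    """Return a component-to-device map for the two-GPU layout described above."""
    if num_gpus < 2:
        raise ValueError("this sketch only covers the two-GPU layout")
    # Assumption: which physical index hosts which group is arbitrary here.
    generation_device, auxiliary_device = "cuda:0", "cuda:1"
    return {
        "dit": generation_device,
        "text_encoder": generation_device,
        "llm": generation_device,
        "vae": auxiliary_device,
        "vlm": auxiliary_device,
    }


for component, device in plan_devices(2).items():
    print(f"{component} -> {device}")
```

With PyTorch installed, `plan_devices(torch.cuda.device_count())` is a quick way to check that at least two GPUs are visible before launching the script.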
Evaluation & Benchmark
See NarraStream-Bench.
Acknowledgement
- MemFlow: the codebase we built upon. Thanks for their wonderful work.
- Self-Forcing: the algorithm we built upon. Thanks for their wonderful work.
- Wan: the base model we built upon. Thanks for their wonderful work.
Citation
Please leave us a star and cite our paper if you find our work helpful.
@misc{liu2026advancingnarrativelongvideo,
title={Advancing Narrative Long Video Generation via Training-Free Identity-Aware Memory},
author={Jinzhuo Liu and Jiangning Zhang and Wencan Jiang and Yabiao Wang and Dingkang Liang and Zhucun Xue and Ran Yi and Yong Liu},
year={2026},
eprint={2605.18733},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2605.18733},
}