
Advancing Narrative Long Video Generation via Training-Free Identity-Aware Memory

Jinzhuo Liu1, Jiangning Zhang1✉, Wencan Jiang1, Yabiao Wang2, Dingkang Liang3, Zhucun Xue1, Ran Yi4, Yong Liu1

1Zhejiang University,    2Tencent Youtu Lab,    3Huazhong University of Science and Technology,
4Shanghai Jiao Tong University    ✉Corresponding author

   

🔥 Updates

📷 Introduction

💡 TL;DR: IAMFlow uses explicit identity-aware memory to keep identities consistent across evolving narrative prompts, achieving faster and stronger long video generation on NarraStream-Bench.

✨ Highlights

  1. We introduce IAMFlow, a training-free identity-aware memory framework that explicitly organizes historical information around persistent entities and attributes, enabling reliable identity preservation across evolving prompt transitions.
  2. We design a systematic inference acceleration pipeline to make the framework computationally practical, combining asynchronous visual verification, adaptive prompt transition, and model quantization to preserve long-term consistency without sacrificing generation speed.
  3. We introduce NarraStream-Bench, a modern benchmark suite for assessing long-term consistency in narrative streaming video generation. Extensive experiments and ablation studies demonstrate that IAMFlow achieves superior performance across various metrics while enabling more efficient inference.
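To make the idea of organizing history "around persistent entities and attributes" concrete, here is a toy sketch. It is not the IAMFlow implementation: the `EntityMemory` class, its methods, and the example entity are all hypothetical, and the real framework also involves visual verification that this example ignores. It only shows how attributes of a named entity could be stored once and carried into later prompts.

```python
from dataclasses import dataclass, field


@dataclass
class EntityMemory:
    """Toy store of persistent entities and their attributes (hypothetical)."""

    entities: dict[str, dict[str, str]] = field(default_factory=dict)

    def update(self, name: str, **attributes: str) -> None:
        # Merge new attributes into the entity's record;
        # later values overwrite earlier ones for the same key.
        self.entities.setdefault(name, {}).update(attributes)

    def describe(self, name: str) -> str:
        # Render the stored attributes as text that could accompany a new prompt.
        attrs = self.entities.get(name, {})
        details = ", ".join(f"{key}: {value}" for key, value in attrs.items())
        return f"{name} ({details})" if details else name


memory = EntityMemory()
memory.update("girl", hair="short red hair", outfit="yellow raincoat")
memory.update("girl", outfit="blue dress")  # a later prompt changes her outfit
print(memory.describe("girl"))
# prints: girl (hair: short red hair, outfit: blue dress)
```

In this sketch the hair description survives the prompt change while the outfit is updated, which is the kind of behavior an identity-aware memory is meant to provide.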

πŸ› οΈ Installation

1. Install Requirements

git clone git@github.com:Eddie0521/IAMFlow.git
cd IAMFlow
conda create -n iamflow python=3.12 -y
conda activate iamflow

# Install PyTorch first according to your CUDA environment.
python -m pip install torch==2.9.1 torchvision==0.24.1
python -m pip install -r requirements.txt
python -m pip install flash-attn --no-build-isolation
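After installation, a quick check that the packages are visible to the active interpreter can save a failed run later. The snippet below is a minimal sketch, not part of the repository: `missing_packages` is a hypothetical helper, and `flash_attn` is assumed to be the import name of the `flash-attn` package.

```python
import importlib.util
import sys

# Packages named in the installation commands above.
# "flash_attn" is assumed to be the import name of flash-attn.
REQUIRED = ("torch", "torchvision", "flash_attn")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "none")
```

Run it inside the `iamflow` conda environment. It only looks the packages up and does not import them, so it will not catch CUDA or build problems.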

2. Download Checkpoints

Download models using hf:

pip install "huggingface_hub[cli]"
hf download Wan-AI/Wan2.1-T2V-1.3B --local-dir pretrained/Wan2.1-T2V-1.3B
hf download Eddie0521/IAMFlow --local-dir pretrained/iamflow_models
hf download Qwen/Qwen3-VL-2B-Instruct --local-dir pretrained/Qwen3-VL-2B-Instruct
hf download Qwen/Qwen3-4B-Instruct-2507 --local-dir pretrained/Qwen3-4B-Instruct-2507
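If you prefer to script the downloads, the same repositories can be fetched from Python with `huggingface_hub.snapshot_download`. This is a sketch of an alternative to the commands above, not something the repository ships: `CHECKPOINTS` and `download_all` are hypothetical names, and the repository IDs and target directories are copied from the `hf download` lines.

```python
# Repository IDs and local directories copied from the hf commands above.
CHECKPOINTS = {
    "Wan-AI/Wan2.1-T2V-1.3B": "pretrained/Wan2.1-T2V-1.3B",
    "Eddie0521/IAMFlow": "pretrained/iamflow_models",
    "Qwen/Qwen3-VL-2B-Instruct": "pretrained/Qwen3-VL-2B-Instruct",
    "Qwen/Qwen3-4B-Instruct-2507": "pretrained/Qwen3-4B-Instruct-2507",
}


def download_all(checkpoints=CHECKPOINTS, downloader=None):
    """Download every checkpoint and return the list of local paths."""
    if downloader is None:
        # Imported lazily so the module can be loaded without huggingface_hub.
        from huggingface_hub import snapshot_download

        downloader = snapshot_download
    return [
        downloader(repo_id=repo_id, local_dir=local_dir)
        for repo_id, local_dir in checkpoints.items()
    ]
```

Calling `download_all()` from the repository root starts the downloads. The `downloader` argument exists only so the function can be exercised without network access.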

🔑 Inference

The default setup uses two GPUs: DiT, TextEncoder, and LLM run on one GPU, while VAE and VLM run on the other.

bash ./scripts/run_iamflow.sh
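The two-GPU split described above can be pictured as a simple component-to-device map. The sketch below is illustrative only: how `run_iamflow.sh` actually assigns devices is not shown here, and `plan_devices` together with the `cuda:0` / `cuda:1` defaults are assumptions.

```python
# Component grouping taken from the description above.
GROUP_A = ("DiT", "TextEncoder", "LLM")
GROUP_B = ("VAE", "VLM")


def plan_devices(first="cuda:0", second="cuda:1"):
    """Map each component to a PyTorch-style device string (hypothetical helper)."""
    plan = {name: first for name in GROUP_A}
    plan.update({name: second for name in GROUP_B})
    return plan


for component, device in plan_devices().items():
    print(f"{component:12s} -> {device}")
```

Keeping the VAE and VLM off the generation GPU is consistent with the asynchronous visual verification mentioned in the highlights, since verification can then run without competing with the DiT for memory.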

πŸ“ Evaluation & Benchmark

See the NarraStream-Bench.

🤗 Acknowledgement

  • MemFlow: the codebase we built upon. Thanks for their wonderful work.
  • Self-Forcing: the algorithm we built upon. Thanks for their wonderful work.
  • Wan: the base model we built upon. Thanks for their wonderful work.

🌟 Citation

Please leave us a star 🌟 and cite our paper if you find our work helpful.

@misc{liu2026advancingnarrativelongvideo,
      title={Advancing Narrative Long Video Generation via Training-Free Identity-Aware Memory}, 
      author={Jinzhuo Liu and Jiangning Zhang and Wencan Jiang and Yabiao Wang and Dingkang Liang and Zhucun Xue and Ran Yi and Yong Liu},
      year={2026},
      eprint={2605.18733},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2605.18733}, 
}