Advancing Narrative Long Video Generation via Training-Free Identity-Aware Memory

Jinzhuo Liu¹, Jiangning Zhang¹✉, Wencan Jiang¹, Yabiao Wang², Dingkang Liang³, Zhucun Xue¹, Ran Yi⁴, Yong Liu¹

¹Zhejiang University, ²Tencent Youtu Lab, ³Huazhong University of Science and Technology,
⁴Shanghai Jiao Tong University, ✉ Corresponding author

🔥 Updates

[2026.05.15]: We release the GitHub repo, project page, quantized model checkpoints, NarraStream-Bench, and paper.

📷 Introduction

💡TL;DR: IAMFlow uses explicit identity-aware memory to keep identities consistent across evolving narrative prompts, achieving faster and stronger long video generation on NarraStream-Bench.

✨ Highlights

We introduce IAMFlow, a training-free identity-aware memory framework that explicitly organizes historical information around persistent entities and attributes, enabling reliable identity preservation across evolving prompt transitions.
We design a systematic inference acceleration pipeline to make the framework computationally practical, combining asynchronous visual verification, adaptive prompt transition, and model quantization to preserve long-term consistency without sacrificing generation speed.
We introduce NarraStream-Bench, a modern benchmark suite for assessing long-term consistency in narrative streaming video generation. Extensive experiments and ablation studies demonstrate that IAMFlow achieves superior performance across various metrics while enabling more efficient inference.

🛠️ Installation

1. Install Requirements

git clone git@github.com:Eddie0521/IAMFlow.git
cd IAMFlow
conda create -n iamflow python=3.12 -y
conda activate iamflow

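# Install PyTorch first, choosing the build that matches your CUDA environment.
# The command below installs the default wheel for the pinned versions.
python -m pip install torch==2.9.1 torchvision==0.24.1
python -m pip install -r requirements.txt
python -m pip install flash-attn --no-build-isolation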

2. Download Checkpoints

Download the models with the hf CLI:

pip install "huggingface_hub[cli]"
hf download Wan-AI/Wan2.1-T2V-1.3B --local-dir pretrained/Wan2.1-T2V-1.3B
hf download Eddie0521/IAMFlow --local-dir pretrained/iamflow_models
hf download Qwen/Qwen3-VL-2B-Instruct --local-dir pretrained/Qwen3-VL-2B-Instruct
hf download Qwen/Qwen3-4B-Instruct-2507 --local-dir pretrained/Qwen3-4B-Instruct-2507
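Before running inference, it can help to confirm that all four downloads landed where the commands above put them. The following is a minimal sketch and not part of the repository: the function name `check_checkpoints` is hypothetical, and it only checks that each directory exists and is non-empty, not that the files inside are complete.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of IAMFlow): report which checkpoint
# directories from the download step are missing or empty.
check_checkpoints() {
    local root="${1:-pretrained}"   # parent of the --local-dir targets used above
    local missing=0
    local name
    for name in Wan2.1-T2V-1.3B iamflow_models Qwen3-VL-2B-Instruct Qwen3-4B-Instruct-2507; do
        # A directory counts as present only if it exists and has at least one entry.
        if [ -d "$root/$name" ] && [ -n "$(ls -A "$root/$name" 2>/dev/null)" ]; then
            echo "ok       $root/$name"
        else
            echo "missing  $root/$name"
            missing=$((missing + 1))
        fi
    done
    # Exit status is non-zero when any directory is missing or empty.
    [ "$missing" -eq 0 ]
}
```

After sourcing the function, running `check_checkpoints pretrained` from the repository root prints one line per checkpoint directory and returns a non-zero status if any of them is absent.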

🔑 Inference

We deploy the DiT, TextEncoder, and LLM on one GPU and the VAE and VLM on a second GPU, so the default setup uses two GPUs.

bash ./scripts/run_iamflow.sh
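Because the components are split across two devices, a quick preflight check on the number of GPUs can catch a misconfigured machine before the models load. This sketch is a convenience and not part of the repository: `count_gpus` and `require_gpus` are hypothetical names, and the threshold of two GPUs is inferred from the deployment described above. It relies on `nvidia-smi -L`, which prints one line per GPU.

```shell
# Hypothetical preflight helpers (not part of IAMFlow).

# Count GPUs from `nvidia-smi -L` output read on stdin
# (one "GPU N: ..." line per device).
count_gpus() {
    grep -c '^GPU [0-9]' || true   # grep exits 1 on zero matches but still prints 0
}

# Succeed only if at least $1 GPUs are listed on stdin.
require_gpus() {
    local need="$1"
    local have
    have="$(count_gpus)"
    if [ "$have" -lt "$need" ]; then
        echo "found $have GPU(s), need at least $need" >&2
        return 1
    fi
    echo "found $have GPU(s)"
}
```

One way to use it is `nvidia-smi -L | require_gpus 2 && bash ./scripts/run_iamflow.sh`. On a machine with more than two GPUs, setting `CUDA_VISIBLE_DEVICES=0,1` limits CUDA to the first two devices; whether the script expects particular device indices is not stated here, so check the script before relying on that.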

📏 Evaluation & Benchmark

See NarraStream-Bench for evaluation details.

🤗 Acknowledgement

MemFlow: the codebase we built upon. Thanks for their wonderful work.
Self-Forcing: the algorithm we built upon. Thanks for their wonderful work.
Wan: the base model we built upon. Thanks for their wonderful work.

🌟 Citation

Please leave us a star 🌟 and cite our paper if you find our work helpful.

@misc{liu2026advancingnarrativelongvideo,
      title={Advancing Narrative Long Video Generation via Training-Free Identity-Aware Memory}, 
      author={Jinzhuo Liu and Jiangning Zhang and Wencan Jiang and Yabiao Wang and Dingkang Liang and Zhucun Xue and Ran Yi and Yong Liu},
      year={2026},
      eprint={2605.18733},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2605.18733}, 
}
