ImageWAM-FLUX.2-9B-LIBERO
This repository contains the ImageWAM FLUX.2 9B checkpoint for LIBERO from the paper "ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing?"
ImageWAM is a family of world action models built on image-editing foundation models. This checkpoint is intended for evaluation and research use with the accompanying ImageWAM codebase.
Model Details
- Model family: ImageWAM
- Image-editing backbone: FLUX.2 [klein] base
- Variant: FLUX.2 klein-base-9B
- Benchmark: LIBERO
- Training code: yuyangalin/ImageWAM
- Base model weights: Users must separately prepare the FLUX.2 klein-base-9B weights and FLUX.2 autoencoder as described in the ImageWAM README.
Files
Expected file layout:
.
├── model.pt
├── dataset_stats.json
└── config.yaml
- model.pt: ImageWAM checkpoint used by the evaluation scripts.
- dataset_stats.json: normalization statistics required for policy evaluation.
- config.yaml: original training configuration for provenance and reproducibility.
Usage
Install and prepare the ImageWAM repository following the project README. Then download this model repository:
mkdir -p checkpoints/imagewam_release/libero/flux2_klein_9b
huggingface-cli download yuyangalin/ImageWAM-FLUX.2-9B-LIBERO \
--repo-type model \
--local-dir checkpoints/imagewam_release/libero/flux2_klein_9b
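Once the download finishes, a quick check that the three files listed under Files are present can catch an incomplete download before evaluation. The following is a minimal sketch only: the helper name `check_imagewam_files` is hypothetical and not part of the ImageWAM codebase, and the default directory simply mirrors the `--local-dir` used above.

```shell
# Hypothetical helper (not part of the ImageWAM codebase): verify that the
# three files listed under "Files" exist in the download directory.
check_imagewam_files() {
  # Default to the --local-dir used in the download command above.
  local dir="${1:-checkpoints/imagewam_release/libero/flux2_klein_9b}"
  local missing=0
  local f
  for f in model.pt dataset_stats.json config.yaml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  # Returns 0 when all three files exist, 1 otherwise.
  return "$missing"
}
```

For example, `check_imagewam_files && echo "all files present"` prints the confirmation only when nothing is missing.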
Prepare the FLUX.2 9B weights and autoencoder, then set the following environment variables:
export FLUX2_VARIANT=9b
export FLUX2_MODEL_PATH=/path/to/flux-2-klein-base-9b.safetensors
export FLUX2_AE_MODEL_PATH=/path/to/ae.safetensors
export FLUX2_QWEN3_MODEL_SPEC=Qwen/Qwen3-8B
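Before launching evaluation, it may help to confirm that these variables are set and that the two weight paths point at real files. The sketch below assumes only the four variable names shown above. The helper name `check_flux2_env` is hypothetical, and the check says nothing about whether the files are the correct weights.

```shell
# Hypothetical helper (not part of the ImageWAM codebase): check that the
# FLUX.2 variables are set and that the two weight paths are existing files.
check_flux2_env() {
  local status=0
  [ -n "${FLUX2_VARIANT:-}" ] || {
    echo "FLUX2_VARIANT is not set" >&2; status=1; }
  [ -n "${FLUX2_QWEN3_MODEL_SPEC:-}" ] || {
    echo "FLUX2_QWEN3_MODEL_SPEC is not set" >&2; status=1; }
  [ -f "${FLUX2_MODEL_PATH:-}" ] || {
    echo "FLUX2_MODEL_PATH does not point to a file" >&2; status=1; }
  [ -f "${FLUX2_AE_MODEL_PATH:-}" ] || {
    echo "FLUX2_AE_MODEL_PATH does not point to a file" >&2; status=1; }
  # Returns 0 only if every check above passed.
  return "$status"
}
```

Running `check_flux2_env || echo "fix the variables above first"` reports each problem on standard error.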
Evaluate on LIBERO:
export CKPT_PATH="$(pwd)/checkpoints/imagewam_release/libero/flux2_klein_9b/model.pt"
export DATASET_STATS_PATH="$(pwd)/checkpoints/imagewam_release/libero/flux2_klein_9b/dataset_stats.json"
NUM_GPUS=8 FLUX2_VARIANT=9b bash scripts/flux2/run_eval_flux2_libero.sh
Intended Use
This checkpoint is intended for:
- Reproducing ImageWAM LIBERO evaluations.
- Research on robot policy learning, world action models, and image-editing-based action generation.
- Comparison against other LIBERO policy models under the same evaluation setup.
This checkpoint is not intended for safety-critical or real-world robot deployment without additional validation.
Limitations
- Evaluation requires the ImageWAM codebase and the LIBERO benchmark environment.
- The checkpoint assumes the same model variant and configuration used during training. See train_config.yaml.
- Users must separately prepare the matching FLUX.2 9B base model and autoencoder weights.
- Performance may differ if the simulator version, dataset preprocessing, action normalization statistics, or evaluation settings differ from the release setup.
- The 9B variant has higher GPU memory requirements than the 4B variant.
Citation
If you use this checkpoint, please cite the ImageWAM paper:
@misc{zhang2026imagewam,
title={ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing?},
author={Yuyang Zhang and Wenyao Zhang and Zekun Qi and He Zhang and Haitao Lin and Jingbo Zhang and Yao Mu and Xiaokang Yang and Wenjun Zeng and Xin Jin},
year={2026},
eprint={2606.19531},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2606.19531},
}
Acknowledgements
ImageWAM builds on several open-source projects and model families, including FLUX.2, FastWAM, LIBERO, LIBERO-plus, and RoboTwin. Please also follow the licenses and citation requirements of the corresponding upstream projects.