EBench: Elemental Diagnosis of Generalist Mobile Manipulation Policies

This repository contains the policy checkpoint and assets for EBench, the benchmark introduced in the paper "EBench: Elemental Diagnosis of Generalist Mobile Manipulation Policies".

Project Page: EBench Home
Documentation: EBench Docs
GitHub Repository: InternRobotics/EBench

Introduction

EBench is an indoor Vision-Language-Action (VLA) manipulation benchmark built on NVIDIA Isaac Sim. Instead of compressing a model's behavior into a single overall success rate, it produces a multi-axis capability profile that exposes what a model is good at and where it overfits. It diagnoses generalist mobile manipulation policies across 26 diverse and challenging tasks annotated along 5 capability dimensions and 4 generalization dimensions.
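
To illustrate the difference between a single overall success rate and a multi-axis profile, here is a minimal Python sketch. It is not EBench's evaluation code: the function `capability_profile`, the task names, the dimension tags, and the success rates are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def capability_profile(results, task_tags):
    """Average per-task success rates over every dimension tag a task carries.

    results:   {task_name: success_rate in [0, 1]}   (hypothetical format)
    task_tags: {task_name: [dimension_tag, ...]}     (hypothetical format)
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for task, success_rate in results.items():
        for tag in task_tags.get(task, ()):
            sums[tag] += success_rate
            counts[tag] += 1
    return {tag: sums[tag] / counts[tag] for tag in sums}


# Made-up tasks, tags, and scores; none of these come from EBench itself.
results = {"pick_cup": 0.75, "open_drawer": 0.25, "navigate_to_table": 0.5}
task_tags = {
    "pick_cup": ["grasping", "unseen_object"],
    "open_drawer": ["articulation"],
    "navigate_to_table": ["navigation", "unseen_object"],
}

# The single number hides which skills drive the result.
overall = sum(results.values()) / len(results)

# The per-dimension view separates strong skills from weak ones.
profile = capability_profile(results, task_tags)

print(f"overall success rate: {overall:.3f}")
for tag, score in sorted(profile.items()):
    print(f"{tag}: {score:.3f}")
```

In this toy data the overall rate sits in the middle of the range, while the per-dimension view shows that one skill is much weaker than the others. That gap is the kind of information a single aggregate number cannot show.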

Citation

@misc{ebench2026,
  title  = {EBench: Elemental Mobile Manipulation Benchmark},
  author = {Shanghai AI Laboratory},
  year   = {2026},
  note   = {Preprint coming soon},
  url    = {https://internrobotics.github.io/EBench-doc/}
}

Paper: EBench: Elemental Diagnosis of Generalist Mobile Manipulation Policies (2606.18239)