Human Universal Grasping
Paper: arXiv:2606.17054
HUG is a flow-matching model that generates diverse human grasps for any user-specified object in a single RGB-D image. Trained on a large-scale egocentric dataset of human grasps (1M-HUGs), the model predicts human-like grasps that can be retargeted to various robot hands for zero-shot manipulation.
The codebase is tested on Ubuntu 22.04/24.04, CUDA 12.8, PyTorch 2.9.1, and Python 3.10.
# 1) Environment setup
conda env create -f environment.yaml && conda activate hug
pip install torch==2.9.1 torchvision==0.24.1 torchaudio==2.9.1 --index-url https://download.pytorch.org/whl/cu128
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.9.1+cu128.html
pip install --no-build-isolation git+https://github.com/mattloper/chumpy.git@580566e
pip install -e .
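After installation, it can help to confirm that the active environment matches the tested versions listed above. The following is a minimal sketch, not part of the HUG codebase: the `TESTED` table, `matches`, and `report` are hypothetical names introduced here, and only the Python and PyTorch versions are checked. It reads installed package metadata rather than importing torch, so it also runs when PyTorch is missing.

```python
import importlib.metadata as md
import platform

# Versions this README reports as tested (copied from the text above).
TESTED = {"python": "3.10", "torch": "2.9.1"}


def matches(found: str, expected: str) -> bool:
    """Return True if `found` equals `expected` or refines it.

    A local build tag such as '+cu128' is ignored, so '2.9.1+cu128'
    matches '2.9.1', and '3.10.12' matches '3.10'.
    """
    found_parts = found.split("+")[0].split(".")
    expected_parts = expected.split(".")
    return found_parts[: len(expected_parts)] == expected_parts


def report() -> dict:
    """Map each component to (installed version or None, matches tested version)."""
    found = {"python": platform.python_version()}
    try:
        found["torch"] = md.version("torch")
    except md.PackageNotFoundError:
        found["torch"] = None
    return {
        name: (found[name], found[name] is not None and matches(found[name], want))
        for name, want in TESTED.items()
    }


if __name__ == "__main__":
    for name, (version, ok) in report().items():
        status = "ok" if ok else "differs from tested version"
        print(f"{name}: {version} ({status})")
```

A mismatch does not necessarily mean the code will fail; it only indicates that the setup differs from the configuration the authors tested.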
Please refer to the official repository for instructions on downloading required assets like MANO models.
Download the full model weights (.safetensors) using the Hugging Face CLI (hf):
hf download kevinywu/hug hug_full.safetensors --local-dir checkpoints/
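To check that the download completed, one option is to read the checkpoint's header without loading any tensors. The sketch below relies only on the public safetensors file layout (an 8-byte little-endian header length followed by a UTF-8 JSON header) and the Python standard library. The helper names `read_safetensors_header` and `summarize` are hypothetical, and nothing here assumes particular tensor names inside the HUG checkpoint.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            raise ValueError("file is too short to be a safetensors file")
        # The first 8 bytes hold the header size as an unsigned little-endian integer.
        (header_len,) = struct.unpack("<Q", prefix)
        raw = f.read(header_len)
        if len(raw) != header_len:
            raise ValueError("header is truncated; the download may be incomplete")
    return json.loads(raw.decode("utf-8"))


def summarize(header):
    """Map tensor name -> (dtype, shape), skipping the optional __metadata__ entry."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }


if __name__ == "__main__":
    # Path produced by the download command above.
    ckpt = Path("checkpoints/hug_full.safetensors")
    if ckpt.exists():
        tensors = summarize(read_safetensors_header(ckpt))
        print(f"{len(tensors)} tensors found in {ckpt}")
    else:
        print(f"{ckpt} not found; run the download command first")
```

Because only the header is parsed, this runs quickly even on large checkpoints, but it does not verify the tensor data itself.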
HUG predicts human grasps in the MANO hand-model format. You can run the interactive application to predict grasps for objects in the camera frame:
CKPT=checkpoints/hug_full.safetensors
DATA=data/hug_bench/
# Launch the app: click an object to predict a grasp
python -m hug.app --checkpoint-path "$CKPT" --dataset-path "$DATA" --save-pred
# Visualize saved predictions
python -m hug.visualize_predictions --dataset-path "$DATA"
@article{wu2026hug,
title={Human Universal Grasping},
author={Kevin Yuanbo Wu and Tianxing Zhou and Isaac Tu and Billy Yan and Irmak Guzey and David Fouhey and Dandan Shan and Lerrel Pinto},
journal={arXiv preprint arXiv:2606.17054},
year={2026}
}