This is a fine-tuned NVIDIA GR00T N1.5-3B model specifically trained for condiment handover tasks.
Frozen (not trained):

- `tune_llm=false`: language model kept frozen
- `tune_visual=false`: visual features frozen

Trainable components:

- `tune_diffusion_model=true`: action generation
- `tune_projector=true`: vision-language to action mapping

| Parameter | Value | Description |
|---|---|---|
| Dataset Repository | asgard-robot/asgard_training_data_condiment | Hugging Face dataset |
| Dataset Version | v3.0 | LeRobot format tag |
| Total Episodes | 40 | Number of demonstrations |
| Total Frames | 31,522 | Total training samples |
| Avg Frames/Episode | ~788 | Average trajectory length |
| Episode Duration | ~26 seconds | At 30 FPS |
| Robot Type | so101_follower | Single-arm 6 DOF |
| Task | Condiment handover | Primary objective |
| Format | LeRobot v3.0 | Parquet + MP4 videos (AV1 codec) |
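
The derived rows in the table (average frames per episode and episode duration) follow directly from the episode count, frame count, and frame rate. A minimal sketch of that arithmetic, using only figures from the table; the variable names are illustrative and are not part of the dataset's metadata:

```python
# Figures copied from the dataset table above.
TOTAL_EPISODES = 40
TOTAL_FRAMES = 31_522
FPS = 30

# Average trajectory length in frames, and its duration at the recording rate.
avg_frames_per_episode = TOTAL_FRAMES / TOTAL_EPISODES
avg_episode_seconds = avg_frames_per_episode / FPS

print(f"avg frames/episode: {avg_frames_per_episode:.0f}")
print(f"avg episode duration: {avg_episode_seconds:.1f} s")
```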

| Parameter | Value | Justification |
|---|---|---|
| Total Training Steps | 2,000 | Full training cycle |
| Number of Epochs | ~32 | Effective epochs (2,000 steps × 512 batch ÷ 31,522 frames ≈ 32.5) |
| Checkpoints Saved | 5 | Steps: 400, 800, 1200, 1600, 2000 |
| Learning Rate | 1e-4 | GROOT recommended value |
| Weight Decay | 1e-5 | L2 regularization |
| Gradient Clip Norm | 1.0 | Training stability |
| Warmup Ratio | 0.05 | Gradual learning rate ramp |
| Batch Size (per GPU) | 128 | Maximum VRAM utilization |
| Effective Batch Size | 512 | 128 × 4 GPUs |
| Num Workers | 16 | DataLoader parallel loading |
| Video Backend | torchcodec | AV1 codec decoder |
| Mixed Precision | bf16 | Memory efficient training |
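
Several values in this table can be cross-checked against each other. The sketch below recomputes them from the stated settings. It assumes that warmup length is the warmup ratio times the total step count and that checkpoints are evenly spaced; neither is confirmed by the training configuration shown here, and the variable names are illustrative.

```python
# Figures copied from the training and dataset tables.
TOTAL_FRAMES = 31_522
TOTAL_STEPS = 2_000
PER_GPU_BATCH = 128
NUM_GPUS = 4
WARMUP_RATIO = 0.05
NUM_CHECKPOINTS = 5

# Samples consumed per optimizer step across all GPUs.
effective_batch = PER_GPU_BATCH * NUM_GPUS

# Passes over the dataset: total samples seen divided by dataset size.
effective_epochs = TOTAL_STEPS * effective_batch / TOTAL_FRAMES

# Assumption: warmup length is the ratio applied to the total step count.
warmup_steps = round(TOTAL_STEPS * WARMUP_RATIO)

# Assumption: checkpoints are evenly spaced and include the final step.
checkpoint_interval = TOTAL_STEPS // NUM_CHECKPOINTS
checkpoint_steps = list(
    range(checkpoint_interval, TOTAL_STEPS + 1, checkpoint_interval)
)

print(f"effective batch size: {effective_batch}")
print(f"effective epochs: {effective_epochs:.1f}")
print(f"warmup steps: {warmup_steps}")
print(f"checkpoint steps: {checkpoint_steps}")
```

Under these assumptions the learning rate would ramp over the first 100 steps, and the evenly spaced checkpoints match the steps listed in the table.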

| Component | Specification | Utilization |
|---|---|---|
| GPUs | 4× NVIDIA H100 PCIe | All 4 GPUs used |
| VRAM per GPU | 80GB | ~79.65GB usable |
| Total VRAM | 320GB | Peak usage: ~60-70GB per GPU |
| CPUs | 124 vCPUs, AMD EPYC 9554 (64-Core) | Data loading |
| System RAM | 708GB | Adequate for data loading |
| Storage | 1.5TB ephemeral | Checkpoint storage |

    from lerobot import Policy

    policy = Policy.from_pretrained("asgard-robot/groot-condiment-handover")

    # The model expects observations with:
    # - observation.images.wrist1: RGB camera (640×480×3)
    # - observation.images.realsense: RGB camera (640×480×3)
    # - observation.state: 6D joint positions
    action = policy(observation)
    # Returns: 6D action space (joint positions + gripper)

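
Before calling the policy it can help to confirm that an observation has the expected keys and sizes, and afterwards to label the six action values. The sketch below does both with hypothetical helpers (`check_observation` and `name_action` are not part of LeRobot). It assumes each input exposes a `.shape` attribute, that images may arrive in any axis order with an optional leading batch dimension, and that the action vector follows the order in which this card lists the joints.

```python
from types import SimpleNamespace

# Key names and sizes taken from the comments in the usage snippet.
IMAGE_KEYS = ("observation.images.wrist1", "observation.images.realsense")
STATE_KEY = "observation.state"
IMAGE_DIMS = sorted((640, 480, 3))
STATE_DIM = 6

# Joint names as listed in this card; the ordering is an assumption.
ACTION_NAMES = (
    "shoulder_pan.pos",
    "shoulder_lift.pos",
    "elbow_flex.pos",
    "wrist_flex.pos",
    "wrist_roll.pos",
    "gripper.pos",
)


def check_observation(observation):
    """Return a list of problems; an empty list means the dict looks right."""
    problems = []
    for key in (*IMAGE_KEYS, STATE_KEY):
        if key not in observation:
            problems.append(f"missing key: {key}")
    for key in IMAGE_KEYS:
        if key in observation:
            shape = tuple(observation[key].shape)
            # Compare the last three axes without assuming HWC or CHW layout.
            if sorted(shape[-3:]) != IMAGE_DIMS:
                problems.append(f"{key}: unexpected shape {shape}")
    if STATE_KEY in observation:
        shape = tuple(observation[STATE_KEY].shape)
        if not shape or shape[-1] != STATE_DIM:
            problems.append(f"{STATE_KEY}: unexpected shape {shape}")
    return problems


def name_action(action_values):
    """Pair a flat sequence of six action values with the joint names."""
    values = list(action_values)
    if len(values) != len(ACTION_NAMES):
        raise ValueError(f"expected {len(ACTION_NAMES)} values, got {len(values)}")
    return dict(zip(ACTION_NAMES, values))


# Stand-ins with a .shape attribute; real inputs would be arrays or tensors.
demo_observation = {
    "observation.images.wrist1": SimpleNamespace(shape=(480, 640, 3)),
    "observation.images.realsense": SimpleNamespace(shape=(480, 640, 3)),
    "observation.state": SimpleNamespace(shape=(6,)),
}
print(check_observation(demo_observation))
print(name_action([0.0] * 6))
```

The check is deliberately layout-agnostic because this card gives image sizes but not the tensor layout or dtype the policy expects.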
The model outputs actions for 6 degrees of freedom:

- `shoulder_pan.pos`
- `shoulder_lift.pos`
- `elbow_flex.pos`
- `wrist_flex.pos`
- `wrist_roll.pos`
- `gripper.pos`

To cite this model:

    @software{groot_condiment_model_2024,
      author = {ASGARD Team},
      title = {GROOT Condiment Handover Model - Step 2000},
      model = {asgard-robot/groot-condiment-handover},
      year = {2024},
      month = {October},
      checkpoint = {2000},
      base_model = {nvidia/GR00T-N1.5-3B},
      dataset = {asgard-robot/asgard_training_data_condiment},
      training_hardware = {4× NVIDIA H100 PCIe GPUs}
    }