ARGaze Checkpoints

Model

ARGaze is an online egocentric gaze estimation model with a DINOv3 ViT-S/16 visual encoder and an autoregressive heatmap decoder. The released final model is registered as:

DINOv3_TwoCrossHeatmapBiasEfficientARHeatmapGaze

The decoder separates history-biased memory retrieval from current-frame grounding through two cross-attention blocks.

Intended Use

The checkpoints are intended for research reproduction on EGTEA Gaze+, Ego4D gaze, and EgoExo4D egocentric gaze benchmarks. They are not intended for safety-critical eye tracking, biometric identification, or deployment without dataset- and domain-specific validation.

Training Data

The checkpoints were trained on the dataset-specific training splits described in the paper. This repository does not redistribute source videos or frames. Users must obtain datasets from their official sources.

Checkpoint Files

See checkpoints/manifest.json for expected Hugging Face paths, SHA-256 hashes, and local destination paths.
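
A minimal sketch of how downloaded files could be checked against the SHA-256 hashes in the manifest. The manifest schema is not described in this card, so the top-level list layout and the field names `local_path` and `sha256` are assumptions made for illustration; adjust them to the actual file.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_entries(entries, root="."):
    """Compare each listed file with its expected hash.

    ASSUMPTION: every entry is a dict with the hypothetical keys
    "local_path" and "sha256"; the real manifest may use other names.
    Returns a dict mapping local path -> "ok", "mismatch", or "missing".
    """
    results = {}
    for entry in entries:
        rel_path = entry["local_path"]
        path = Path(root) / rel_path
        if not path.is_file():
            results[rel_path] = "missing"
        elif sha256_of_file(path) == entry["sha256"].lower():
            results[rel_path] = "ok"
        else:
            results[rel_path] = "mismatch"
    return results


if __name__ == "__main__":
    manifest_path = Path("checkpoints/manifest.json")
    if manifest_path.is_file():
        # ASSUMPTION: the manifest is a JSON list of entries.
        entries = json.loads(manifest_path.read_text())
        for name, status in verify_entries(entries).items():
            print(f"{status:8s} {name}")
```

Reading in chunks keeps memory use flat for large checkpoint files. A "mismatch" result usually points to an interrupted download or a different checkpoint revision.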

Limitations

  • Performance depends on matching preprocessing, frame extraction, and split definitions.
  • EgoExo4D uses benchmark split metadata included in this release; verify that the split file matches the paper version before reporting final numbers.
  • The DINOv3 encoder is loaded from Hugging Face at runtime, so inference requires network access or a previously populated local cache.