Leveraging RGB-D Data with Cross-Modal Context Mining for Glass Surface Detection
Paper • 2206.11250 • Published
Pre-trained weights for the model introduced in:
Leveraging RGB-D Data with Cross-Modal Context Mining for Glass Surface Detection
Jiaying Lin*, Yuen-Hei Yeung*, Shuquan Ye, Rynson W. H. Lau
AAAI 2025
arXiv · Project Page · Dataset (RGBD-GSD)
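RGBD-GSD-Net detects glass surfaces by jointly processing RGB images and depth maps. It introduces two novel modules:

- a cross-modal context mining (CCM) module, which adaptively learns individual and mutual context features from the RGB and depth inputs;
- a depth-missing aware attention (DAA) module, which exploits the spatial locations where depth readings are missing to help locate glass surfaces.
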
The backbone is a ResNeXt encoder shared across both modalities.
| File | Description |
|---|---|
| `best.pth` | Best checkpoint (204 MB), saved as `{'model': state_dict, ...}` |
| `results/our_best_results.zip` | Model predictions on the RGBD-GSD test set |
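```python
import torch
from networks.your_network import RGBDGlassNet  # from the code release

model = RGBDGlassNet()
# Load on CPU first; move the model to a GPU afterwards if one is available.
checkpoint = torch.load("best.pth", map_location="cpu")
# The weights are stored under the "model" key of the checkpoint dict.
model.load_state_dict(checkpoint["model"])
model.eval()  # inference mode: fixes batch-norm statistics and disables dropout
```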
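
If `load_state_dict` reports unexpected or missing keys, the parameter names in the checkpoint may not match the model's. The sketch below is a hypothetical helper, not part of the code release: `extract_state_dict` pulls the weights out of the `{'model': state_dict, ...}` layout and strips a leading `module.` prefix, which `torch.nn.DataParallel` adds to parameter names. Whether `best.pth` actually carries that prefix is an assumption; if it does not, the helper returns the names unchanged.

```python
from collections import OrderedDict
from typing import Any, Mapping


def extract_state_dict(checkpoint: Mapping[str, Any], key: str = "model") -> "OrderedDict[str, Any]":
    """Return the weights stored under `key`, with any 'module.' prefix removed.

    Hypothetical helper: the 'module.' handling assumes the checkpoint may have
    been saved from a model wrapped in torch.nn.DataParallel.
    """
    if key not in checkpoint:
        raise KeyError(f"expected a '{key}' entry, found keys: {sorted(checkpoint)}")
    prefix = "module."
    cleaned = OrderedDict()
    for name, value in checkpoint[key].items():
        # Drop the wrapper prefix if present; otherwise keep the name as-is.
        cleaned[name[len(prefix):] if name.startswith(prefix) else name] = value
    return cleaned


# Stand-in checkpoint that uses plain Python values instead of tensors.
demo = {"model": {"module.conv1.weight": 1, "conv2.bias": 2}, "epoch": 10}
print(list(extract_state_dict(demo)))  # ['conv1.weight', 'conv2.bias']
```

With a real checkpoint, the call would be `model.load_state_dict(extract_state_dict(checkpoint))`.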
Download the checkpoint:
```bash
huggingface-cli download garrying/RGBD-GSD-Net best.pth --local-dir ./weights
```
This model was trained and evaluated on RGBD-GSD, the first large-scale RGB-D glass surface detection dataset.
```bibtex
@article{aaai2025_rgbdglass,
  author  = {Lin, Jiaying and Yeung, Yuen-Hei and Ye, Shuquan and Lau, Rynson W.H.},
  title   = {Leveraging RGB-D Data with Cross-Modal Context Mining for Glass Surface Detection},
  journal = {AAAI},
  year    = {2025},
}
```
Non-commercial use only (CC BY-NC 4.0).