RadFinder

Links: Project page · Paper · Code · Models

Disease-Aware Vision–Language Pretraining for 3D CT

We pretrain a 3D CT vision–language model on 159k report–volume pairs with two new supervision signals: prompt-based disease labels for classification and intra-scan snippet localization for axial depth grounding. A single unified model reaches state-of-the-art retrieval on CT-RATE, competitive disease classification, and slice-level localization at 12 mm resolution.
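The acknowledgements note that parts of the training framework are SigLIP-based. As a rough illustration of what a sigmoid pairwise contrastive objective over report–volume pairs looks like, here is a minimal pure-Python sketch. It is not the RadFinder implementation: the function names, the temperature and bias values, and the toy 2-D embeddings are all assumptions made for the example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def sigmoid_pairwise_loss(volume_embs, report_embs, temperature=10.0, bias=-10.0):
    """Sigmoid contrastive loss over all volume/report pairs in a batch.

    Pair (i, i) is treated as a match (label +1), every other pair as a
    non-match (label -1). temperature and bias are illustrative constants
    here; in practice they would typically be learned.
    """
    vols = [l2_normalize(v) for v in volume_embs]
    reps = [l2_normalize(r) for r in report_embs]
    total = 0.0
    for i, vol in enumerate(vols):
        for j, rep in enumerate(reps):
            similarity = sum(a * b for a, b in zip(vol, rep))
            logit = temperature * similarity + bias
            label = 1.0 if i == j else -1.0
            total += log_sigmoid(label * logit)
    return -total / len(vols)


if __name__ == "__main__":
    # Toy embeddings (hypothetical): aligned pairs vs. swapped pairs.
    aligned = sigmoid_pairwise_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])
    swapped = sigmoid_pairwise_loss([[1, 0], [0, 1]], [[0, 1], [1, 0]])
    print(f"aligned={aligned:.3f} swapped={swapped:.3f}")
```

Unlike a softmax-based contrastive loss, each pair is scored independently, so the objective does not require normalizing over the whole batch.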

Usage

See the GitHub repository.
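For fetching the released weights, one option is the `huggingface_hub` client. The sketch below only downloads the files of the `lmb-freiburg/radfinder` model repository; the helper name `download_radfinder` is hypothetical, and loading the model and running inference are covered by the GitHub repository, not by this snippet.

```python
REPO_ID = "lmb-freiburg/radfinder"  # model repository on the Hugging Face Hub


def download_radfinder(local_dir=None):
    """Download all files of the model repository and return the local path.

    Hypothetical helper for illustration. Requires `pip install huggingface_hub`.
    If local_dir is None, files go to the default Hugging Face cache.
    """
    # Imported lazily so the module can be imported without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


if __name__ == "__main__":
    path = download_radfinder()
    print(f"Model files downloaded to: {path}")
```

Note that the weights are released under CC BY-NC-SA 4.0, as described in the License section.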

Training data

  • RefCT (internal): ~98k report–volume pairs from ~50k patients at a single hospital; in-house clinical data, not publicly released.
  • CT-RATE (CC BY-NC-SA 4.0)
  • Merlin (Stanford AIMI non-commercial research DUA)
  • INSPECT (Stanford AIMI non-commercial research DUA)

Further acknowledgements

  • The model and parts of the SigLIP training framework in src/radfinder are based on SPECTRE.
  • The text processing pipeline in src/rate, which derives binary labels from text reports, is based on RATE.
  • We thank the maintainers of MONAI, timm, and Hugging Face transformers, as well as the maintainers of all other packages listed in requirements.txt.
  • The demo scan under assets/demo/s0859/ is case s0859 from TotalSegmentator v2 (Wasserthal et al., CC-BY-4.0).
  • Funding, additional acknowledgements, full citations: see paper.

License

  • All code is MIT (see LICENSE) unless a file header says otherwise. Files in src/rate/ that carry a # Vendored from YalaLab/rate ... (ECL 2.0) header are derivatives of the upstream rate package and are licensed under ECL 2.0 (see LICENSE_RATE).
  • RadFinder model weights are CC BY-NC-SA 4.0, see LICENSE_MODELS.
    • Note: the weights are subject to the original dataset licenses. Users intending to use RadFinder in commercial settings should verify dataset and model licensing and obtain any required permissions.

Citation

If you use this code, models, or results, please cite:

@inproceedings{ging2026radfinder,
  author    = {Simon Ging and Philipp Arnold and Sebastian Walter and Hani Alnahas and Hannah Bast and Elmar Kotter and Jiancheng Yang and Behzad Bozorgtabar and Thomas Brox},
  title     = {Learning to Read Where to Look: Disease-Aware Vision--Language Pretraining for 3{D} {CT}},
  booktitle = {Medical Image Computing and Computer Assisted Intervention -- {MICCAI} 2026, Strasbourg, France, September 27 -- October 1, 2026, Proceedings},
  series    = {Lecture Notes in Computer Science},
  publisher = {Springer},
  year      = {2026},
  note      = {To appear},
}

Model format: Safetensors, 1B parameters, F32 tensors.