DentVLM: A Multimodal Vision-Language Model for Comprehensive Dental Diagnosis and Enhanced Clinical Practice
DentVLM is a multimodal vision-language model for dental image understanding and diagnosis-oriented question answering. It supports dental image-question inputs for research tasks including malocclusion recognition, dental disease recognition, and region-aware dental image analysis. The model is released as a research artifact to support reproducibility and further academic research.
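As a minimal sketch of what an image-question input can look like, the snippet below builds a single-turn chat message that pairs one dental image with one question. It assumes the message layout used by the Transformers chat-template interface for image-text models; the helper name `build_dental_messages`, the image URL, and the question text are hypothetical and serve only as illustration.

```python
from typing import Any


def build_dental_messages(image_url: str, question: str) -> list[dict[str, Any]]:
    """Return a single-turn chat message pairing one image with one question."""
    cleaned = question.strip()
    if not cleaned:
        raise ValueError("question must be non-empty")
    return [
        {
            "role": "user",
            "content": [
                # Assumed layout: one image entry followed by one text entry.
                {"type": "image", "url": image_url},
                {"type": "text", "text": cleaned},
            ],
        }
    ]


# Hypothetical image location and prompt, for illustration only.
messages = build_dental_messages(
    "https://example.org/intraoral_photo.jpg",
    "Is there evidence of anterior crowding in this image?",
)
print(messages[0]["content"][1]["text"])
```

Keeping message construction in one small function makes it easier to hold the prompt wording fixed across an evaluation run.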
Model Access
The DentVLM model weights are publicly available from this Hugging Face model repository under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Please cite the associated manuscript and this repository when using DentVLM.
Source Code and Reproducibility
The source code, training scripts, inference scripts, evaluation scripts, and example data format are available at:
https://github.com/ZJUI-AI4H/DentVLM
The GitHub repository includes instructions for environment setup, model loading, inference, and evaluation. Users should refer to the repository documentation for the exact software versions and command-line examples used in the associated study.
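For orientation only, the following is a rough sketch of loading the weights and generating an answer with the Transformers library. It assumes the checkpoint loads with the Qwen2-VL classes (DentVLM is built on Qwen2-VL-7B-Instruct) and that a recent Transformers release with multimodal chat-template support is installed; the function name `run_dentvlm` and its defaults are hypothetical. The repository documentation remains the reference for the exact versions and commands used in the study.

```python
def run_dentvlm(messages, model_id="ZJU-AI4H/DentVLM", max_new_tokens=128):
    """Generate a text answer for chat-formatted image-question messages."""
    # Deferred imports: the heavy dependencies are only needed when called.
    import torch
    from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

    # Assumption: the released checkpoint is compatible with the Qwen2-VL classes.
    processor = AutoProcessor.from_pretrained(model_id)
    model = Qwen2VLForConditionalGeneration.from_pretrained(model_id)
    model.eval()

    # Tokenize the chat messages (including the image) into model inputs.
    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated answer is decoded.
    answer_ids = output_ids[0][inputs["input_ids"].shape[-1]:]
    return processor.decode(answer_ids, skip_special_tokens=True)
```

Calling the function downloads the full model weights, so it is best run on a machine with sufficient memory and, ideally, a GPU.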
Intended Use
DentVLM is intended for:
- Academic and non-commercial research on dental multimodal vision-language modeling
- Dental image understanding and question-answering research
- Reproduction and extension of the DentVLM training, inference, and evaluation pipeline
- Benchmarking on dental multimodal tasks
- Development of research workflows for dental AI evaluation
Not Intended For
DentVLM is not intended or approved for:
- Use as the sole basis for clinical diagnosis, treatment planning, triage, or patient management
- Emergency medical or dental decision-making
- Autonomous or automated clinical decision-making without appropriate validation, professional oversight, and regulatory approval
- Commercial use of the released model weights without separate permission from the rights holder
- Unlawful, harmful, privacy-invasive, or unethical applications
Limitations
- DentVLM is developed as a research model for dental image understanding and diagnosis-oriented question answering.
- Model outputs should be interpreted in the context of professional expertise and task-specific evaluation.
- Model performance may vary with image quality, imaging modality, acquisition conditions, patient population, annotation standards, and prompt formulation.
- Performance in new clinical environments, imaging protocols, or patient populations may differ from the results reported in the associated study.
- Users are responsible for conducting appropriate validation before any downstream research or translational use.
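One simple way to check for the kind of performance variation described above is to report accuracy separately for each subgroup, such as imaging modality or acquisition site. The sketch below assumes each evaluation record is a dictionary with `label` and `prediction` fields plus a grouping field; these field names and the toy records are made up for illustration and are not DentVLM results.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key):
    """Compute accuracy separately for each value of `group_key`."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        total[group] += 1
        correct[group] += int(rec["prediction"] == rec["label"])
    return {group: correct[group] / total[group] for group in total}


# Toy, made-up records that only show the call shape.
records = [
    {"modality": "panoramic", "label": "caries", "prediction": "caries"},
    {"modality": "panoramic", "label": "healthy", "prediction": "caries"},
    {"modality": "intraoral", "label": "caries", "prediction": "caries"},
]
print(accuracy_by_group(records, "modality"))
```

A large gap between subgroups suggests that aggregate accuracy alone would hide a weakness on the local data.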
Ethical Considerations
Users should ensure that all dental images and associated data are collected, processed, stored, and used in compliance with applicable privacy, consent, institutional review, and data protection requirements.
Users should not use DentVLM to attempt to identify, re-identify, or infer sensitive information about any individual. The model should not be used for automated clinical decision-making without appropriate validation, professional oversight, and regulatory approval.
License
DentVLM model weights
The DentVLM fine-tuned model weights and DentVLM-specific model release materials are licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0), unless otherwise stated.
Under CC BY-NC 4.0, the released model weights may be used, shared, and adapted for non-commercial purposes, provided that appropriate attribution is given and the license terms are followed.
Source code
The DentVLM source code, including training, inference, and evaluation scripts, is released in the GitHub repository under the Apache License 2.0, unless otherwise stated in individual files.
Upstream components
DentVLM is built on Qwen/Qwen2-VL-7B-Instruct. Qwen2-VL-7B-Instruct is released by Alibaba Cloud under the Apache License 2.0. Third-party components, including Qwen2-VL, LLaMA-Factory, vLLM, and their associated files, remain subject to their original licenses and notices.
This model release does not grant rights to use any third-party trademarks or protected clinical data.
Citation
If you use DentVLM, please cite the associated manuscript and this model repository:
@article{meng2025dentvlm,
title={{DentVLM}: A Multimodal Vision-Language Model for Comprehensive Dental Diagnosis and Enhanced Clinical Practice},
author={Meng, Zijie and Hao, Jin and Dai, Xiwei and Feng, Yang and Liu, Jiaxiang and Feng, Bin and Wu, Huikai and Gai, Xiaotang and Zhu, Hengchuan and Hu, Tianxiang and others},
journal={arXiv preprint arXiv:2509.23344},
year={2025}
}