ChristBERT/sciGNAD
How to use ChristBERT/sciGNAD_tcls with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="ChristBERT/sciGNAD_tcls")
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("ChristBERT/sciGNAD_tcls", dtype="auto")
ChristBERT/sciGNAD is a German binary text classification model for distinguishing scientific content from general-domain text.
The model is based on GeistBERT base and fine-tuned for domain filtering, that is, separating scientific content from non-scientific text.
The model was fine-tuned on a binary-labeled dataset derived from the 10kGNAD dataset.
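As a rough illustration of the domain filtering use case, the sketch below keeps only the texts that a classifier marks as scientific. The helper name `filter_scientific`, the positive label `"LABEL_1"`, and the 0.5 threshold are assumptions made for this example; the card does not state the model's label names, so check `id2label` in the model config before relying on them.

```python
from typing import Callable, Dict, List, Sequence

# One prediction per input text, in the shape returned by a
# Transformers text-classification pipeline: {"label": ..., "score": ...}
Prediction = Dict[str, object]


def filter_scientific(
    texts: Sequence[str],
    classify: Callable[[List[str]], List[Prediction]],
    positive_label: str = "LABEL_1",  # assumption: label name is not given in the card
    threshold: float = 0.5,           # assumption: illustrative cut-off
) -> List[str]:
    """Return the texts whose top prediction is the positive label."""
    predictions = classify(list(texts))
    return [
        text
        for text, pred in zip(texts, predictions)
        if pred["label"] == positive_label and float(pred["score"]) >= threshold
    ]
```

The `pipe` object from the usage snippet above can be passed as `classify`, for example `filter_scientific(documents, pipe)`, since a text-classification pipeline called on a list of strings returns one label/score dictionary per input.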
The model was evaluated on a manually labeled dataset:
| Metric | Score |
|---|---|
| F1 Score | 80.34% |
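
For reference, a binary F1 score is the harmonic mean of precision and recall. The card does not state which class was treated as positive or how the score was averaged, so the following is only a sketch of the standard binary definition; the function name `binary_f1` and the label encoding (1 for scientific) are assumptions for illustration.

```python
from typing import Sequence


def binary_f1(gold: Sequence[int], predicted: Sequence[int], positive: int = 1) -> float:
    """Compute F1 for the positive class from gold and predicted labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```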
Base model: GeistBERT_base
If you use this model, please cite:
@misc{christbert_scignad,
title={ChristBERT/sciGNAD: Scientific Content Classifier for German Web Data},
author={Schmitt, Raphael},
year={2026}
}
@misc{he2026wordwaystrategiesdomainspecific,
title={The Word and the Way: Strategies for Domain-Specific BERT Pre-Training in German Medical NLP},
author={Henry He and Johann Frei and Raphael Schmitt},
year={2026},
eprint={2606.03250},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2606.03250},
}
Base model
GottBERT/GottBERT_filtered_base_best