DeBERTa-v3-base for Disinformation Detection (Binary Classification)
This model is a fine-tuned version of microsoft/deberta-v3-base for binary disinformation detection. It classifies news articles as either credible (0) or disinformation (1).
Model Details
- Base model: microsoft/deberta-v3-base
- Task: Binary text classification (credible vs. disinformation)
- Language: English
- Training framework: PyTorch + Transformers
Training Hyperparameters
| Parameter | Value |
|---|---|
| Learning rate | 1e-05 |
| Batch size (train) | 16 |
| Batch size (eval) | 16 |
| Epochs | 5 |
| Weight decay | 0.01 |
| Warmup ratio | 0.06 |
| FP16 | True |
| Max sequence length | 512 |
| Seed | 42 |
| Eval steps | 100 |
| Best model selection | binary_f1_pos |
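As a rough illustration of what these settings imply, the sketch below copies the table into a plain dictionary and derives the step counts of the schedule. The keyword names are an assumed mapping onto `transformers.TrainingArguments`, not the authors' published configuration. `schedule_summary` is a hypothetical helper, and the training-split size of 4000 is made up for the example, since the card does not state the split sizes.

```python
import math

# Values copied from the hyperparameter table above. The keys are an assumed
# mapping onto transformers.TrainingArguments keyword names.
TRAINING_KWARGS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 16,  # assumes the table's batch size is per device
    "per_device_eval_batch_size": 16,
    "num_train_epochs": 5,
    "weight_decay": 0.01,
    "warmup_ratio": 0.06,
    "fp16": True,  # mixed precision needs a GPU
    "seed": 42,
    "eval_steps": 100,
    "metric_for_best_model": "binary_f1_pos",
    "greater_is_better": True,       # assumption: a higher F1 is better
    "load_best_model_at_end": True,  # assumption: implied by "best model selection"
}


def schedule_summary(n_train_examples, kwargs=TRAINING_KWARGS):
    """Derive step counts, assuming one device and no gradient accumulation."""
    steps_per_epoch = math.ceil(n_train_examples / kwargs["per_device_train_batch_size"])
    total_steps = steps_per_epoch * kwargs["num_train_epochs"]
    warmup_steps = math.ceil(total_steps * kwargs["warmup_ratio"])
    return {
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
        "evaluations": total_steps // kwargs["eval_steps"],
    }


# 4000 is a hypothetical training-split size, used only for illustration.
print(schedule_summary(4000))
```

The maximum sequence length of 512 is not part of this dictionary because it is applied at tokenization time rather than through the training arguments.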
Evaluation Results
Overall (Test Set)
| Metric | Value |
|---|---|
| Binary F1 (positive) | 0.9041 |
| Macro F1 | 0.9342 |
| Accuracy | 0.948 |
| AUC-ROC | 0.9864 |
| Precision | 0.9223 |
| Recall | 0.9485 |
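Binary F1 (positive) and Macro F1 measure different things: the first scores only the disinformation class, while the second averages the per-class F1 of both classes. The sketch below shows the difference on a toy example. The labels are invented for illustration and are unrelated to the test set, and the function names are hypothetical. AUC-ROC is omitted because it needs predicted scores rather than hard labels.

```python
def class_scores(y_true, y_pred, cls):
    """Precision, recall, and F1 when `cls` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def summarize(y_true, y_pred):
    """Binary F1 for class 1, macro F1 over classes 0 and 1, and accuracy."""
    f1_pos = class_scores(y_true, y_pred, 1)[2]
    f1_neg = class_scores(y_true, y_pred, 0)[2]
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {
        "binary_f1_pos": f1_pos,
        "macro_f1": (f1_pos + f1_neg) / 2,
        "accuracy": accuracy,
    }


# Toy labels: 0 = credible, 1 = disinformation.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]
print(summarize(y_true, y_pred))
```

On this toy data the positive-class F1 comes out lower than the macro F1, because the larger credible class is predicted slightly better.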
Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "pjait/deberta-disinfo-detection"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "Your article text here..."
# Articles longer than 512 tokens are truncated to the training-time maximum.
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    logits = model(**inputs).logits

# The sigmoid assumes the classification head emits a single logit
# for the disinformation class (label 1).
probability = torch.sigmoid(logits).item()
prediction = "disinformation" if probability >= 0.5 else "credible"
print(f"Prediction: {prediction} (probability: {probability:.4f})")
```
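The decision rule in the snippet above can be worked through without loading the model. The sketch below assumes a single logit for the disinformation class and applies the same sigmoid and 0.5 threshold. `decide` and the example logits are hypothetical and chosen only to show the mapping.

```python
import math

# Label ids as stated in this card: 0 = credible, 1 = disinformation.
ID2LABEL = {0: "credible", 1: "disinformation"}


def sigmoid(logit: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def decide(logit: float, threshold: float = 0.5):
    """Return (label, probability) for one logit of the disinformation class."""
    probability = sigmoid(logit)
    label_id = int(probability >= threshold)
    return ID2LABEL[label_id], probability


# Made-up logits for illustration.
for logit in (-2.0, 0.0, 3.0):
    label, probability = decide(logit)
    print(f"logit={logit:+.1f} -> probability={probability:.4f} -> {label}")
```

With a threshold of 0.5, the rule is equivalent to checking whether the logit is non-negative, so a logit of exactly 0 is labeled as disinformation.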
Training Data
The model was trained on a dataset of about 5,000 news articles for Task 1 (binary classification). Each article was annotated by multiple annotators for credibility assessment.
Limitations
- The model is trained on English-language articles only.
- Performance may vary on domains or topics not represented in the training data.
- The model should be used as a tool to assist human judgment, not as a sole decision-maker.
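One way to keep a human in the loop is to treat the model's probability as a triage signal rather than a verdict. The sketch below is a minimal illustration of that idea. The `triage` function and the 0.2 and 0.8 cut-offs are hypothetical and would need tuning on held-out data.

```python
def triage(probability: float, low: float = 0.2, high: float = 0.8) -> str:
    """Route an article by the model's disinformation probability.

    The cut-offs are illustrative assumptions, not values from this card.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if probability >= high:
        return "flag_for_priority_review"
    if probability <= low:
        return "no_flag"
    return "needs_human_review"


# Made-up probabilities for illustration.
for p in (0.05, 0.50, 0.93):
    print(f"{p:.2f} -> {triage(p)}")
```

Even the flagged and unflagged buckets only set priorities; the final credibility judgment stays with a person.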
Citation
If you use this model, please cite the accompanying paper. The paper is currently under review; citation details will be added once it is available.
Authors
Evaluation results
- F1 (positive class): 0.904 (self-reported)
- Accuracy: 0.948 (self-reported)
- AUC-ROC: 0.986 (self-reported)