This model is a distilbert-base-uncased sequence classifier fine-tuned on biglam/on_the_books to identify statutory sections labeled as Jim Crow laws.
Each training input concatenates the metadata, chapter text, and section text. Labels are:
0: no_jim_crow
1: jim_crow

Held-out stratified validation split: 20% of the dataset, seed 55.
{
"epoch": 5.0,
"eval_accuracy": 0.9859943977591037,
"eval_f1_jim_crow": 0.975609756097561,
"eval_loss": 0.09546805918216705,
"eval_macro_f1": 0.9828932866931812,
"eval_precision_jim_crow": 0.970873786407767,
"eval_recall_jim_crow": 0.9803921568627451,
"eval_roc_auc": 0.9928681276432142,
"eval_runtime": 1.2417,
"eval_samples_per_second": 287.505,
"eval_steps_per_second": 9.664
}
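The held-out split described above (stratified, 20%, seed 55) can be illustrated with a minimal pure-Python sketch. This is not the author's code: the card does not say which tool produced the split, the function name `stratified_split` and the toy labels are hypothetical, and this sketch will not reproduce the exact rows of the original validation set. It only shows what "stratified" means: each class contributes the same fraction of its examples to the validation set.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=55):
    """Return (train_idx, val_idx), preserving per-class proportions."""
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Take the same fraction from every class.
        n_val = round(len(indices) * test_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy, imbalanced labels (hypothetical): 70 negatives, 30 positives.
labels = [0] * 70 + [1] * 30
train_idx, val_idx = stratified_split(labels)
val_positives = sum(labels[i] for i in val_idx)
```

In practice a library routine such as scikit-learn's `train_test_split` with its `stratify`, `test_size`, and `random_state` arguments does the same job; whether the training script used it is not stated here.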
The training script used class-weighted cross-entropy to account for label imbalance.
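As a rough illustration of class-weighted cross-entropy, the sketch below computes the loss in plain Python. The weights actually used in training are not given in this card, so the inverse-frequency ("balanced") heuristic in `class_weights` is an assumption, as are both function names and the toy logits. The weighted mean divides by the sum of the per-example weights, which matches the mean-reduction behavior of PyTorch's `torch.nn.CrossEntropyLoss` when a `weight` tensor is supplied.

```python
import math


def class_weights(labels, num_classes=2):
    """Inverse-frequency weights: n / (num_classes * count_c). Assumed heuristic."""
    n = len(labels)
    counts = [labels.count(c) for c in range(num_classes)]
    return [n / (num_classes * count) for count in counts]


def weighted_cross_entropy(logits, labels, weights):
    """Weighted mean of per-example negative log-likelihoods."""
    total, weight_sum = 0.0, 0.0
    for row, y in zip(logits, labels):
        # Numerically stable log-sum-exp over the class logits.
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        nll = log_z - row[y]
        total += weights[y] * nll
        weight_sum += weights[y]
    return total / weight_sum


# Toy batch (hypothetical): three no_jim_crow examples, one jim_crow example.
labels = [0, 0, 0, 1]
weights = class_weights(labels)
logits = [[2.0, 0.0], [1.5, 0.5], [0.2, 0.1], [0.0, 1.0]]
loss = weighted_cross_entropy(logits, labels, weights)
```

With these weights, an error on the minority class costs three times as much as an error on the majority class, which keeps the classifier from minimizing loss by favoring the more common label.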
Base model: distilbert/distilbert-base-uncased