🤖 Help Classifier Model (v2)

🧠 Overview

The Help Classifier Model (v2) is a fine-tuned NLP model designed to classify student help requests into meaningful categories within a collaborative learning environment.

This model is part of a larger AI system built for the Coding in Color (CIC) ecosystem, supporting students working across domains such as AI development, game development, 2D/3D art, and robotics.

Its primary purpose is to:

  • Interpret real student messages
  • Identify intent behind help requests
  • Route inputs to appropriate downstream systems (e.g., generators, agents)

🚀 Version Update (v1 → v2)

🔹 v1

  • Trained on ~100 examples
  • Limited generalization
  • Struggled with messy or informal input

🔹 v2 (Current)

  • Trained on 1,000 examples

  • Balanced dataset across all categories

  • Strong performance on:

    • informal/slang input
    • mixed tone messages
    • ambiguous phrasing
    • real CIC-style check-ins

👉 v2 significantly improves accuracy, stability, and real-world usability over v1.


🧩 Task Definition

Task Type: Text Classification

Input: Student message

Output: One of 5 help categories


🏷️ Labels

| Label | Description |
| --- | --- |
| learning_help | User is trying to understand a concept or skill |
| project_help | User needs direction or next steps in a project |
| technical_issue | Something is broken or not working |
| attendance_issue | User missed a meeting or needs to catch up |
| general_guidance | User expresses uncertainty, stress, or needs advice |
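
The five labels above can be written down as an explicit mapping, which is the form a sequence-classification head usually expects. This is an illustrative sketch only: the integer order is an assumption (the model's real index order lives in its config), and `validate_label` is a hypothetical helper, not part of the model.

```python
# Label names come from the table above; the integer order is an assumption.
LABELS = [
    "learning_help",
    "project_help",
    "technical_issue",
    "attendance_issue",
    "general_guidance",
]

id2label = dict(enumerate(LABELS))
label2id = {name: idx for idx, name in id2label.items()}


def validate_label(name: str) -> str:
    """Return the label unchanged, or raise if it is not one of the five categories."""
    if name not in label2id:
        raise ValueError(f"unknown label: {name!r}")
    return name
```

Downstream code can call `validate_label` on any prediction before acting on it, so an unexpected label string fails loudly instead of being routed silently.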

🏗️ Model Architecture

  • Base Model: distilbert-base-uncased
  • Fine-tuned for sequence classification
  • Number of labels: 5
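
A minimal sketch of how an architecture like this is commonly instantiated with the Transformers library, assuming it is installed. `build_model` is a hypothetical helper name; the author's actual training script is not shown in this card.

```python
BASE_MODEL = "distilbert-base-uncased"
NUM_LABELS = 5


def build_model():
    """Load DistilBERT with a freshly initialized 5-way classification head."""
    # Imported lazily so the constants above can be reused without Transformers.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    model = AutoModelForSequenceClassification.from_pretrained(
        BASE_MODEL, num_labels=NUM_LABELS
    )
    return tokenizer, model
```

The classification head returned here is untrained; it only becomes the help classifier after fine-tuning on labeled messages.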

⚙️ Training Configuration

  • Epochs: 4
  • Learning Rate: 2e-5
  • Batch Size: 8
  • Weight Decay: 0.01
  • Train/Validation/Test Split: 80/10/10
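
The 80/10/10 ratio reads as a three-way train/validation/test split. The sketch below shows one simple way to produce such a split with the standard library. It is an assumption, not the author's procedure: the card does not say whether the data was shuffled this way or stratified by label, and the seed of 42 is only borrowed from the training seed listed later in the card.

```python
import random


def split_80_10_10(examples, seed=42):
    """Shuffle and split examples into train/validation/test (80/10/10)."""
    items = list(examples)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder, so no example is dropped
    return train, val, test
```

For a balanced dataset, a stratified split (equal label proportions in each part) may be preferable to a plain shuffle.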

📊 Training Results

| Epoch | Training Loss | Validation Loss |
| --- | --- | --- |
| 1 | 0.552 | 0.512 |
| 2 | 0.111 | 0.122 |
| 3 | 0.032 | 0.077 |
| 4 | 0.025 | 0.064 |
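
As a quick sanity check on the numbers above, the short script below confirms that validation loss falls at every epoch and computes the gap between validation and training loss. It is only a reading aid for the table, not part of the training code.

```python
# Per-epoch losses copied from the table above.
train_loss = [0.552, 0.111, 0.032, 0.025]
val_loss = [0.512, 0.122, 0.077, 0.064]


def strictly_decreasing(values):
    """True if every value is lower than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# Validation minus training loss per epoch; a gap that keeps widening while
# validation loss rises would hint at overfitting.
gaps = [round(v - t, 3) for t, v in zip(train_loss, val_loss)]

print(strictly_decreasing(val_loss))  # True
print(gaps)  # [-0.04, 0.011, 0.045, 0.039]
```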

📈 Performance Summary

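  • Final validation loss of 0.064 after epoch 4

  • Generalizes to held-out validation inputs: validation loss stays close to training loss (0.064 vs. 0.025 at epoch 4)

  • Stable convergence: both losses decrease at every epoch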

  • Handles:

    • messy/slang text
    • indirect requests
    • multi-layered inputs

🧪 Example Predictions

Input:

i missed the meeting and now idk what we’re doing

Output:

attendance_issue

Input:

my model works but the predictions are weird and I don’t know why

Output:

technical_issue

Input:

I feel like I’m behind and don’t know what to focus on

Output:

general_guidance
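
Predictions like the ones above could be reproduced with a Transformers text-classification pipeline. The sketch below is hedged in two ways: the repository id `King-8/help-classifier-v2` is assumed from the author's Hub namespace, and the returned label strings match the category names only if the model config carries them (otherwise the pipeline returns generic `LABEL_n` names). `load_classifier` and `classify` are hypothetical helper names.

```python
from typing import Callable, Dict, List

MODEL_ID = "King-8/help-classifier-v2"  # assumed Hub repository id

# A classifier maps one message to a list of {"label", "score"} dicts,
# which is the shape a Transformers text-classification pipeline returns.
Classifier = Callable[[str], List[Dict[str, object]]]


def load_classifier() -> Classifier:
    """Build a text-classification pipeline (downloads the model on first use)."""
    from transformers import pipeline  # lazy import: requires `transformers`

    return pipeline("text-classification", model=MODEL_ID)


def classify(message: str, classifier: Classifier) -> str:
    """Return the top predicted label for a single student message."""
    top = classifier(message)[0]
    return str(top["label"])
```

Typical usage would be `classify("i missed the meeting and now idk what we're doing", load_classifier())`. Passing the classifier in as an argument also makes it easy to substitute a stub in tests.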

🔗 System Integration

This model is integrated into an MCP (Model Context Protocol) system, where it acts as the entry-point classifier for routing student inputs.

Pipeline example:

User Input → Help Classifier → (Future: Generator / Summarizer)
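
The pipeline above can be sketched as a small routing table keyed by predicted label. Everything downstream of the classifier here is hypothetical: the destination names and the `human_review` fallback are placeholders, since the card lists the generator and summarizer stages only as future work.

```python
from typing import Callable, Tuple

# Hypothetical destination names, one per label.
ROUTES = {
    "learning_help": "concept_explainer",
    "project_help": "project_planner",
    "technical_issue": "debug_agent",
    "attendance_issue": "meeting_summarizer",
    "general_guidance": "mentor_handoff",
}
FALLBACK = "human_review"  # used when the label is not recognized


def route(message: str, classify: Callable[[str], str]) -> Tuple[str, str]:
    """Classify a message and return (label, destination)."""
    label = classify(message)
    return label, ROUTES.get(label, FALLBACK)
```

Keeping the routing table separate from the classifier means new destinations can be added without retraining the model.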

🎯 Use Cases

  • Help request classification
  • Slack/Discord message routing
  • Educational AI assistants
  • CIC ecosystem tools
  • AI agent pipelines

⚠️ Limitations

  • Single-label classification (some messages may contain multiple intents)
  • Edge cases may still overlap between categories
  • Domain-specific (focused on student tech environments)

🔮 Future Improvements

  • Multi-label classification
  • Larger dataset (2,000+ examples)
  • Confidence scoring
  • Integration with response generation models
  • Continuous retraining with real user data
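
To make the confidence-scoring item concrete, one common approach is to apply a softmax to the classifier's raw logits and withhold the label when the top probability is below a threshold. The sketch below uses only the standard library; the 0.7 threshold and the `predict_with_confidence` name are assumptions for illustration, not values from this project.

```python
import math
from typing import List, Optional, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(
    logits: Sequence[float], labels: Sequence[str], threshold: float = 0.7
) -> Tuple[Optional[str], float]:
    """Return (label, confidence); label is None when confidence < threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    label = labels[best] if confidence >= threshold else None
    return label, confidence
```

A `None` label can then be sent to a human or a clarifying question instead of being routed automatically.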

👤 Author

Created by Kingston Lewis as part of the Coding in Color program for the AI Dev team.


help-classifier-v2

This model is a fine-tuned version of distilbert-base-uncased on the King-8/help-request-messages-v2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0643

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 4
  • mixed_precision_training: Native AMP
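
The hyperparameters above map fairly directly onto `TrainingArguments` in the Transformers library. The sketch below is a reconstruction under stated assumptions, not the author's script: it assumes a single device (so `train_batch_size` equals the per-device batch size), assumes "Native AMP" meant fp16 rather than bf16, and uses parameter names as they exist in Transformers 4.x, which may differ in other releases.

```python
# Values listed above, keyed by the corresponding TrainingArguments parameter names.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,  # assumption: single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 4,
    "fp16": True,  # assumption: "Native AMP" was fp16; this needs a GPU
}


def build_training_args(output_dir: str = "help-classifier-v2"):
    """Create TrainingArguments from the values above (requires `transformers`)."""
    from transformers import TrainingArguments  # lazy import

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```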

Training results

| Training Loss | Epoch | Step | Validation Loss |
| --- | --- | --- | --- |
| 0.5524 | 1.0 | 88 | 0.5124 |
| 0.1114 | 2.0 | 176 | 0.1221 |
| 0.0324 | 3.0 | 264 | 0.0771 |
| 0.0249 | 4.0 | 352 | 0.0643 |
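
The Step column also allows a back-of-envelope estimate of how many training examples were seen per epoch. The arithmetic below assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of these is stated in the card.

```python
steps_per_epoch = 88  # from the Step column: 88, 176, 264, 352
batch_size = 8        # train_batch_size listed above

# N examples take ceil(N / batch_size) steps, so 88 steps means N lies
# between 87 full batches plus one example and 88 full batches.
max_examples = steps_per_epoch * batch_size
min_examples = (steps_per_epoch - 1) * batch_size + 1

print(min_examples, max_examples)  # 697 704
```

Under those assumptions, each epoch covered roughly 700 training examples.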

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Model size: 67M params (Safetensors, tensor type F32)