AutoCatalogAI CLIP Multi-Task Classifier

AutoCatalogAI is a fashion product attribute extraction model.

It predicts seven attributes for each product image:

['gender', 'masterCategory', 'subCategory', 'articleType', 'baseColour', 'season', 'usage']

Dataset

Dataset: ashraq/fashion-product-images-small

Split:

Train: 70%
Validation: 15%
Test: 15%
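
The card does not say how the split was drawn (random or stratified, seeded or not). As a minimal sketch, assuming a plain seeded random shuffle, a 70/15/15 split over sample indices could look like this. The function name `split_indices` and the seed are illustrative, not taken from the project code.

```python
import random


def split_indices(n, seed=0, train=0.70, val=0.15):
    """Shuffle range(n) and cut it into train/validation/test index lists.

    Assumption: a simple random split; the project may split differently.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_train = round(n * train)
    n_val = round(n * val)
    # Whatever remains after train and validation becomes the test set.
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],
    )


train_idx, val_idx, test_idx = split_indices(1000)
```

Giving the remainder to the test set keeps the three parts disjoint and exhaustive even when the percentages do not divide the dataset size evenly.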

Base Model

openai/clip-vit-base-patch32

Architecture

A shared CLIP image encoder followed by multiple classification heads.
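
The following PyTorch sketch illustrates the general shape of such a model: one shared encoder and one linear head per attribute. It is an assumption about the layout, not the project's actual code. The class name `MultiTaskHeads`, the toy encoder, and the class counts are hypothetical; in practice the encoder would presumably be the vision part of openai/clip-vit-base-patch32.

```python
import torch
from torch import nn


class MultiTaskHeads(nn.Module):
    """Shared image encoder followed by one linear head per attribute."""

    def __init__(self, encoder, embed_dim, num_classes):
        super().__init__()
        self.encoder = encoder
        # One independent classifier per task, keyed by attribute name.
        self.heads = nn.ModuleDict(
            {task: nn.Linear(embed_dim, n) for task, n in num_classes.items()}
        )

    def forward(self, pixel_values):
        features = self.encoder(pixel_values)  # shape: (batch, embed_dim)
        return {task: head(features) for task, head in self.heads.items()}


# Stand-in encoder so the sketch runs without downloading CLIP weights.
toy_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 16))

# Hypothetical class counts; the real ones would come from label_maps.json.
model = MultiTaskHeads(toy_encoder, embed_dim=16, num_classes={"gender": 5, "season": 4})
logits = model(torch.zeros(2, 3, 8, 8))  # dict of per-task logit tensors
```

Because every head reads the same features, the encoder runs once per image regardless of how many attributes are predicted.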

Overall Test Metrics

{
  "average_accuracy": 0.8335026038852995,
  "average_macro_f1": 0.6568447666058456,
  "average_weighted_f1": 0.8421526081109854,
  "average_top3_accuracy": 0.9711087581303888,
  "exact_match_accuracy": 0.27938284677053393,
  "avg_inference_time_ms_per_image": 1.5960319117829298,
  "test_samples": 6611
}
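
The card does not define these metrics. A plausible reading is that the "average" values are means over the seven attributes, while exact match counts a sample only when every attribute is predicted correctly. The sketch below illustrates that reading on toy data with two attributes; the function names and labels are illustrative.

```python
def average_accuracy(y_true, y_pred, tasks):
    """Mean of the per-attribute accuracies."""
    per_task = [
        sum(t[task] == p[task] for t, p in zip(y_true, y_pred)) / len(y_true)
        for task in tasks
    ]
    return sum(per_task) / len(per_task)


def exact_match_accuracy(y_true, y_pred, tasks):
    """Fraction of samples where every attribute is predicted correctly."""
    hits = sum(
        all(t[task] == p[task] for task in tasks)
        for t, p in zip(y_true, y_pred)
    )
    return hits / len(y_true)


tasks = ["gender", "season"]
y_true = [
    {"gender": "Men", "season": "Summer"},
    {"gender": "Women", "season": "Winter"},
    {"gender": "Men", "season": "Fall"},
    {"gender": "Women", "season": "Summer"},
]
y_pred = [
    {"gender": "Men", "season": "Summer"},    # both correct
    {"gender": "Women", "season": "Summer"},  # season wrong
    {"gender": "Women", "season": "Fall"},    # gender wrong
    {"gender": "Women", "season": "Summer"},  # both correct
]

avg_acc = average_accuracy(y_true, y_pred, tasks)    # 0.75
exact = exact_match_accuracy(y_true, y_pred, tasks)  # 0.5
```

Under this reading, exact match is never higher than the lowest per-attribute accuracy, which is consistent with the gap between the two figures reported above.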

Important

Load this model with the AutoCatalogAI project code: the checkpoint includes custom multi-task classifier heads that are not part of the base CLIP model.

Expected files:

model.pt
config.json
label_maps.json
metrics.json
README.md
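
Before loading, it can help to confirm that a downloaded copy contains everything listed above. This is a small standard-library sketch; the helper `missing_files` is hypothetical and not part of the AutoCatalogAI code, and the temporary directory only stands in for a real download location.

```python
import tempfile
from pathlib import Path

EXPECTED_FILES = ["model.pt", "config.json", "label_maps.json", "metrics.json", "README.md"]


def missing_files(model_dir):
    """Return the expected files that are absent from model_dir, in list order."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Demonstration on a temporary directory holding only two of the five files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("model.pt", "config.json"):
        (Path(tmp) / name).touch()
    missing = missing_files(tmp)
```

An empty result means the directory is complete; anything else names the files to fetch again.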
