AutoCatalogAI CLIP Multi-Task Classifier

AutoCatalogAI is a fashion product attribute extraction model.

It predicts seven attributes for each product image:

['gender', 'masterCategory', 'subCategory', 'articleType', 'baseColour', 'season', 'usage']

Dataset

Dataset: ashraq/fashion-product-images-small

Split:

  • Train: 70%
  • Validation: 15%
  • Test: 15%
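
The 70/15/15 split above could be reproduced along the following lines. This is a minimal sketch, not the project's actual code: the function name `split_indices`, the seed, and the use of a plain random shuffle (rather than, say, a stratified split) are all assumptions.

```python
import random


def split_indices(n_items, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test lists.

    Hypothetical helper: the seed and the unstratified shuffle are assumptions.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)

    n_train = round(n_items * train_frac)
    n_val = round(n_items * val_frac)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains, roughly 15%
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

The returned index lists can then be used to select rows from the dataset, so the three subsets never overlap.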

Base Model

openai/clip-vit-base-patch32

Architecture

CLIP image encoder + multiple classification heads.
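
As a rough illustration of this layout, the sketch below puts one linear head per attribute on top of a shared image embedding. It is not the AutoCatalogAI implementation: the class name `MultiTaskHeads`, the use of single linear layers, the class counts, and the 512-dimensional input (the projected image embedding size of openai/clip-vit-base-patch32) are assumptions, and a random tensor stands in for real CLIP features.

```python
import torch
from torch import nn

ATTRIBUTES = ["gender", "masterCategory", "subCategory", "articleType",
              "baseColour", "season", "usage"]


class MultiTaskHeads(nn.Module):
    """One linear classification head per attribute over a shared embedding.

    Hypothetical sketch; the real head design lives in the project code.
    """

    def __init__(self, embed_dim, num_classes):
        super().__init__()
        self.heads = nn.ModuleDict(
            {name: nn.Linear(embed_dim, n) for name, n in num_classes.items()}
        )

    def forward(self, image_embeds):
        # Returns a dict of logits keyed by attribute name.
        return {name: head(image_embeds) for name, head in self.heads.items()}


# Placeholder class counts; the real ones would come from label_maps.json.
num_classes = {name: 5 for name in ATTRIBUTES}
heads = MultiTaskHeads(embed_dim=512, num_classes=num_classes)

# Random tensor standing in for CLIP image embeddings of a batch of 4 images.
fake_embeds = torch.randn(4, 512)
logits = heads(fake_embeds)
print({name: tuple(out.shape) for name, out in logits.items()})
```

In a real pipeline the embeddings would come from the CLIP image encoder, for example via `CLIPModel.get_image_features` in the transformers library, and training would typically sum a cross-entropy loss over the heads.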

Overall Test Metrics

{
  "average_accuracy": 0.8335026038852995,
  "average_macro_f1": 0.6568447666058456,
  "average_weighted_f1": 0.8421526081109854,
  "average_top3_accuracy": 0.9711087581303888,
  "exact_match_accuracy": 0.27938284677053393,
  "avg_inference_time_ms_per_image": 1.5960319117829298,
  "test_samples": 6611
}
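
The gap between average_accuracy and exact_match_accuracy is expected: exact match only counts an image as correct when all seven attributes are right. The sketch below shows one plausible way to compute both on toy data. The function names and the toy labels are hypothetical, and it assumes average_accuracy is the mean of the per-attribute accuracies, which the metrics file does not confirm.

```python
def per_attribute_accuracy(y_true, y_pred, attributes):
    """Fraction of samples predicted correctly, separately for each attribute."""
    n = len(y_true)
    return {
        attr: sum(t[attr] == p[attr] for t, p in zip(y_true, y_pred)) / n
        for attr in attributes
    }


def exact_match_accuracy(y_true, y_pred, attributes):
    """Fraction of samples where every attribute is predicted correctly."""
    hits = sum(
        all(t[attr] == p[attr] for attr in attributes)
        for t, p in zip(y_true, y_pred)
    )
    return hits / len(y_true)


# Toy data with three attributes, purely illustrative.
ATTRS = ["gender", "season", "usage"]
y_true = [
    {"gender": "Men", "season": "Summer", "usage": "Casual"},
    {"gender": "Women", "season": "Winter", "usage": "Formal"},
    {"gender": "Men", "season": "Fall", "usage": "Sports"},
    {"gender": "Women", "season": "Summer", "usage": "Casual"},
]
y_pred = [
    {"gender": "Men", "season": "Summer", "usage": "Casual"},    # all correct
    {"gender": "Women", "season": "Summer", "usage": "Formal"},  # season wrong
    {"gender": "Men", "season": "Fall", "usage": "Casual"},      # usage wrong
    {"gender": "Men", "season": "Summer", "usage": "Casual"},    # gender wrong
]

acc = per_attribute_accuracy(y_true, y_pred, ATTRS)
avg_acc = sum(acc.values()) / len(acc)
exact = exact_match_accuracy(y_true, y_pred, ATTRS)
print(avg_acc, exact)

# Back-of-envelope check: if the seven attributes erred independently at the
# reported average rate, exact match would be about average_accuracy ** 7.
independent_estimate = 0.8335026038852995 ** 7
print(independent_estimate)
```

In the toy data each attribute is right three times out of four, yet only one sample in four is fully correct. The independence estimate is only a rough sanity check, since real attribute errors are usually correlated and per-attribute accuracies differ.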

Important

This model should be loaded with the AutoCatalogAI project code because it contains custom multi-task classifier heads.

Expected files:

  • model.pt
  • config.json
  • label_maps.json
  • metrics.json
  • README.md
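
Before loading the model with the project code, it can help to confirm that a downloaded directory contains the files listed above. The helpers below are a hypothetical sketch using only the Python standard library; they are not part of AutoCatalogAI, and the structure of the JSON files is not assumed beyond their being valid JSON.

```python
import json
from pathlib import Path

EXPECTED_FILES = ["model.pt", "config.json", "label_maps.json",
                  "metrics.json", "README.md"]


def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]


def load_json_artifacts(model_dir):
    """Load the three JSON files into a dict keyed by file name."""
    model_dir = Path(model_dir)
    artifacts = {}
    for name in ("config.json", "label_maps.json", "metrics.json"):
        with open(model_dir / name, encoding="utf-8") as fh:
            artifacts[name] = json.load(fh)
    return artifacts


# Lists whichever expected files are not present in the current directory.
print(missing_files("."))
```

The weights in model.pt are left to the project code, since rebuilding the custom heads requires its model class.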