How to use microsoft/xclip-large-patch14 with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("video-classification", model="microsoft/xclip-large-patch14")
```

```python
# Load model directly (X-CLIP pairs a processor for frames and text with the base model class)
from transformers import AutoProcessor, AutoModel

processor = AutoProcessor.from_pretrained("microsoft/xclip-large-patch14")
model = AutoModel.from_pretrained("microsoft/xclip-large-patch14")
```
Update README.md (#3)
Opened by sanderjmelo
README.md CHANGED

```diff
@@ -1,5 +1,4 @@
 ---
-language: en
 license: mit
 tags:
 - vision
@@ -58,4 +57,4 @@ During validation, one resizes the shorter edge of each frame, after which cente
 
 ## Evaluation results
 
-This model achieves a top-1 accuracy of 87.1% and a top-5 accuracy of 97.6%.
+This model achieves a top-1 accuracy of 87.1% and a top-5 accuracy of 97.6%.
```
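The evaluation line in the diff reports top-1 and top-5 accuracy. As a reminder of what those metrics mean, here is a minimal sketch of how a top-k accuracy is typically computed from per-clip class scores. The function name `top_k_accuracy` and the toy scores and labels are hypothetical illustrations, not taken from the X-CLIP evaluation code, and the sketch says nothing about how the 87.1% and 97.6% figures were actually produced.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("need at least one sample")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Hypothetical scores for 4 clips over 5 classes (illustrative only).
scores = [
    [0.10, 0.60, 0.10, 0.10, 0.10],  # true label 1: ranked first
    [0.40, 0.30, 0.20, 0.05, 0.05],  # true label 1: ranked second
    [0.05, 0.05, 0.10, 0.30, 0.50],  # true label 4: ranked first
    [0.50, 0.20, 0.15, 0.10, 0.05],  # true label 4: ranked last
]
labels = [1, 1, 4, 4]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}")
```

With a real model, each row of `scores` would be the per-class logits or probabilities for one video clip, and top-5 accuracy corresponds to `k=5` over the full label set.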