fahrizalfarid
akahana
12 followers · 50 following
AI & ML interests
NLP
Recent Activity
liked a dataset 24 days ago
lingbow/tiktok-video-engagement-1m
reacted to SeaWolf-AI's post with 🔥 3 months ago
🏟️ Smol AI WorldCup: A 4B Model Just Beat 8B. Here's the Data

We evaluated 18 small language models from 12 makers on 125 questions across 7 languages. The results challenge the assumption that bigger is always better.

Community Article: https://huggingface.co/blog/FINAL-Bench/smol-worldcup
Live Leaderboard: https://huggingface.co/spaces/ginigen-ai/smol-worldcup
Dataset: https://huggingface.co/datasets/ginigen-ai/smol-worldcup

What we found:
→ Gemma-3n-E4B (4B, 2GB RAM) outscores Qwen3-8B (8B, 5.5GB). Doubling parameters gained only 0.4 points. RAM cost: 2.75x more.
→ GPT-OSS-20B fits in 1.5GB yet matches Champions-league dense models requiring 8.5GB. MoE architecture is the edge AI game-changer.
→ Thinking models hurt structured output. DeepSeek-R1-7B scores 8.7 points below same-size Qwen3-8B and runs 2.7x slower.
→ A 1.3B model fabricates confident fake content 80% of the time when prompted with nonexistent entities. Qwen3 family hits 100% trap detection across all sizes.
→ Qwen3-1.7B (1.2GB) outscores Mistral-7B, Llama-3.1-8B, and DeepSeek-R1-14B. Latest architecture at 1.7B beats older architecture at 14B.

What makes this benchmark different?
Most benchmarks ask "how smart?"; we measure five axes simultaneously: Size, Honesty, Intelligence, Fast, Thrift (SHIFT). Our ranking metric WCS = sqrt(SHIFT x PIR_norm) rewards models that are both high-quality AND efficient. Smart but massive? Low rank. Tiny but poor? Also low.

Top 5 by WCS:
1. GPT-OSS-20B: WCS 82.6, 1.5GB, Raspberry Pi tier
2. Gemma-3n-E4B: WCS 81.8, 2.0GB, Smartphone tier
3. Llama-4-Scout: WCS 79.3, 240 tok/s, Fastest model
4. Qwen3-4B: WCS 76.6, 2.8GB, Smartphone tier
5. Qwen3-1.7B: WCS 76.1, 1.2GB, IoT tier

Built in collaboration with the FINAL Bench research team. Interoperable with ALL Bench Leaderboard for full small-to-large model comparison. Dataset is open under Apache 2.0 (125 questions, 7 languages). We welcome new model submissions.
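The post gives the ranking metric only as WCS = sqrt(SHIFT x PIR_norm), a geometric mean of two scores. The sketch below illustrates why a geometric mean penalizes lopsided models. It assumes both inputs sit on a 0 to 100 scale, which the post does not state, and the function name `wcs` and the sample values are hypothetical, not taken from the leaderboard.

```python
import math


def wcs(shift: float, pir_norm: float) -> float:
    """Geometric mean of two scores, per the formula quoted in the post.

    Assumption: both inputs are non-negative and on the same scale
    (0-100 here). The post does not define how either is computed.
    """
    if shift < 0 or pir_norm < 0:
        raise ValueError("scores must be non-negative")
    return math.sqrt(shift * pir_norm)


# Hypothetical inputs: same arithmetic mean (50), very different balance.
balanced = wcs(50.0, 50.0)   # both scores moderate
lopsided = wcs(90.0, 10.0)   # one score high, the other low

print(f"balanced: {balanced:.1f}")
print(f"lopsided: {lopsided:.1f}")
```

With these made-up inputs the balanced pair scores higher than the lopsided pair even though their arithmetic means are equal, which matches the post's claim that a model strong on one side and weak on the other ranks low.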
updated a dataset 3 months ago
akahana/wikipedia-id-conv
Organizations
None yet
akahana's activity
liked a dataset 24 days ago
lingbow/tiktok-video-engagement-1m • Updated 30 days ago • 29.1M • 320 • 4
liked a dataset 5 months ago
slone/nllb-200-10M-sample • Updated Nov 20, 2023 • 9.98M • 97 • 13
liked 2 models 6 months ago
facebook/omniASR-W2V-1B • Automatic Speech Recognition • Updated Nov 27, 2025 • 6
azale-ai/Starstreak-7b-beta • Text Generation • 7B • Updated Nov 19, 2023 • 13 • 5
liked a dataset 6 months ago
jakartaresearch/cerpen-corpus • Updated Nov 28, 2022 • 50 • 21 • 3
liked a model 6 months ago
unsloth/DeepSeek-R1-Distill-Llama-70B-GGUF • Text Generation • 71B • Updated May 10, 2025 • 25k • 114
liked 3 datasets 7 months ago
Sultannn/id_recipe • Updated Sep 18, 2022 • 15.6k • 125 • 2
omarkamali/wikipedia-monthly • Updated Mar 14 • 195M • 11.7k • 69
agufsamudra/tts-indo • Updated May 18, 2025 • 114k • 2.78k • 8
liked 2 models 7 months ago
Qwen/Qwen3-Embedding-8B-GGUF • 8B • Updated Jul 15, 2025 • 20.8k • 123
unsloth/gpt-oss-120b-unsloth-bnb-4bit • Text Generation • 117B • Updated Aug 8, 2025 • 4.69k • 15
liked a dataset about 1 year ago
tokyotech-llm/swallow-code • Updated Mar 1 • 129M • 1.18k • 65
liked a model about 1 year ago
reducto/RolmOCR • Image-Text-to-Text • 8B • Updated Apr 2, 2025 • 243k • 586
liked 5 datasets about 1 year ago
allenai/olmOCR-mix-0225 • Updated Feb 25, 2025 • 259k • 559 • 170
ronnieaban/sunnah • Updated Jan 13, 2025 • 38.1k • 72 • 2
FreedomIntelligence/medical-o1-reasoning-SFT • Updated Apr 22, 2025 • 90.1k • 6.41k • 1.11k
nvidia/Llama-Nemotron-Post-Training-Dataset • Updated May 8, 2025 • 3.91M • 4.84k • 669
Congliu/Chinese-DeepSeek-R1-Distill-data-110k • Updated Feb 21, 2025 • 110k • 874 • 757
liked 2 datasets over 1 year ago
hermanshid/doctor-id-qa • Updated Jul 9, 2023 • 6.33k • 82 • 6
biznetgio/indonesia-law-qa-embeddings • Updated Aug 6, 2024 • 7.17k • 34 • 5