All HF Hub posts

pankajpandey-dev 
posted an update 2 days ago
🇮🇳 New in my Hindi LLM Series: Gemma-4 E4B, fine-tuned for Hindi — and it runs on your laptop's CPU.
I fine-tuned Google's new Gemma-4 E4B on ~10k Hindi instruction pairs (AI4Bharat: anudesh + dolly) using Unsloth + LoRA, on a single L4 GPU.
Then I ran an honest side-by-side eval: base Gemma-4 vs my fine-tune, across 25 Hindi prompts. The results were interesting 👇
✅ My fine-tune is more concise — ask for "3 tips" and it gives exactly 3. Base writes a 1,200-character essay.

✅ Pure native Hindi — base keeps slipping into English ("संतुलित आहार (Eat a Balanced Diet)", "तारा (Star)"). My fine-tune stays in clean Hindi.

✅ Tighter instruction-following — ask for a "short message" and it gives one, not a menu of options.
⚖️ And to be honest: base Gemma-4 is more detailed and comprehensive. I didn't build a "smarter" model — I built a focused, Hindi-native, edge-friendly one that runs as a 5GB GGUF (Q4) on CPU.
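The "slipping into English" observation can be spot-checked mechanically. The following is a minimal sketch, not the evaluation used in this post: it counts Devanagari code points against ASCII letters in a response. The function names are hypothetical, and a real eval would also need to handle digits, transliteration, and legitimate loanwords.

```python
def script_counts(text: str) -> dict:
    """Count Devanagari code points and ASCII (Latin) letters in a string."""
    counts = {"devanagari": 0, "latin": 0}
    for ch in text:
        if "\u0900" <= ch <= "\u097F":        # Unicode Devanagari block
            counts["devanagari"] += 1
        elif ch.isascii() and ch.isalpha():   # a-z, A-Z
            counts["latin"] += 1
    return counts


def latin_ratio(text: str) -> float:
    """Fraction of counted characters that are Latin; 0.0 means pure Devanagari."""
    c = script_counts(text)
    total = c["devanagari"] + c["latin"]
    return c["latin"] / total if total else 0.0


if __name__ == "__main__":
    # Two of the examples quoted above, with and without the English gloss.
    for sample in ["तारा", "तारा (Star)", "संतुलित आहार (Eat a Balanced Diet)"]:
        print(f"{sample!r}: latin_ratio = {latin_ratio(sample):.2f}")
```

Averaging this ratio over the 25 prompts for each model would give one rough number to compare, under the assumption that any Latin text counts as slippage.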
🔗 Try it:

Live demo (CPU): pankajpandey-dev/gemma-4-e4b-hindi-demo
GGUF (Ollama/llama.cpp): pankajpandey-dev/gemma-4-e4b-hindi-instruct-GGUF
16-bit model: pankajpandey-dev/gemma-4-e4b-hindi-instruct

Built with @unsloth · Data by @ai4bharat 🙏
#Hindi #LLM #Gemma #Unsloth #IndicNLP #GGUF
  • 12 replies
·
AxionLab-official 
posted an update about 10 hours ago
⚠️ Community Notice

We would like to clarify that SupraLabs has no affiliation, partnership, or connection whatsoever with "SupraLarps" or its members.

Please avoid interacting with their organization, repositories, or Spaces under the assumption that they are associated with us.

We are currently aware of the situation and have already contacted the appropriate channels to address it.

Thank you to everyone who continues to support SupraLabs. ❤️
  • 1 reply
·
projectlosangeles 
posted an update 1 day ago
🔥 If you love multi-modal art 🎨, please check out the "A Million Little Fibers" project! 🔥

https://github.com/asigalov61/A-Million-Little-Fibers-2026

https://soundcloud.com/aleksandr-sigalov-61/sets/a-million-little-fibers

This brand new 2026 edition covers three SOTA models:

zai-org/GLM-5.2
k2-fsa/OmniVoice
HeartMuLa/HeartMuLa-oss-3B-happy-new-year

The project aims to showcase what kind of multi-modal art is now possible to create with these amazing OSS resources!

If you enjoyed the project, please ⭐ or 🔱 the GitHub repo and ❤️ it on SoundCloud and Hugging Face. It really helps!

Most sincerely,

Alex

Project Los Angeles
Tegridy Code 2026

P.S. Don't forget to bring a towel! 😂

@multimodalart
@victor
@John6666
  • 3 replies
·
fffiloni 
posted an update 3 days ago
A few weeks ago, @victor opened the door: coding agents can now ship Hugging Face Spaces autonomously.

I pulled on that thread.

As someone who builds and ships Gradio demos regularly, I didn’t just want to reproduce the loop. I wanted to see what happens when that loop is plugged into the whole Hugging Face stack.

The interesting part is not only that an agent can ship a Space.

It’s what happens when Space generation becomes a first-class Hugging Face workflow.

That became Agentic Space Factory.

More soon. 🤗
  • 1 reply
·
Banaxi-Tech 
posted an update 1 day ago
Hello AI Community! 👋
We are currently training a new AI model.
We are training it on 27B tokens and are about 8% done.
Follow us to be notified when it is released 🚀
Some info:
Parameters: 75M
GPU: RTX Pro 6000
We expect to release it in the coming days.

EDIT: We are now at step 40,000/274,000. Expect a preview model at around step 100K.
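The numbers in this post allow a quick back-of-the-envelope check. The sketch below assumes, which the post does not state, that the 274,000 steps cover the full 27B tokens at a constant number of tokens per step, and that "100K" refers to a step count.

```python
# Figures taken from the post; the constant tokens-per-step schedule is an assumption.
TOTAL_TOKENS = 27e9
TOTAL_STEPS = 274_000
CURRENT_STEP = 40_000
PREVIEW_STEP = 100_000


def tokens_per_step(total_tokens: float, total_steps: int) -> float:
    """Average tokens consumed per optimizer step."""
    return total_tokens / total_steps


def progress(step: int, total_steps: int) -> float:
    """Fraction of the planned run that is complete at a given step."""
    return step / total_steps


if __name__ == "__main__":
    tps = tokens_per_step(TOTAL_TOKENS, TOTAL_STEPS)
    print(f"tokens per step: {tps:,.0f}")                        # 98,540
    for step in (CURRENT_STEP, PREVIEW_STEP):
        seen = tps * step
        print(f"step {step:,}: {progress(step, TOTAL_STEPS):.1%} done, "
              f"{seen / 1e9:.2f}B tokens seen")
    # step 40,000: 14.6% done, 3.94B tokens seen
    # step 100,000: 36.5% done, 9.85B tokens seen
```

Under these assumptions, the preview checkpoint would land a little past one third of the planned token budget.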
  • 7 replies
·
ST-x-Tony 
posted an update 3 days ago
Hello AI Community! 👋

We are thrilled to announce the release of **NRS_QWEN_MYTHOS_1M**, a high-performance reasoning model built on the powerful **Qwen 3.5 9B** base. At **SKT AI LABS**, we’ve applied our proprietary **Neural Reasoning System (NRS)** to push the boundaries of what a 9B model can do.

🔥 **Why this model is a Game-Changer:**

✅ **100x High Reasoning Capacity:** Deep logical thinking and complex problem-solving via NRS Boosting.
✅ **1 Million Token Context:** Handle massive codebases, long documents, and multi-turn agentic tasks with ease (YaRN Scaling).
✅ **Advanced Thinking Mode:** Native tags for step-by-step Chain-of-Thought reasoning.
✅ **Tool-Use Ready:** Optimized for Python execution and Web Search with self-correction.
✅ **Blazing Fast:** Efficient 9B architecture that runs smoothly on consumer hardware (RTX 3090/4090).

🛠️ **Technical Highlights:**
* **Base:** Qwen 3.5 9B
* **Tuning:** NRS-specific tuning on high-quality samples.
* **License:** NRS DOCS
Whether you are a developer building coding agents, a researcher dealing with long-context data, or just someone who loves deep reasoning, this model is built for you.

👇 **Try it now on Hugging Face:**
SKT-NRS/NRS_QWEN_MYTHOS_1M
  • 1 reply
·
SeaWolf-AI 
posted an update 5 minutes ago
🐯 Chitos — The Security Scanner That Actually Proves It

Most security scanners hand you a suspect list and walk away. That gap between detection and proof is where attackers live — and it's exactly the gap that Chitos was built to close.

Chitos is the successor to Mythos, a static analyzer built for quick code health checks. Mythos was good at pattern matching — spotting dangerous sinks, mapping CWEs, producing readable reports. But static analysis has a structural ceiling. A rule that sees eval(user_input) can tell you that looks dangerous. It cannot tell you whether the input is reachable, whether sanitization three layers up covers this path, or whether there's a live exploit chain for your exact framework version. Chitos was built to answer those questions.
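To make the "structural ceiling" concrete, here is a minimal sketch of the kind of pattern rule described above, written with Python's standard `ast` module. It is an illustration only, not the Mythos or Chitos implementation, and the function name is hypothetical. The rule flags `eval(...)` calls whose first argument is not a literal, and that is all it can say: it knows nothing about reachability or sanitization.

```python
import ast


def find_eval_sinks(source: str) -> list:
    """Return (line, code) pairs for eval() calls with a non-literal first argument."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "eval"
            and node.args
            and not isinstance(node.args[0], ast.Constant)  # skip eval("1 + 1")
        ):
            findings.append((node.lineno, ast.unparse(node)))  # Python 3.9+
    return sorted(findings)


if __name__ == "__main__":
    sample = (
        "def handler(user_input):\n"
        "    safe = eval('1 + 1')\n"       # literal argument: not flagged
        "    return eval(user_input)\n"    # flagged, but is it reachable?
    )
    for line, code in find_eval_sinks(sample):
        print(f"line {line}: possible injection sink: {code}")
```

Both a harmless internal call and an exploitable one produce the same finding here, which is the gap between detection and proof that the post describes.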

🔍 Phase 1 applies 50 language-agnostic rules across Python, JavaScript, Go, Java, C/C++, Rust, PHP, YAML and more — covering injection sinks, deserialization gadgets, credential leakage, broken crypto, and prototype pollution. Every candidate is re-verified before reaching the report. Findings that can't be substantiated are excluded, not handed to you as noise.

🔬 Phase 2 dispatches an autonomous web-search agent to hunt live CVE databases, exploit advisories, and public PoC repositories. It formulates hypotheses, verifies them, and synthesizes a structured threat narrative. This phase needs a user-supplied Claude API key — Phases 1 and 3 run entirely free.

🎯 Phase 3 is where Chitos diverges from everything else. Against targets you own or are authorized to test, it fires real payloads — XSS, SQLi, path traversal, command injection — mutates on block, captures hard evidence, and connects every proven finding into a kill-chain showing which vulnerabilities to remediate first.

No installation. No account. No code sent to third-party APIs.

Article: https://huggingface.co/blog/FINAL-Bench/chitos

Try it now 👉 https://chitos.vidraft.net
Bc-AI 
posted an update 3 days ago
# 🔥 Nova-1 Beta: Test Our New LLMs!

**Smilyai Labs** is building **Nova-1** — open-source LLMs with novel architectures. Join our beta program!

## 🎯 Available Now:

**Nova-1-Standard (1.2B)** — Phase 2 of pretraining in progress
- PPL 13.5 (beats GPT-2 Large!)
- 48K tok/s on consumer GPUs
- Great for code, reasoning, edge deployment

**Nova-1-Large (3.5B)** — Training live RIGHT NOW
- Current: 30.9 PPL, improving fast, loss at 3.5 right now
- Will finish with ~1.7B tokens today
- Better reasoning & longer context

**Nova-1-XL (10B MoE)** — Coming soon (We don't know yet! haha)
- Final specs not decided yet
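A note for readers comparing the PPL and loss figures above: perplexity is the exponential of the mean per-token cross-entropy. The sketch below assumes, which the post does not state, that the reported loss is in nats and is measured on the same tokens as the perplexity.

```python
import math


def perplexity(mean_loss_nats: float) -> float:
    """Perplexity implied by a mean per-token cross-entropy loss (in nats)."""
    return math.exp(mean_loss_nats)


def loss_from_perplexity(ppl: float) -> float:
    """Mean per-token cross-entropy (in nats) implied by a perplexity."""
    return math.log(ppl)


if __name__ == "__main__":
    print(f"loss 3.5  -> PPL  {perplexity(3.5):.1f}")             # 33.1
    print(f"PPL 30.9  -> loss {loss_from_perplexity(30.9):.2f}")  # 3.43
    print(f"PPL 13.5  -> loss {loss_from_perplexity(13.5):.2f}")  # 2.60
```

Under that assumption, "30.9 PPL" and "loss at 3.5" are roughly consistent, and the small gap could come from rounding or from measuring on different batches.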


## What Makes Nova Special?

✨ **Mixture of Depths (MoD)** — Routes tokens dynamically, 30% faster
✨ **Grouped Query Attention** — Efficient like LLaMA 2/3
✨ **Phased Training** — Fresh 1B tokens each phase (no overfitting!)
✨ **RoPE** — Context extendable to 8K+
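For readers new to Mixture of Depths, the routing idea can be sketched in a few lines of plain Python. This is a simplified illustration, not Nova's implementation: a router assigns each token a score, only the top-scoring fraction (the hypothetical `capacity` parameter) passes through the expensive block, and the remaining tokens skip it via the residual path. The function name and the toy block are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def mod_layer(
    tokens: Sequence[Vector],
    router_scores: Sequence[float],
    capacity: float,
    block: Callable[[Vector], Vector],
) -> Tuple[List[Vector], List[int]]:
    """Apply `block` only to the top-`capacity` fraction of tokens by router score."""
    k = max(1, int(len(tokens) * capacity))
    ranked = sorted(range(len(tokens)), key=lambda i: router_scores[i], reverse=True)
    selected = set(ranked[:k])

    out = []
    for i, x in enumerate(tokens):
        if i in selected:
            update = block(x)
            # Residual update scaled by the router score for this token.
            out.append([xi + router_scores[i] * ui for xi, ui in zip(x, update)])
        else:
            out.append(list(x))  # token skips the block unchanged
    return out, sorted(selected)


if __name__ == "__main__":
    toks = [[1.0], [2.0], [3.0], [4.0]]
    scores = [0.1, 0.9, 0.5, 0.2]
    double = lambda v: [2.0 * a for a in v]  # stand-in for attention + MLP
    result, routed = mod_layer(toks, scores, capacity=0.5, block=double)
    print("routed token indices:", routed)
    print("outputs:", result)
```

With `capacity=0.5`, the block runs on half of the tokens, which is where the compute savings come from. The actual speedup depends on the model and hardware.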

## 🤝 Join Beta Testing:

👉 **Smilyai-labs-beta-testers**


Get early access, shape the roadmap, and help build transparent open-source AI!

  • 1 reply
·
Passpass119 
posted an update 4 days ago
I am excited to announce that I have nothing to announce
mmhamdy 
posted an update 4 days ago
It has been more than a decade now since the knowledge distillation paper came out.

Knowledge Distillation (KD) is one of my favorite topics, but I have to confess that I'm not a huge fan of the term because I find it confusing (or at least, it has become so over time).

The idea behind KD is not novel; it was there almost a decade before the paper came out (and arguably even a decade before that, back to 1990-91). But this paper is the one that clicked, the one that made the topic much more popular and introduced it to a broader audience.

First, the timing and the authors played a big role: we have Geoffrey Hinton, Oriol Vinyals, and Jeff Dean here. And second, Geoffrey Hinton is really good at idea branding: Model compression?! No, no, no! Let's call it "Knowledge Distillation" and use evocative terms such as "Dark Knowledge" to describe what is being transferred.

It's a great name, but as time has passed, the term has become a bit of a relic. KD is no longer solely about compression (it used to be introduced as a method for model compression, but compression is now just one application of KD). The other issue is that the word "distillation" implies some sort of potency, as if the student were somehow more powerful than the teacher, which is not the case (though many counterarguments could be made: for example, the student can be more powerful than another model trained with no teacher).
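For anyone who has not seen the mechanics, here is a minimal sketch of the soft-target part of the loss, in plain Python. Teacher and student logits are softened with a temperature, and the student is trained to match the teacher's softened distribution. The function names are mine, and in practice this term is usually combined with an ordinary cross-entropy on the true labels.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax; higher temperature gives a flatter distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [6.0, 2.0, 1.0]
    student = [4.0, 3.0, 0.5]
    print("teacher, T=1:", [round(v, 3) for v in softmax(teacher, 1.0)])
    print("teacher, T=4:", [round(v, 3) for v in softmax(teacher, 4.0)])
    print("KD loss (T=2):", round(kd_loss(student, teacher), 4))
```

The flatter distribution at higher temperature is where the "dark knowledge" lives: the relative probabilities the teacher assigns to the wrong classes.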

Nevertheless, the paper is incredibly well-written, short, and fun to read. It's one of the few papers that I have read several times. Check it out, and maybe share your thoughts on the topic with us here!

If you had to choose another name for Knowledge Distillation, what would it be?

  • 2 replies
·