Salma Mayorquin PRO

salma-remyx

https://remyx.ai

smellslikeml

AI & ML interests

None yet

Recent Activity

reacted to sergiopaniego's post with 🔥 about 4 hours ago

OpenEnv has a new home: github.com/huggingface/OpenEnv

Starting today, it's coordinated by a committee that includes Meta-PyTorch, Reflection, Unsloth, Modal, Prime Intellect, Nvidia, Mercor, Fleet AI, and Hugging Face.

Frontier labs train their models and their harnesses together. Claude knows Claude Code. GPT-5.5 knows Codex. That's not an accident, it's training. Open-source models deserve the same magic, but pulling that off requires infrastructure that belongs to everyone, not one lab.

OpenEnv is that layer: one API, any harness, any trainer, any environment. Rewards and training loops stay in TRL, Unsloth, wherever you already work. OpenEnv is the socket they all plug into.

Get involved! Full announcement: https://huggingface.co/blog/openenv-agentic-rl

reacted to pbhappliedsystems's post with 🔥 about 4 hours ago

🚀 **New flagship dataset, and an argument about what a dataset card should be.**

Most synthetic datasets on the Hub ship row counts, a license, and little else: pipeline opaque, rejection criteria unstated, compliance unaudited. We published the opposite.

**SynthEval Cloud: Regulated-Domain Synthetic Instruction Dataset**
👉 https://huggingface.co/datasets/pbhappliedsystems/syntheval-cloud-regulated-instruct-1k

**1,116** quality-gated instruction records across **7 regulated domains** (medical, legal, GDPR, privacy, education, e-commerce, transport). Every record cleared a documented cascade, not a vibe check:

- 🧪 **Dual-signal hallucination gate**: rejects only when embedding cosine *and* keyword-overlap both fail; a low score alone never rejects.
- 🔒 **Layered PII masking + independent leak audit**: a separate over-reporting scanner found **0.0% residual leak** across all 1,116 records.
- 📊 **Whole-corpus evaluation, not a sample**: MATTR **0.769**, mean cosine **0.73**, **0%** near-duplicates, **96.9%** yield.
- 🧾 **The 36 rejections ship too**, each tagged with its failing gate. Removal at the gate is the product; we show our work.

Every number on the card is a field in the `evaluation_report.json` shipped beside the data, with full methodology + provenance (Mistral-Nemo AWQ W4A16 · vLLM 0.8.5.post1 · Modal A10G).

One release from **SynthEval**: Studio (local GPU) + Cloud (Modal+vLLM), proving quality parity across substrates.

📄 Whitepaper: https://pbhappliedsystems.com/SynthEval_Studio_and_Cloud_Quality-Gated_Synthetic_Data_Generation.pdf
🔎 Overview: https://pbhappliedsystems.com/synthetic-data.html

**CC BY 4.0**: commercial use welcome, just credit it. Need defensible synthetic data at scale? Let's talk.

Patrick Hill, PBH Applied Systems
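The dual-signal gate in the post above rejects a record only when both signals fail together. The post does not publish the implementation, so the following is a minimal sketch of that decision rule only. The function names, the whitespace tokenization, and both threshold values are hypothetical and are not taken from SynthEval.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as "no similarity"
    return dot / (norm_a * norm_b)


def keyword_overlap(reference, candidate):
    """Fraction of the candidate's distinct words that also appear in the reference.

    Assumption: plain lowercase whitespace tokenization stands in for
    whatever keyword extraction the real pipeline uses.
    """
    ref_words = set(reference.lower().split())
    cand_words = set(candidate.lower().split())
    if not cand_words:
        return 0.0
    return len(ref_words & cand_words) / len(cand_words)


def should_reject(cosine, overlap, cosine_min=0.5, overlap_min=0.2):
    """Reject only when BOTH signals fall below their thresholds.

    The default thresholds are hypothetical placeholders.
    """
    return cosine < cosine_min and overlap < overlap_min
```

Under these assumed thresholds, `should_reject(0.1, 0.9)` is `False` because keyword overlap still passes, while `should_reject(0.1, 0.05)` is `True` because both signals fail. Requiring agreement between the two signals is what makes "a low score alone never rejects" hold.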

liked a model 3 days ago

remyxai/dockergen-0.5b

Organizations

salma-remyx's collections (2)