Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets Paper • 2604.22294 • Published Apr 24 • 18
DavidAU/Qwen3.6-27B-NEO-CODE-Di-IMatrix-MAX-GGUF Image-Text-to-Text • 27B • Updated 15 days ago • 23.9k • 58
Gemma 4 WebGPU 🚀 Space • Running • Featured • 224 • Run Gemma 4 locally in-browser on WebGPU with Transformers.js
KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation Paper • 2604.08455 • Published Apr 9 • 47
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory Paper • 2410.10813 • Published Oct 14, 2024 • 16
Prefill and Decode for Concurrent Requests - Optimizing LLM Performance Article • tngtech • Apr 16, 2025 • 79