moonshotai/Kimi-K2.6 Image-Text-to-Text • 1.1T • Updated about 13 hours ago • 591k • 1.16k
moonshotai/Kimi-K2.5 Image-Text-to-Text • 1.1T • Updated about 12 hours ago • 4.39M • 2.77k
Chat with Kimi-VL-A3B-Thinking-2506 🤔 • Running on Zero • Agents • Featured • 199 • Chat with Kimi-VL: respond to text, images, video, PDFs
Kimi-VL-A3B Collection • Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 6 items • Updated Mar 2 • 79
SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems Paper • 2401.03945 • Published Jan 8, 2024
SpeechAlign: Aligning Speech Generation to Human Preferences Paper • 2404.05600 • Published Apr 8, 2024 • 1
UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model Paper • 2408.02503 • Published Aug 5, 2024
Advancing Fine-Grained Visual Understanding with Multi-Scale Alignment in Multi-Modal Models Paper • 2411.09691 • Published Nov 14, 2024
QCRD: Quality-guided Contrastive Rationale Distillation for Large Language Models Paper • 2405.13014 • Published May 14, 2024
LEGO:Language Enhanced Multi-modal Grounding Model Paper • 2401.06071 • Published Jan 11, 2024 • 12