ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning Paper • 2606.03503 • Published 9 days ago • 25
meta-llama/Llama-3.2-1B-Instruct Text Generation • 1B • Updated Oct 24, 2024 • 7.4M • 1.47k
Article KV Caching Explained: Optimizing Transformer Inference Efficiency not-lain • Jan 30, 2025 • 344