Posts tagged #KV-cache

Phase-Collapse Defragmentation: Why MoE KV-Cache Resists 1-bit Quantization

Apr 23, 2026

Attention-head activations in Mixture-of-Experts models cluster around expert routing patterns. Quantizing the KV-cache destroys this signal. The MoEGauge framework derives provable bounds on exactly how much of it is lost.

#ml-research #quantization #MoE #compression #KV-cache