Posts tagged #quantization

Dense Activation-Fit Recovery: Healing Quantized Layers

May 16, 2026

How to recover dense performance from quantized layers using activation-fit artifacts and the recovery scripts in lean-mining.

#research #quantization #moe #deep-learning
The β-lift and FFN Transfer: MoE Compression Part E

Apr 29, 2026

Why β transfer in FFNs matters for quantization, plus the formal 'structure bonus' theorems in MoE compression.

#research #moe #quantization #lean
How to Honestly Test if a Neural Network Can Be Compressed

Apr 28, 2026

Pre-registration, trap cells, τ-hardened baselines, and kill-fast protocols: a field methodology for compression research that tries to kill its own ideas. With actual results from OLMoE-1B-7B.

#ml-research #compression #methodology #quantization #MoE
A Catalogue of Symmetries Compression Must Respect

Apr 26, 2026

Compression schemes regularly violate algebraic invariants of weight structure, producing models that pass perplexity checks but fail downstream. Here are the five core symmetry types that a formally verified survey is collecting.

#ml-research #compression #symmetry #quantization #RoPE
Part E Pivot: FFN Rotation and the Narrow-d Falsification

Apr 25, 2026

After the KV-cache gauge, the obvious next move was applying β-lift to FFN weights. We tested it. It failed. Here is what the RAdam convergence probe and the 1-bit generation test actually showed.

#ml-research #quantization #MoE #compression #MLP
Phase-Collapse Defragmentation: Why MoE KV-Cache Resists 1-bit Quantization

Apr 23, 2026

Attention head activations in Mixture-of-Experts models cluster around expert routing patterns. Quantizing the KV-cache destroys this signal. The MoEGauge framework builds provable bounds on exactly how much.

#ml-research #quantization #MoE #compression #KV-cache