Posts tagged #moe
-
Dense Activation-Fit Recovery: Healing Quantized Layers
How to recover dense performance from quantized layers using activation-fit artifacts and the recovery scripts in lean-mining.
-
The Impossibility of Key-Only Routing: An Architectural Boundary
Formalizing why MoE routers must depend on residual-stream state rather than key-only summaries.
-
The β-lift and FFN Transfer: MoE Compression Part E
Why β transfer in FFNs matters for quantization, and a look at the formal 'structure bonus' theorems in MoE compression.