Asymmetric Compliance Damage: The Cost of Isolation


One of the most surprising findings in the Phase 5 results wasn’t a success, but an Informative Failure. We call it Asymmetric Compliance Damage.

This post covers what happens when you apply a fixed π/8 rotation to the role channel, and why it selectively “kills” the trusted instruction stream while leaving the untrusted data stream intact.


The π/8 Arm: A Selective Collapse

In the Counterfactual V2 experiments, the vanilla model reached an instruction-compliance of 0.155. We expected the π/8 arm (W&B: y0033rou) to either improve this or show a general utility loss across the board.

The Verdict: The damage was asymmetric.

  • INSTRUCTION slot: Compliance collapsed from 0.155 to 0.020.
  • DATA slot: Performance was essentially unchanged (0.290 to 0.295).
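
To put the two slots on the same footing, the reported before/after figures can be turned into relative changes. The snippet below is only a sanity-check sketch: the numbers are the ones quoted above, while the helper name `relative_change` and the `slots` dictionary are illustrative and not part of the experiment code.

```python
def relative_change(before: float, after: float) -> float:
    """Fractional change from `before` to `after` (-0.5 means the value halved)."""
    return (after - before) / before


# (vanilla, pi/8 arm) figures as reported in this post.
slots = {
    "INSTRUCTION": (0.155, 0.020),
    "DATA": (0.290, 0.295),
}

changes = {name: relative_change(b, a) for name, (b, a) in slots.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.1%}")
# INSTRUCTION: -87.1%
# DATA: +1.7%
```

In relative terms the INSTRUCTION slot loses roughly 87% of its compliance, while the DATA slot moves by under 2%.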

Intuition: The π/8 rotation didn’t just “blur” the model. It selectively degraded the model’s ability to follow the trusted role channel, while the untrusted data stream passed through almost untouched. In effect, the “provenance” rotation acted as a filter that blocked only the signal we wanted to keep.

The Cost-Law Framing

This updates our understanding of the “Angle Cost Law.” While the aggregate eval loss looks relatively smooth (+0.103 delta), the actual capability cost is concentrated in one place: instruction compliance.

The Claim: A provenance mechanism should provide “free” isolation.

The Verdict: Falsified for post-projection fixed rotations. At this scale (SmolLM2-135M), the model cannot “un-rotate” the signal well enough to maintain instruction compliance.
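
To make “fixed rotation” and “un-rotate” concrete, here is a minimal sketch. It assumes the rotation acts as a planar rotation on consecutive pairs of coordinates; this post does not specify how the actual arm applies it, so `rotate_pairs` and the `projected` vector are hypothetical stand-ins rather than the experiment's implementation.

```python
import math


def rotate_pairs(vec, theta):
    """Rotate consecutive coordinate pairs (x0, x1), (x2, x3), ... by theta radians."""
    if len(vec) % 2:
        raise ValueError("expected an even number of coordinates")
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for i in range(0, len(vec), 2):
        x, y = vec[i], vec[i + 1]
        out.extend([c * x - s * y, s * x + c * y])
    return out


def norm(vec):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vec))


THETA = math.pi / 8  # the fixed angle used by the arm discussed here

# Hypothetical post-projection activations for one token (made-up values).
projected = [1.0, 0.0, 0.5, -0.25]

rotated = rotate_pairs(projected, THETA)
recovered = rotate_pairs(rotated, -THETA)  # the exact inverse: rotate back by -theta

# A rotation preserves length and is exactly invertible.
assert math.isclose(norm(rotated), norm(projected))
assert all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(recovered, projected))
```

Under this assumption the rotation itself discards nothing: rotating back by -π/8 recovers the input exactly. One reading of the verdict is therefore that the cost comes from the downstream layers failing to learn that inverse, not from the rotation destroying the signal.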


Why this isn’t a “Negative Result”

In our falsification methodology, a failure that explains why it failed is more valuable than a lucky success. Asymmetric Compliance Damage tells us that fixed post-projection rotations are too disruptive for small-scale instruction followers such as SmolLM2-135M.

It points directly to the solution: Pre-W placement, which we’ll cover in the Phase 5 Audit.

Next in this series: The Pi/8 Instruction Output Audit: Phase 5 Benchmarks