Posts tagged #software-engineering
- Butterflow: Pinning Agent Behavior with a Spec DSL
Agent evals that actually catch regressions: a Python flow/expect DSL for deterministic assertions, Arize Phoenix for fuzzy semantic evals, and cache-cluster grouping for token savings.