Infrastructure for Frontier Research

Systems for high-performance ML research: HF-streaming for large artifacts, the dual-emit data-driven paper pattern, and verified security gates for safe deserialization.

  1. HF-Streaming for Large Artifacts: Scaling ML Research

    Implementing HFStreamUploader, which uses safetensors and io.BytesIO to upload artifacts larger than 100 MB without hitting local disk limits.

  2. The Dual-Emit Paper Pattern: Data-Driven Manuscripts

    How to build 'data-driven papers' that emit LaTeX macros and microsite JSON from the same data at the same time, so the manuscript and the microsite stay in 1:1 parity.

  3. Verified Security Gates: Safe ML Deserialization

    Using Lean 4 soundness proofs to gate Python type-stub deserialization in the Falcon-secure project.
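
As a rough illustration of the dual-emit idea in post 2, the sketch below renders one results dictionary into both LaTeX macros and JSON, so the two outputs cannot drift apart. It is a minimal sketch, not the implementation from the post: RESULTS, macro_name, emit_latex, and emit_json are hypothetical names, the values are placeholders, and keys are assumed to be snake_case with letters only.

```python
import json

# Hypothetical placeholder results; in practice these would come from an experiment run.
RESULTS = {"accuracy": 0.9134, "num_runs": 5}


def macro_name(key: str) -> str:
    """Turn a snake_case key into a letters-only LaTeX macro name."""
    # LaTeX control words may contain only letters, so drop the underscores
    # and capitalize each part: "num_runs" -> "NumRuns".
    return "".join(part.capitalize() for part in key.split("_"))


def emit_latex(results: dict) -> str:
    """Render each result as a \\newcommand line for the manuscript."""
    lines = [
        f"\\newcommand{{\\{macro_name(key)}}}{{{value}}}"
        for key, value in sorted(results.items())
    ]
    return "\n".join(lines) + "\n"


def emit_json(results: dict) -> str:
    """Render the same results as JSON for the microsite."""
    return json.dumps(results, indent=2, sort_keys=True) + "\n"


if __name__ == "__main__":
    # Both outputs are derived from the single RESULTS dictionary.
    print(emit_latex(RESULTS), end="")
    print(emit_json(RESULTS), end="")
```

Because both emitters read the same dictionary, a number changes in the paper and on the microsite together. Keys containing digits would need an extra mapping step, since LaTeX macro names cannot include them.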