Debanjan Basu

Research Engineer / ML Systems Engineer · LLM Infrastructure · Observability

Berlin, Germany

ML systems engineer with independent research bridging post-training methods, formal verification, and empirical methodology for safety-relevant claims. LoRA-DPO scaling work on Pythia 70M–1B documents geometry–behavior decoupling: γ-overlap does not predict reward margin, suggesting that behavioral interventions do not necessarily reorganize the underlying representations. This is a structural finding with implications for evaluating alignment techniques. Lean 4 invariants formalize the empirical scaling (74 theorems across two papers, all in Mathlib).

Six years of production ML engineering at Nexern: LLM agent observability with Arize Phoenix, distributed pipelines, deployment infrastructure. Physics background (IISER Kolkata BS-MS; doctoral research under Peter Blöchl at TU Clausthal). Pre-registered adversarial validation methodology (trap-cell design, kill-fast sequential testing), formalized as the Compression Falsification Ladder.

Explore the Hubs

Start here

Paper artifact

Phase-Collapse Defragmentation

Moment-ratio bounds, MoE KV-cache quantization, and zero-sorry Lean source.

Methodology

The Compression Falsification Ladder

Pre-registration, τ-baselines, trap cells, and kill-fast empirical design.

Negative results

Adversarial Passes That Retracted Two Claims

What survived when the LoRA-DPO geometry story was forced to attack itself.

Curated Personal Writings

How to Honestly Test if a Neural Network Can Be Compressed

Pre-registration, trap cells, τ-hardened baselines, and kill-fast protocols: a field methodology for compression research that tries to kill its own ideas. With actual results from OLMoE-1B-7B.

Pre-Registration for Solo ML Researchers

How to borrow the clinical trial discipline of writing down what "pass" looks like before running the experiment — and why a SHA256 hash is the cheapest honesty enforcement mechanism available.

The Words That Feel Most Bengali

The words that feel irreducibly Bengali — the ones with no Sanskrit explanation — are probably the oldest non-Bengali words in the language.

Before We Were Bengali — The People of the Red Earth

3,600 years ago, a rice-farming civilization on the Ajay River left words, bones, and pottery that are still with us. Who were they?

The Words That Don't Move

Languages change unevenly. The words that resist replacement the longest are the ones lived in the body — farming, cooking, counting. Bengali is a palimpsest, and knowing which layer you're reading changes everything.

Three Ways of Knowing the Same People

There is a community of 7 million people who remember, in song, walking through a mountain pass to a golden land they then had to abandon overnight. Genetics corroborates the journey. Sanskrit texts called their kings demons.

The Iron That Cut Their Own Forest

Before any Sanskrit speaker had reached the eastern hills, someone there figured out how to smelt iron. Their descendants became the blacksmiths every classical Indian village needed. The iron they pioneered cleared the forests that had sheltered them.

Reform Pathways and the Automation Question: What Berlin Could Do Tomorrow

Federal operations funding, employer levies, congestion charging, and driverless trains — proven solutions exist for every aspect of BVG's crisis. The question is political will.

A Word Older Than the Language That Carried It

Saat Bhai Champa — Seven Brothers Champa. The most Sanskrit-sounding element of the story is probably the oldest non-Sanskrit thing in it.

Get in Touch

I'm always interested in conversations about AI systems, physics-informed machine learning, or elegant solutions to hard problems.

debanjan.basu.ds@gmail.com

Read the Blog · CV (Engineering) · About