scripts/xa-vocal-separation.js — FLAGSHIP: fingerprint separation on a known synthetic mix

Known "vocal" (440 Hz carrier, 5 Hz vibrato of ±30 cents, 2 harmonic partials) plus known "backing" (110 Hz bass + 4 Hz click train); the mixture is their sum. Pipeline: processAudioToFingerprints on the vocal and on the mixture → optimizeEqCurves (100 iterations) → reconstructVocal on the mixture STFT. Everything below is measured against the known parts. Honest note: the optimizer is plain full-batch gradient descent with NO input normalization, so stability requires lr < 1/max(|STFT|)². At the natural signal levels here that means lr = 1e-5 (verified in Node: lr = 0.01 slams the EQ against its clip bounds, and the loss oscillates between 294 and 594 instead of descending).
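For orientation, here is a minimal sketch of how a mix like the one described could be synthesized. It is not the script's own generator. The sample rate, duration, amplitudes, and the function names makeVocal, makeBacking, and mix are all hypothetical, and "2 harmonic partials" is read here as two overtones (2f and 3f) on top of the 440 Hz fundamental, which is an assumption.

```javascript
// Hypothetical parameters: the description does not state sample rate,
// duration, or amplitudes, so these are illustrative only.
const SAMPLE_RATE = 16000;
const DURATION_S = 2;
const N = SAMPLE_RATE * DURATION_S;

// "Vocal": 440 Hz carrier with 5 Hz vibrato of ±30 cents.
// The phase is accumulated sample by sample so the frequency
// modulation stays continuous; overtones reuse the same phase.
function makeVocal(n, sr) {
  const out = new Float32Array(n);
  let phase = 0;
  for (let i = 0; i < n; i++) {
    const cents = 30 * Math.sin(2 * Math.PI * 5 * (i / sr));
    const freq = 440 * Math.pow(2, cents / 1200);
    phase += (2 * Math.PI * freq) / sr;
    out[i] =
      0.5 * Math.sin(phase) +       // fundamental
      0.25 * Math.sin(2 * phase) +  // assumed overtone at 2f
      0.125 * Math.sin(3 * phase);  // assumed overtone at 3f
  }
  return out;
}

// "Backing": 110 Hz bass plus a 4 Hz train of single-sample clicks.
function makeBacking(n, sr) {
  const out = new Float32Array(n);
  const clickPeriod = Math.round(sr / 4);
  for (let i = 0; i < n; i++) {
    out[i] = 0.5 * Math.sin((2 * Math.PI * 110 * i) / sr);
    if (i % clickPeriod === 0) out[i] += 0.8;
  }
  return out;
}

// Mixture = sample-wise sum of the two known parts.
function mix(a, b) {
  const out = new Float32Array(a.length);
  for (let i = 0; i < a.length; i++) out[i] = a[i] + b[i];
  return out;
}

const vocal = makeVocal(N, SAMPLE_RATE);
const backing = makeBacking(N, SAMPLE_RATE);
const mixture = mix(vocal, backing);
console.log(`samples: ${mixture.length}`);
```

Because both parts are known exactly, any reconstruction of the vocal can be scored against the vocal array directly rather than judged by ear.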

suppression scoreboard

mixture |STFT| (0–2.6 kHz)

reconstruction |STFT| (0–2.6 kHz) — bass line + clicks should fade, vibrato partials stay

optimizer loss (external evaluation at checkpoints k = 0,10,…,100)
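The step-size bound mentioned in the description can be illustrated on a one-parameter toy problem. This sketch is not optimizeEqCurves: it assumes an unnormalized squared-error loss (g·x − target)² on a single magnitude x, a hypothetical clip range of [0, 2] for the gain, and a made-up magnitude of 300. Under those assumptions the update contracts only when lr < 1/x², and a larger step bounces between the clip bounds.

```javascript
// Toy model (assumption): one EQ gain g applied to one magnitude x,
// loss = (g * x - target)^2, gain clipped to a hypothetical [0, 2] range.
function descend(x, target, lr, iters, clip = [0, 2]) {
  let g = 1;
  const losses = [];
  for (let k = 0; k < iters; k++) {
    const err = g * x - target;
    losses.push(err * err);          // loss before the k-th update
    const grad = 2 * x * err;        // d/dg of (g * x - target)^2
    g = Math.min(clip[1], Math.max(clip[0], g - lr * grad));
  }
  return { g, losses };
}

const x = 300;                       // illustrative unnormalized magnitude
const target = 0.5 * x;              // the ideal gain is 0.5
const bound = 1 / (x * x);           // stability limit for this toy loss

const stable = descend(x, target, 1e-5, 100);   // just under the bound
const unstable = descend(x, target, 0.01, 100); // far above the bound

console.log(`bound 1/x^2 = ${bound.toExponential(2)}`);
console.log(`lr = 1e-5 -> g = ${stable.g.toFixed(6)}`);
console.log(`lr = 0.01 -> g = ${unstable.g}, last two losses = ${unstable.losses.slice(-2)}`);
```

With the small step the gain settles on 0.5. With the large step every update overshoots past a clip bound, so the gain alternates between 0 and 2 and the loss alternates between two fixed values, which is the same qualitative pattern the description reports. Normalizing the inputs would change the bound, so the numbers here carry no meaning beyond the toy.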