Verification & tolerances
Every numerical claim Pleco-Xa makes is pinned by committed reference
fixtures and enforced in CI on every push. The fixtures are frozen JSON
under tools/goldens/
(reference ground truth captured at a known-good point); the suites that
replay them live in
packages/pleco-xa/tests/goldens/.
Everything is public and reproducible with no special tooling:
npm ci && npm testThat runs the full library suite — 47 test files, 420 tests, of which 21
are fixture-gated golden suites (183 tests) plus the loop-point golden lock on
real audio. The assertion helper (tests/goldens/helpers.js) uses the
elementwise criterion |actual − expected| ≤ atol + rtol·|expected| with
defaults rtol 1e-5 / atol 1e-8, and reports the worst offender on failure.
Two kinds of numbers appear below: declared tolerances (the gate CI enforces) and achieved margins (what the current implementation actually measures, recorded where it was captured). Achieved is typically orders of magnitude inside declared.
Declared tolerances
Section titled “Declared tolerances”| Domain | Fixture | What is asserted | Tolerance |
|---|---|---|---|
| Window functions (hann/hamming/blackman, periodic) | windows.json | window(n) values, 9 cases | rtol 1e-6, atol 1e-7 |
| FFT/IFFT round-trip | synthetic | real-signal round-trip; complex-input preservation | max err < 1e-5 (real), < 1e-6 (complex) |
| STFT magnitude | stft.json | |stft| matrix + shape (n_fft 512, hop 128) | rtol 2e-3, atol 2e-3 (f32 accumulation) |
| ISTFT round-trip | istft_roundtrip.json | istft(stft(y)) reconstructs y | max err < 1e-3 |
| Mel filterbank | mel_filterbank.json | matrix + shape (htk/norm variants) | rtol 1e-6, atol 1e-8 |
| Mel spectrogram | melspectrogram.json | matrix + shape | rtol 5e-4, atol 1e-4 (measured max rel dev ~1e-4) |
| MFCC | mfcc.json | matrix + shape | atol 1e-3 absolute, dB scale (achieved 8.7e-5) |
| MFCC options / lifter | synthetic | 3 throws; lifter weight law 1+(L/2)sin(π(k+1)/L) | lifter to 12 decimal places |
| Chroma filterbank | chroma.json | filters.chroma matrix + shape | rtol 1e-6, atol 1e-7 (achieved 5.9e-8 rel / 3.0e-8 abs) |
| Chroma STFT (incl. tuning estimation) | chroma.json | chroma_stft matrix + shape | rtol 1e-4, atol 1e-5 (achieved 1.8e-6 rel / 2.5e-7 abs) |
| Chroma failure paths | — | 3 throw-message regexes | exact (throws) |
| Spectral descriptors | spectral_features.json | centroid, bandwidth, rolloff, flatness, contrast, rms, zcr | centroid/bandwidth rtol 1e-6, atol 1e-3; rolloff bin-exact (atol 1e-9); flatness rtol 1e-6, atol 1e-7; contrast rtol 1e-5, atol 1e-3; rms rtol 1e-6, atol 1e-8; zcr atol 1e-12 |
| Spectral y-path (end-to-end via production f32 stft) | spectral_features.json | centroid + flatness | rtol 2e-3, atol 2e-3 (wave-level target) |
| Spectral failure paths | — | 3 throw-message regexes | exact (throws) |
| Onset strength envelope | onset_strength.json | options-style and positional call vs expected; length | rtol 1e-4, atol 1e-5 (measured max abs dev ~6e-6) |
| Onset lag padding | onset_strength.json | first 3 envelope samples are structural zeros | exact |
| PCEN | pcen.json | pcen(melspectrogram(y)) matrix + shape | rtol 1e-3, atol 1e-4 (achieved 1.8e-7 end-to-end) |
| F0 harmonics | f0_harmonics.json | interpolated harmonic output (±Inf restored from JSON null) | rtol 1e-5, atol 1e-5 |
| Tempogram ratio | tempogram_ratio.json | shape [13,173]; values; tg-path == y-path; aggregate collapse | values rtol 2e-3, atol 2e-3 (measured 7.6e-8 abs); path equivalences bit-identical; aggregate to 12 dp |
| Tempogram ratio failure paths | synthetic | 9 invalid-input throws with message regexes | exact (throws) |
| Tempo estimation | tempo_beats.json | tempo(y) vs expected | rel error < 0.02 (gate); achieved bit-exact lag-bin (10 dp) |
| Beat tracking | tempo_beats.json | tempo; beat count; beat frames | tempo rel < 0.02; count exact; each beat ±1 frame (gate); achieved exact frames |
| Beat tracking (onset-envelope path) | tempo_beats.json | beats from {onsetEnvelope} | exact |
| Beat tracking (class wrapper) | tempo_beats.json | BeatTracker.beatTrack tempo + beats | tempo to 10 dp; beats exact |
| Beat conversions / silence / failures | tempo_beats.json + synthetic | frames→samples/time identities; silence → 0 BPM + []; 6 invalid-input throws | exact |
| pYIN pitch tracking | pyin.json | shapes/types; voiced_prob ∈ [0,1]; voicing classification; NaN/finite f0 contract; 2 throws | voicing exact except transition frames (frames whose expected voicing differs from a neighbor — genuinely ambiguous at pitch/silence boundaries); voiced f0 deviation < 1.0 semitone (achieved grid-exact) |
| HPSS (magnitude) | hpss.json | H and P at margin 1.0 and 2.0; shape | rtol 1e-3, atol 1e-4 (achieved 1.61e-5 / 1.41e-5 abs) |
| HPSS invariants | hpss.json | H+P==S at margin 1; masks ∈ [0,1] summing to 1; complex phase carried | reconstruction < 1e-9; mask sum to 9 dp; complex < 1e-9 |
| Softmask | synthetic in-suite case (fixed 3×3 matrices) | power=2 values; power=∞ hard mask; throws | to 6 dp; hard mask exact |
| Waveform-level HPSS | hpss.json | yh+yp≈y (interior); harmonic()/percussive() agree with hpss() | interior worst < 1e-3; agreement to 9 dp |
| Trim / split | effects.json | trim index + slice identity; split intervals; silent-signal empty slice | exact (integer indices/intervals) |
| Preemphasis / deemphasis | effects.json + synthetic | output; round-trip; block streaming via zf | rtol 1e-5, atol 1e-6 (achieved 5.96e-8 abs); round-trip < 1e-4; streaming to 6 dp |
| Phase vocoder | phase_vocoder.json | shape; per-bin magnitude; complex values vs spectral peak | magnitude |Δ| ≤ 1e-3 + 1e-3·|expected| (achieved 9.3e-4 worst ratio); complex |Δz| ≤ 1e-3·peak + 1e-3·|z| (achieved ≤ 4.9e-4 of peak) |
| time_stretch / pitch_shift contracts | phase_vocoder.json (input reuse) | length == round(n/rate); all finite; duration preserved; throws on rate ≤ 0 | exact |
| Remix ordering | synthetic | interval order preserved; zero-crossing snap bounds; out-of-bounds throws | order exact; with align_zeros the first sample must land on a zero crossing of the test sine (|out[0]| < 0.07, one sample-step of amplitude) and length stays within crossing-snap jitter of the 400-sample input (∈ (390, 400]) |
| Dynamic time warping | dtw_segment.json | cumulative cost D[-1][-1]; warping path; backtracking | D rel < 1e-6; path exact; backtrack exact |
| Recurrence / segmentation | dtw_segment.json | connectivity 0/1; affinity; recurrence_to_lag; agglomerative boundaries | connectivity/lag/boundaries exact; affinity rtol 1e-5, atol 1e-8 |
| Laplacian segmentation (two-feature form) | laplacian_seg.json | boundaries; segment count; determinism; degenerate-bandwidth throw; 2 input throws | exact (permutation/sign invariant) |
| RQA alignment | rqa.json | path; score | path exact; score to 10 dp |
| RQA degenerate paths | synthetic | negative gap throws; all-zero matrix → empty path | exact |
| HMM transition matrices | sequence_extra.json | uniform/loop/cycle/local matrices; rows sum to 1 | atol 1e-6; row sums to 6 dp |
| Discriminative Viterbi | sequence_extra.json | integer state path; Bayes-direction pin | exact |
| Sequence regression guards | synthetic | cycle self-prob on diagonal; 6 constructor throws | diagonal to 12 dp; throws exact |
| Symmetric eigendecomposition | linalg.json | eigenvalues ascending + values; reconstruction V·diag·Vᵀ; orthonormality VᵀV=I; 1×1 case | eigenvalues rtol 1e-6, atol 1e-6; reconstruction atol 1e-9; orthonormality atol 1e-9; flat-vs-2D atol 1e-12 |
| Normalized graph Laplacian | linalg.json | L matrix; eigh(L) eigenvalues; diagonal conventions | L atol 1e-9; eigenvalues rtol 1e-6, atol 1e-6; diagonal to 12 dp |
| K-means clustering | cluster.json | canonicalized labels; centers; inertia | labels exact; centers atol 1e-2 (fixture is float32); |inertia diff| ≤ 1e-3 |
| K-means determinism / typing | cluster.json + synthetic | same-seed determinism; Int32Array labels; 5 invalid-input throws | exact |
| Unit conversions (Hz/mel/MIDI/dB/frames/time/samples) | conversions.json | 11 conversion functions vs expected arrays | rtol 1e-5, atol 1e-6 |
| Typed-array dispatch | conversions.json | hz_to_mel(Float32Array) finite + values | rtol 1e-5, atol 1e-6 |
| Frequency weighting curves A/B/C/D | weighting.json | *_weighting(frequencies) vs expected | rtol 1e-4, atol 1e-4 |
| FFT bin frequencies | fft_frequencies.json | fft_frequencies(sr, n_fft) vs expected | rtol 1e-6, atol 1e-6 |
Loop acceptance gates
Section titled “Loop acceptance gates”Loop detection — the signature capability — has its own lock, on real WAVs, not synthetic signals.
The golden lock (tests/loop-goldens.test.js vs
tools/goldens/loop_goldens.json): loop.detect(…, { strategy: 'fast' }) runs
on four real recordings (48 kHz and 44.1 kHz, 2.6 s to 45 s) and the detected
loop start and end must each land within ±441 samples (±10 ms @ 44.1 kHz)
of the pinned points. A Node spot-run of the same pipeline
(examples/node/loop-fast.mjs)
additionally checks BPM within ±0.1 of the pinned value — and measures all
four files landing within Δ ≤ 1 sample of the pinned loop points, with BPM
exact. Confidence is asserted on the unified 0..1 scale but deliberately
not pinned (the legacy pipeline pegged it at 1.0; the fixture meta
documents this).
Per-strategy quality gates — every strategy throws a diagnostic naming the failed gate; none falls back to another strategy or fabricates a result:
- Input gate (all strategies): buffer must expose
getChannelData(),sampleRate > 0, and a non-empty channel 0 — otherwiseloop.detectrejects. fast: energy-based; golden-locked as above. Effectively-silent signals are rejected by the signal-evidence gate (the message names the RMS threshold). Its BPM stage rethrows naming the failed step — the fabricatedconfidence: 0.5fallback was removed in 2.0.3.precise: tempo gate (no usable tempo estimate → throws; passoptions.bpmto supply one) and candidate gate (no onset pair inside the search window → throws).musical: tempo gate, then candidate gate when no bar length fits the material.recurrence: the only strategy with aminConfidencegate (default 0.1) — a best candidate below it throws, suggesting alternatives. Also an embedding gate on too-short input and a repetition-evidence gate on silence.hopLengthauto-scales to respectmaxFrames(recorded in diagnostics — never a silent strategy switch). Its result carries nobpmfield.- Unknown strategy: throws, listing the four valid names.
Documented edge cases
Section titled “Documented edge cases”These are contracts, verified by the shipped test suite and demonstrated against the published build:
| Input class | Guaranteed behavior |
|---|---|
| Silence / constant signal | tempo() throws (cannot estimate tempo: onset envelope is all zeros); quickTempo throws (no onsets in window); loop.detect rejects naming the RMS threshold. beat_track returns {tempo: 0, beats: []} and onsetDetect returns zero onsets — valid empties, never a fabricated BPM. |
| NaN / ±Infinity in input | Throws naming the offending index and value (e.g. stft: input contains non-finite values at index 7 (value: NaN)). Corrupted audio is never coerced to 0 and laundered into plausible output. |
| Empty input | Throws (y must not be empty; loop.detect: input gate failed — channel 0 is empty). |
| 1-sample input | Valid: framed transforms return a single zero-padded frame (stft → 1025×1, melspectrogram → 128×1, rms → length 1). tempo() throws its all-zeros gate (no onset content exists); beat_track returns {tempo: 0, beats: []}. |
| Wrong sample-rate assumption | sr is never inferred — every analysis call defaults to sr = 22050. Passing 44.1 kHz samples without { sr: 44100 } returns plausible but wrong numbers with no error. Supplying sr is the caller’s side of the contract. |
Plain Array instead of Float32Array | Accepted — numeric Arrays flow through the DSP core (demonstrated on tempo, stft, feature.rms, onset_strength). Float32Array remains the documented input type. |
| 10-minute 44.1 kHz track | Valid: beat_track in ~30 s and melspectrogram in ~22 s under the default Node heap (iterative in-place FFT since 2.0.3 — the former long-input crash is eliminated). |
What this does not cover
Section titled “What this does not cover”Honesty section:
- Fixtures are frozen snapshots. They pin the output at a known-good point and prove non-regression against that reference ground truth — they are not a live cross-check against any external system. Regeneration is a maintainer operation, done only when an intentional algorithm change invalidates the pinned values, and reviewed like code (fixture policy).
- Loop confidence values are range-asserted, not pinned — a documented decision recorded in the fixture metadata.
- Performance numbers are measurements, not CI gates. The 10-minute-track timings above were measured on a development machine and are stated as observations.
- Browser-only surfaces (canvas rendering, playback transport) are exercised by unit and demo tests, not numerical fixtures — there is no meaningful reference ground truth for pixels and scheduling.