pyin
pyin(
y,fmin,fmax,sr?,opts?):object
Defined in: packages/pleco-xa/src/scripts/xa-pitch.js:237
Probabilistic YIN (pYIN).
Two-stage pYIN pipeline:
- YIN cumulative-mean-normalized-difference per frame → local minima (troughs) below a beta-distributed threshold ensemble, each weighted by a Boltzmann prior over trough rank and the beta pmf over thresholds → an observation matrix over a log-spaced f0 grid (n_bins_per_semitone bins/semitone) stacked with an unvoiced state block.
- Transition matrix = transition_local band over the pitch grid ⊗ voiced/unvoiced switching (transition_loop(2, 1 - switch_prob)) via a Kronecker product np.kron(t_switch, transition).
- sequence.viterbi decode → per-frame pitch bin → f0 (fill_na when unvoiced), voiced_flag, voiced_prob.
Validated against committed reference fixtures (220→330 Hz step + silent tail; voiced f0 within ~1 semitone, voicing exact on the clearly voiced/ silent regions). This is the real pYIN — NOT the former median-over- threshold-ensemble stub (no transition matrix, no Viterbi) that was honestly left unexported.
Parameters
Section titled “Parameters”number[] | Float32Array<ArrayBufferLike>
Audio time series.
number
Minimum frequency (Hz), > 0.
number
Maximum frequency (Hz), fmin < fmax <= sr/2.
number = 22050
Sample rate (Hz).
beta_parameters?
Section titled “beta_parameters?”[number, number] = ...
Beta prior (a, b).
boltzmann_parameter?
Section titled “boltzmann_parameter?”number = 2
Boltzmann prior over troughs.
center?
Section titled “center?”boolean = true
Center-pad frames (default).
fill_na?
Section titled “fill_na?”number = NaN
Value written to unvoiced f0 frames.
frame_length?
Section titled “frame_length?”number = 2048
hop_length?
Section titled “hop_length?”number = null
Defaults to frame_length/4.
max_transition_rate?
Section titled “max_transition_rate?”number = 35.92
Max transition (oct/sec).
n_thresholds?
Section titled “n_thresholds?”number = 100
Threshold-ensemble size.
no_trough_prob?
Section titled “no_trough_prob?”number = 0.01
Best-guess mass when no trough.
resolution?
Section titled “resolution?”number = 0.1
Pitch-bin resolution in semitones.
switch_prob?
Section titled “switch_prob?”number = 0.01
Voiced↔unvoiced switch prob.
Returns
Section titled “Returns”object
f0:
Float64Array
voiced_flag
Section titled “voiced_flag”voiced_flag:
boolean[]
voiced_prob
Section titled “voiced_prob”voiced_prob:
Float64Array