
pyin

pyin(y, fmin, fmax, sr?, opts?): object

Defined in: packages/pleco-xa/src/scripts/xa-pitch.js:237

Probabilistic YIN (pYIN).

Three-step pYIN pipeline:

  1. Compute the YIN cumulative-mean-normalized difference per frame and find its local minima (troughs) below each threshold of a beta-distributed threshold ensemble. Each trough is weighted by a Boltzmann prior over trough rank and by the beta pmf over thresholds, giving an observation matrix over a log-spaced f0 grid (n_bins_per_semitone bins per semitone) stacked with an unvoiced state block.
  2. Build the transition matrix as the Kronecker product np.kron(t_switch, transition), where transition is a transition_local band over the pitch grid and t_switch = transition_loop(2, 1 - switch_prob) models voiced/unvoiced switching.
  3. Decode with sequence.viterbi to get a per-frame pitch bin, then map it to f0 (fill_na when unvoiced), voiced_flag, and voiced_prob.
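To make step 2 concrete, here is a minimal sketch of the Kronecker-product construction on a toy 3-bin pitch grid. The helpers `transitionLoop` and `kron` and the toy `pitchTransition` matrix are hypothetical illustrations, not the functions in xa-pitch.js; only the 2-state switching matrix and the default switch probability of 0.01 come from the description above.

```javascript
// Hypothetical helper: n-state matrix with self-transition probability p
// and the remaining mass spread evenly over the other states.
function transitionLoop(n, p) {
  const m = [];
  for (let i = 0; i < n; i++) {
    const row = new Array(n).fill((1 - p) / (n - 1));
    row[i] = p;
    m.push(row);
  }
  return m;
}

// Kronecker product of two dense matrices stored as arrays of rows.
function kron(a, b) {
  const out = [];
  for (let i = 0; i < a.length; i++) {
    for (let k = 0; k < b.length; k++) {
      const row = [];
      for (let j = 0; j < a[0].length; j++) {
        for (let l = 0; l < b[0].length; l++) {
          row.push(a[i][j] * b[k][l]);
        }
      }
      out.push(row);
    }
  }
  return out;
}

// Toy 3-bin pitch transition band (each row sums to 1); illustrative only.
const pitchTransition = [
  [0.75, 0.25, 0.0],
  [0.25, 0.5, 0.25],
  [0.0, 0.25, 0.75],
];

const switchProb = 0.01;
const tSwitch = transitionLoop(2, 1 - switchProb);

// 6x6 matrix: state index = voicingState * nBins + pitchBin.
const full = kron(tSwitch, pitchTransition);
```

Because both factors are row-stochastic, every row of the product also sums to 1, so the result can be passed to a Viterbi decoder without renormalizing.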

Validated against committed reference fixtures (220→330 Hz step plus silent tail; voiced f0 within ~1 semitone, voicing exact on the clearly voiced and clearly silent regions). This is a full pYIN implementation, not the former median-over-threshold-ensemble stub (no transition matrix, no Viterbi), which was deliberately left unexported.

y: number[] | Float32Array<ArrayBufferLike>

Audio time series.

fmin: number

Minimum frequency (Hz), > 0.

fmax: number

Maximum frequency (Hz), fmin < fmax <= sr/2.

sr: number = 22050

Sample rate (Hz).
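The constraints on fmin, fmax and sr can be checked before calling the function. The following is a minimal sketch; `checkPitchRange` is a hypothetical helper, not part of the documented API, and whether pyin itself throws on invalid input is not stated here.

```javascript
// Hypothetical pre-flight check for the documented constraints:
// fmin > 0 and fmin < fmax <= sr / 2.
function checkPitchRange(fmin, fmax, sr = 22050) {
  if (!(fmin > 0)) {
    throw new RangeError("fmin must be > 0");
  }
  if (!(fmin < fmax)) {
    throw new RangeError("fmax must be greater than fmin");
  }
  if (!(fmax <= sr / 2)) {
    throw new RangeError("fmax must not exceed sr / 2 (Nyquist)");
  }
  return { fmin, fmax, sr };
}

// Example: roughly C2 to C7 at the default sample rate.
const range = checkPitchRange(65, 2093);
```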

[number, number] = ...

Beta prior (a, b).

number = 2

Boltzmann prior over troughs.
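As an illustration of how such a prior weights troughs by rank, the sketch below assumes the truncated Boltzmann form used by SciPy's `boltzmann` distribution, pmf(k) = (1 - e^(-λ)) e^(-λk) / (1 - e^(-λN)) for k = 0..N-1. That form is an assumption; this page only names the prior. `boltzmannPmf` and `troughPrior` are hypothetical names.

```javascript
// Truncated Boltzmann pmf over ranks k = 0 .. n-1 (assumed form, see lead-in).
function boltzmannPmf(k, lambda, n) {
  const num = (1 - Math.exp(-lambda)) * Math.exp(-lambda * k);
  const den = 1 - Math.exp(-lambda * n);
  return num / den;
}

// Prior weight for each trough, ordered by rank (earliest trough first).
function troughPrior(nTroughs, lambda = 2) {
  return Array.from({ length: nTroughs }, (_, k) =>
    boltzmannPmf(k, lambda, nTroughs)
  );
}

// With three troughs and the default parameter of 2, the weights
// decay by a factor of e^-2 per rank and sum to 1.
const prior = troughPrior(3);
```

A larger parameter concentrates more mass on the first trough, which biases the estimate toward the shortest candidate period.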

boolean = true

Center-pad frames (default).

number = NaN

Value written to unvoiced f0 frames.

number = 2048

number = null

Defaults to frame_length/4.

number = 35.92

Max transition (oct/sec).
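To see how a rate in octaves per second turns into a band width on the pitch grid, here is a sketch that follows the computation in librosa's reference pyin. Whether this port uses exactly the same rounding is an assumption, as are the hop length of 512 (frame_length/4) and 10 bins per semitone; `transitionWidth` is a hypothetical name.

```javascript
// Width (in pitch bins) of the local transition band, librosa-style.
function transitionWidth({
  maxTransitionRate = 35.92, // octaves per second
  hopLength = 512,           // assumed: frame_length / 4 with frame_length = 2048
  sr = 22050,
  binsPerSemitone = 10,      // assumed: ceil(1 / 0.1)
} = {}) {
  // Octaves/sec -> semitones per frame, rounded to a whole semitone.
  const maxSemitonesPerFrame = Math.round(
    (maxTransitionRate * 12 * hopLength) / sr
  );
  // Odd width, so the band is symmetric around the current bin.
  return maxSemitonesPerFrame * binsPerSemitone + 1;
}

const width = transitionWidth(); // 101 under the assumptions above
```

Under these assumptions the decoder allows f0 to move by at most about 10 semitones between consecutive frames.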

number = 100

Threshold-ensemble size.

number = 0.01

Best-guess mass when no trough.

number = 0.1

Pitch-bin resolution in semitones.
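The resolution determines how many bins per semitone the log-spaced f0 grid has. The sketch below builds such a grid the way librosa's reference implementation does, which this port may or may not match exactly; `buildPitchGrid` is a hypothetical name.

```javascript
// Log-spaced f0 grid between fmin and fmax (assumed librosa-style layout).
function buildPitchGrid(fmin, fmax, resolution = 0.1) {
  const binsPerSemitone = Math.ceil(1 / resolution);
  const binsPerOctave = 12 * binsPerSemitone;
  const nBins = Math.floor(binsPerOctave * Math.log2(fmax / fmin)) + 1;
  const freqs = new Float64Array(nBins);
  for (let i = 0; i < nBins; i++) {
    // Each step multiplies the frequency by 2^(1 / binsPerOctave).
    freqs[i] = fmin * Math.pow(2, i / binsPerOctave);
  }
  return { binsPerSemitone, freqs };
}

// 100 Hz to 300 Hz at 0.1-semitone resolution: 10 bins per semitone.
const grid = buildPitchGrid(100, 300);
```

The highest grid frequency never exceeds fmax because the bin count is rounded down.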

number = 0.01

Voiced↔unvoiced switch prob.

object

f0: Float64Array

voiced_flag: boolean[]

voiced_prob: Float64Array
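A short sketch of consuming the returned object follows. It uses a hand-built mock with the documented shape rather than a real call, and it assumes frame i is centered at i * hopLength / sr with a hop length of 512; `voicedFrames` is a hypothetical helper.

```javascript
// Collect voiced frames with their timestamps from a pyin-shaped result.
function voicedFrames(result, sr = 22050, hopLength = 512) {
  const frames = [];
  for (let i = 0; i < result.f0.length; i++) {
    // Unvoiced frames carry the fill value (NaN by default), so skip them.
    if (result.voiced_flag[i] && !Number.isNaN(result.f0[i])) {
      frames.push({
        time: (i * hopLength) / sr,
        f0: result.f0[i],
        prob: result.voiced_prob[i],
      });
    }
  }
  return frames;
}

// Mock result: made-up values for illustration, not real output.
const mock = {
  f0: Float64Array.of(220, 220, NaN, 330),
  voiced_flag: [true, true, false, true],
  voiced_prob: Float64Array.of(0.9, 0.95, 0.02, 0.9),
};

const voiced = voicedFrames(mock);
const meanF0 = voiced.reduce((sum, f) => sum + f.f0, 0) / voiced.length;
```

Filtering on voiced_flag before averaging matters because any NaN in the sum would make the mean NaN.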