Real-time

Spectral flux

onset strength
frequency-domain, low-latency, polyphonic

An envelope detector traces the loudness contour of a waveform — the slow outline riding over the fast carrier inside it. Every graph on this page is drawn by the method's real algorithm, and the sliders at the top drive all of them at once.

The whole method, live

[Interactive plot: spectral flux onset strength on a polyphonic signal. Controls: Sensitivity γ = 60; Smoothing = 8 samples (0.2 ms).]

Score card

Causality: low-latency
Signal model: polyphonic
Reads: onset strength
Latency: ≈1 frame
Cost: STFT
Domain: frequency

Scored qualitatively.

This method outputs a normalized contour (onset strength, per-band or perceptual loudness), not an amplitude in the units of the true envelope — so an amplitude error number would be meaningless. Its strength is the spectral axis: read the gallery below.

How it works

Where music software finds the beat. Take the STFT and sum the frame-to-frame increases in magnitude across all bins — a half-wave-rectified spectral difference. Energy rising anywhere — a new note, a drum hit — produces a peak, so it flags onsets no matter how many voices overlap.

This onset-strength envelope is the front end of nearly every beat-tracker and tempo estimator. The sensitivity control sets the log-compression, i.e. how much quiet onsets count relative to loud ones.

Key terms

STFT
The short-time Fourier transform — the spectrogram. It slices the signal into short overlapping frames and reports a magnitude per frequency bin per frame, so you can see how the spectrum changes over time.
Spectral flux
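The frame-to-frame increase in magnitude, summed across all bins: Σ_k max(0, |X[k]| − |X_prev[k]|). It measures how much new energy appeared since the last frame. The code on this page takes the difference of log-compressed magnitudes, log(1 + γ·|X[k]|), rather than raw ones.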
Half-wave rectification
The max(0, ·) step that keeps only the rises and discards the falls — so only energy increases count toward an onset. A note ending should not look like a note starting.
Onset strength
The resulting curve. Its peaks mark onsets — the front of each note or drum hit — which is exactly what the front end of a beat-tracker feeds on. The sensitivity control sets the log compression: how much quiet onsets count relative to loud ones.
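The half-wave rectification entry is easiest to see on a single pair of frames. The sketch below compares a plain signed spectral difference with the rectified one; both function names are made up for this illustration, and it works on raw magnitudes with no log compression.

```c
/* Signed spectral difference between two frames: rises and falls cancel. */
double signed_diff(const double *prev, const double *cur, int bins) {
    double s = 0.0;
    for (int k = 0; k < bins; k++) s += cur[k] - prev[k];
    return s;
}

/* Half-wave-rectified difference: only the rises are summed. */
double rectified_diff(const double *prev, const double *cur, int bins) {
    double s = 0.0;
    for (int k = 0; k < bins; k++) {
        double d = cur[k] - prev[k];
        if (d > 0.0) s += d;   /* discard falls */
    }
    return s;
}
```

If one note stops in bin 0 while another of equal magnitude starts in bin 1 (prev = {1, 0, 0}, cur = {0, 1, 0}), the signed difference is 0 and the onset is missed, while the rectified difference is 1.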

Building the envelope, step by step

Flux doesn't follow loudness — it follows change. Each graph below is drawn by the real algorithm on a polyphonic mix, working up to the finished onset-strength curve.

  1. Step 1: The raw mix

    Start with the polyphonic input — several voices overlapping, with no single carrier. Amplitude alone won't tell you where the hits are: a sustained chord can be louder than the snare that lands on top of it.

  2. Step 2: Onset strength

    For each frame, compare its spectrum to the previous one, keep only the bins that got louder, and sum that rise across all bins. Steady tones contribute nothing; new energy anywhere spikes the curve. The result is a spiky onset-strength contour — one peak per attack — laid over the dimmed mix.
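The two steps can be tried on a toy spectrogram. The sketch below is an assumption-laden illustration, not the page's plotting code: the data is invented (a decaying chord in bins 0 and 1, a short hit in bins 2 and 3 at frame 3), the spectrogram is stored as a flat row-major array, and log compression is left out to keep the arithmetic easy to follow.

```c
#define TOY_BINS   4
#define TOY_FRAMES 5

/* Invented magnitudes, one row per bin, one column per frame. */
static const double TOY_MIX[TOY_BINS * TOY_FRAMES] = {
    1.0, 1.0, 1.0, 0.5,  0.5,   /* bin 0: sustained chord, decaying */
    1.0, 1.0, 1.0, 0.5,  0.5,   /* bin 1: sustained chord, decaying */
    0.0, 0.0, 0.0, 0.25, 0.0,   /* bin 2: short hit at frame 3 */
    0.0, 0.0, 0.0, 0.25, 0.0    /* bin 3: short hit at frame 3 */
};

/* Step 1 view: total magnitude in frame m, a crude loudness measure. */
double frame_total(const double *mag, int bins, int frames, int m) {
    double s = 0.0;
    for (int k = 0; k < bins; k++) s += mag[k * frames + m];
    return s;
}

/* Step 2 view: summed per-bin rise from frame m-1 to frame m. */
double frame_rise(const double *mag, int bins, int frames, int m) {
    if (m == 0) return 0.0;     /* no previous frame to compare against */
    double s = 0.0;
    for (int k = 0; k < bins; k++) {
        double d = mag[k * frames + m] - mag[k * frames + m - 1];
        if (d > 0.0) s += d;
    }
    return s;
}
```

On this data the frame totals are 2, 2, 2, 1.5, 1, so the mix is actually quieter when the hit lands, yet the rise is 0 everywhere except frame 3, where it is 0.5. That is the sense in which flux follows change rather than loudness.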

The code

Six readable forms of the exact algorithm that draws the curves above: C, JS and Python ports, an optimized C version, a fixed-coefficient version, and a user-controlled one whose parameters match the sliders.

#include <math.h>

/* Provided by the shared DSP layer: an STFT magnitude spectrogram.
   mag is [B][M]: B = FRAME/2+1 bins, M frames; a Hann window and the
   FFT live inside it. We only write the flux core here. */
void stft(const double *x, int n, double **mag, int *B, int *M);
void norm_max(double *a, int m);                              /* divide by peak */
void up_frames(const double *fr, int m, double *env, int n);  /* -> sample rate */

/* Spectral flux: for each frame, sum over bins the positive (half-wave-
   rectified) increase in log-compressed magnitude from the previous frame.
   gamma sets the log compression — how much quiet onsets count. */
void spectral_flux(double **mag, int B, int M, double gamma, double *flux) {
    if (M <= 0) return;                   /* empty spectrogram: nothing to write */
    flux[0] = 0.0;                        /* first frame has no predecessor */
    for (int m = 1; m < M; m++) {
        double s = 0.0;
        for (int k = 0; k < B; k++) {
            double d = log1p(gamma * mag[k][m]) - log1p(gamma * mag[k][m - 1]);
            if (d > 0.0) s += d;          /* half-wave rectify: rises only */
        }
        flux[m] = s;
    }
    norm_max(flux, M);                    /* normalize onset strength to peak */
}
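The listing leans on two helpers from the shared DSP layer that are not shown. For readers who want to experiment, here is one hedged guess at what they might look like. Both are hypothetical stand-ins with a `_sketch` suffix: the real norm_max and up_frames may handle silence, interpolation and frame alignment differently. The upsampler here assumes plain linear interpolation with the first and last frames pinned to the first and last samples.

```c
/* Hypothetical stand-in for norm_max: divide by the peak so the largest
   value becomes 1. An all-zero input is left unchanged to avoid a
   division by zero. */
void norm_max_sketch(double *a, int m) {
    double peak = 0.0;
    for (int i = 0; i < m; i++)
        if (a[i] > peak) peak = a[i];
    if (peak <= 0.0) return;
    for (int i = 0; i < m; i++) a[i] /= peak;
}

/* Hypothetical stand-in for up_frames: linearly interpolate m frame
   values onto n sample positions. */
void up_frames_sketch(const double *fr, int m, double *env, int n) {
    if (m <= 0 || n <= 0) return;
    if (m == 1 || n == 1) {               /* nothing to interpolate between */
        for (int i = 0; i < n; i++) env[i] = fr[0];
        return;
    }
    for (int i = 0; i < n; i++) {
        double pos = (double)i * (m - 1) / (n - 1);  /* fractional frame index */
        int j = (int)pos;
        if (j >= m - 1) { env[i] = fr[m - 1]; continue; }
        double t = pos - j;
        env[i] = (1.0 - t) * fr[j] + t * fr[j + 1];
    }
}
```

Since the flux values are never negative, taking the peak without an absolute value is enough for this use. A sample-and-hold upsampler would be an equally plausible choice and would give a stepped contour instead of a piecewise-linear one.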