Built a pipe organ synth in 3 weeks because Egypt only has 6

There are six pipe organs in all of Egypt. Six. For 100 million people.

So I built one. Three weeks later: a fully polyphonic synthesizer, every sound generated from pure math in real time, and a genuine question about why commercial organ plugins cost $400.

Samples felt like cheating

Every digital organ you've heard is probably sample-based. Someone recorded a real pipe organ, captured every note on every stop, packaged it. Press key, play recording.

Didn't want that. Learning objective aside, samples have a real limitation: they're good at steady state and weak on the first 100ms, when a note starts.

Real pipe organs are mechanical. Press a key, valve opens, wind rushes in, pitch stabilizes. There's a transient called chiff as the pipe "speaks." Plus a tiny mechanical click from the valve. These aren't artifacts. They're what makes an organ sound like an organ.

Sample libraries try to capture this, but you can't reshape it afterward. With synthesis, it's just a parameter.

How pipes actually make sound

Every stop is a different harmonic recipe. Air vibrates inside a pipe, produces fundamental frequency plus harmonics at integer multiples. What makes a Flute different from a Principal isn't mysterious. The Flute has almost no overtones because the stopped pipe physically suppresses them, the even harmonics most of all. The Principal has balanced harmonics. A Reed has a dense harmonic series approaching a sawtooth, which is why it sounds buzzy.

Synthesizing these: figure out right harmonic amplitudes for active stops, spin up oscillators at those frequencies, add them. That's the engine.
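A minimal sketch of that engine, not the actual Organum code. The recipe amplitudes, the names (RECIPES, render_stop), and the 44.1 kHz sample rate are all assumptions for illustration.

```python
import numpy as np

SAMPLE_RATE = 44_100  # assumed; the post does not state the sample rate

# Hypothetical harmonic recipes: amplitude of harmonic 1, 2, 3, ...
# The numbers are illustrative, not measured from real pipes.
RECIPES = {
    "flute": [1.0, 0.05, 0.1, 0.02],
    "principal": [1.0, 0.5, 0.35, 0.25, 0.15, 0.1],
    "reed": [1.0 / k for k in range(1, 13)],  # 1/k falloff, sawtooth-like
}


def render_stop(stop, freq_hz, n_samples):
    """Sum sine oscillators at integer multiples of freq_hz."""
    t = np.arange(n_samples) / SAMPLE_RATE
    out = np.zeros(n_samples)
    for k, amp in enumerate(RECIPES[stop], start=1):
        if k * freq_hz >= SAMPLE_RATE / 2:  # skip harmonics above Nyquist
            break
        out += amp * np.sin(2.0 * np.pi * k * freq_hz * t)
    return out


block = render_stop("principal", 220.0, 4096)
```

Swapping "principal" for "reed" changes nothing but the amplitude list, which is the whole point: a stop is just data.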

Organum runs up to 18 oscillators per voice, two per drawbar position (9 positions plus a slightly detuned chorus partner for each). The chorus creates the beating pattern characteristic of string stops, adds subtle width to everything else. At 8 simultaneous voices that's potentially 144 oscillators. On my machine it takes 45ms to render a 4096-sample block. Budget is 93ms. Fine.
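To make the chorus idea concrete, here is a hedged sketch of one oscillator plus its detuned partner. The 5-cent detune and the function name chorus_pair are assumptions; the post doesn't say how far the partner is detuned.

```python
import numpy as np

SAMPLE_RATE = 44_100  # assumed

DRAWBARS = 9
OSCILLATORS_PER_VOICE = DRAWBARS * 2   # main + chorus partner per drawbar
MAX_VOICES = 8
MAX_OSCILLATORS = OSCILLATORS_PER_VOICE * MAX_VOICES  # 18 * 8


def chorus_pair(freq_hz, detune_cents, n_samples):
    """One oscillator plus a partner detuned by detune_cents.

    Returns the mixed signal and the beat rate in Hz, which is just
    the frequency difference between the two oscillators.
    """
    partner_hz = freq_hz * 2.0 ** (detune_cents / 1200.0)
    t = np.arange(n_samples) / SAMPLE_RATE
    signal = 0.5 * (np.sin(2.0 * np.pi * freq_hz * t)
                    + np.sin(2.0 * np.pi * partner_hz * t))
    return signal, abs(partner_hz - freq_hz)


signal, beat_hz = chorus_pair(440.0, 5.0, SAMPLE_RATE)
```

With these placeholder numbers the pair beats a little over once per second: slow enough to read as shimmer, not as out-of-tune.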

The envelope rabbit hole

ADSR envelopes seem boring until you try implementing them without clicks.

Linear attack has a discontinuous derivative at note onset. Slope goes from 0 to some positive value instantly, ear hears that as a click. Fix is a smoothstep curve, 3t² − 2t³, which has zero slope at both endpoints. Level eases in from silence with no abrupt slope change. No click.
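A small sketch of the smoothstep attack, assuming a normalized time t running from 0 to 1 over the attack. The function names are hypothetical.

```python
import numpy as np


def smoothstep(t):
    """3t^2 - 2t^3 on [0, 1]; slope is zero at both endpoints."""
    t = np.clip(t, 0.0, 1.0)
    return t * t * (3.0 - 2.0 * t)


def attack_curve(n_samples):
    """Envelope levels for an attack lasting n_samples."""
    return smoothstep(np.linspace(0.0, 1.0, n_samples))


curve = attack_curve(441)  # 10ms at an assumed 44.1 kHz
```

The first step of this curve is far smaller than the first step of a linear ramp of the same length, which is the "eases in from silence" part.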

More interesting problem: retrigger. What happens when you re-press a key already in release?

If you restart attack from 0, level jumps from wherever the release was down to 0, then attack begins. Click. I added a dedicated retrigger stage: 15ms crossfade using same smoothstep, going from current release level back to sustain. Level transition is smooth, slope is continuous. Voice doesn't restart, oscillator phases preserved, note just resurfaces from wherever it was.
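The retrigger stage could look roughly like this. It's a sketch under assumptions: the 44.1 kHz sample rate and the name retrigger_curve are not from the actual code.

```python
import numpy as np

SAMPLE_RATE = 44_100                             # assumed
RETRIGGER_SAMPLES = int(0.015 * SAMPLE_RATE)     # 15ms crossfade


def retrigger_curve(current_level, sustain_level, n_samples=RETRIGGER_SAMPLES):
    """Smoothstep from the current release level back up to sustain.

    Starts exactly at current_level, so there is no jump to zero.
    """
    t = np.linspace(0.0, 1.0, n_samples)
    s = t * t * (3.0 - 2.0 * t)
    return current_level + (sustain_level - current_level) * s


# Key re-pressed while the release has decayed to 0.3; sustain is 0.8.
curve = retrigger_curve(0.3, 0.8)
```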

Completely inaudible when it works. That's the goal.

The reverb

Dry pipe organ sounds like a keyboard in a recording booth. The room does huge perceptual work making it feel real.

Reverb is a stereo pair of Schroeder networks. Each channel runs pre-delay, then 7 parallel comb filters summed, then 4 allpass filters to smooth the tail. Left and right channels use different prime-offset delay times so reflections don't coincide. If they lined up, stereo image would collapse toward mono.
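A rough sketch of one such channel: 7 parallel feedback combs, then 4 allpasses in series. Delay lengths, gains, and the 23-sample right-channel offset are illustrative assumptions, not Organum's values. Pre-delay is left out for brevity, and the per-sample loops are for readability only; they're far too slow for a real audio callback.

```python
import numpy as np

# Hypothetical delay lengths in samples; not the post's actual values.
COMB_DELAYS = [1117, 1187, 1277, 1361, 1427, 1493, 1559]
ALLPASS_DELAYS = [223, 347, 443, 557]


def comb(x, delay, feedback):
    """Feedback comb: y[n] = x[n] + feedback * y[n - delay]."""
    y = np.array(x, dtype=float)
    for n in range(delay, len(y)):
        y[n] += feedback * y[n - delay]
    return y


def allpass(x, delay, gain):
    """Schroeder allpass: y[n] = -gain*x[n] + x[n-delay] + gain*y[n-delay]."""
    x = np.asarray(x, dtype=float)
    y = -gain * x
    for n in range(delay, len(y)):
        y[n] += x[n - delay] + gain * y[n - delay]
    return y


def reverb_channel(x, offset=0, feedback=0.84, allpass_gain=0.5):
    """Parallel combs averaged, then allpasses in series."""
    wet = sum(comb(x, d + offset, feedback) for d in COMB_DELAYS)
    wet = wet / len(COMB_DELAYS)
    for d in ALLPASS_DELAYS:
        wet = allpass(wet, d + offset, allpass_gain)
    return wet


impulse = np.zeros(8000)
impulse[0] = 1.0
left = reverb_channel(impulse, offset=0)
right = reverb_channel(impulse, offset=23)  # shifted delays decorrelate L/R
```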

Also three tuned sine oscillators simulating room resonance: standing waves that build up in large stone buildings at specific low frequencies. Without them, even a long reverb tail doesn't convey physical weight of a cathedral. With them, the Gothic Basilica preset is unplayable at fast tempos because decay from one chord bleeds into the next. Which is historically accurate. Fast repertoire in large cathedrals wasn't common for exactly this reason.

The gain problem

One note, gain is 1.0. Two simultaneous notes summed directly can peak at twice the amplitude. Full chord clips hard.

Fix is scaling output by 1/√N where N is active voices. Perceived loudness stays roughly constant as you add notes.

Naive implementation applies this as hard scalar at start of each buffer. Problem: going from 1 voice to 2 voices drops gain by about 29% (1.0 to 0.707) in a single sample. Audible. Every note added is a pop.

Actual fix is ramping gain linearly across entire buffer. Going from 1.0 to 0.707 over 4096 samples is completely inaudible. Ear can't track gradual change that slow. Going from 1.0 to 0.707 in one sample is immediately obvious. So mixer tracks previous gain value and interpolates to new target across every block.
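The ramp described above might be sketched like this. The class name GainRamp and its interface are assumptions, not the actual mixer.

```python
import numpy as np


class GainRamp:
    """Tracks the previous block's gain and interpolates to the new target."""

    def __init__(self):
        self.previous_gain = 1.0

    def apply(self, mixed, active_voices):
        # Target is 1/sqrt(N); treat zero voices as one to avoid dividing by 0.
        target = 1.0 / np.sqrt(max(active_voices, 1))
        ramp = np.linspace(self.previous_gain, target, len(mixed))
        self.previous_gain = target
        return mixed * ramp


mixer_gain = GainRamp()
out = mixer_gain.apply(np.ones(4096), active_voices=2)
```

Each sample here changes the gain by well under 0.0001, versus a 0.29 jump for the hard scalar.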

The real-time stuff

Audio callback has a hard deadline: produce 4096 samples every 93ms or sound card glitches. Everything organized around not missing that.
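The deadline arithmetic, spelled out. The 44.1 kHz sample rate is an assumption (it's the rate that makes 4096 samples come out at about 93ms); the 45ms render time is the figure quoted earlier.

```python
SAMPLE_RATE = 44_100   # assumed
BLOCK_SIZE = 4096
RENDER_MS = 45.0       # measured render time quoted in the post

budget_ms = 1000.0 * BLOCK_SIZE / SAMPLE_RATE   # time one block lasts
load = RENDER_MS / budget_ms                    # fraction of budget used
headroom_ms = budget_ms - RENDER_MS

print(f"budget {budget_ms:.1f}ms, load {load:.0%}, headroom {headroom_ms:.1f}ms")
```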

GUI thread never touches audio state directly. All communication goes through lock-free queue of event objects: note on/off, drawbar changes, stop toggles, room presets. Queue gets drained at start of each render.
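One way to sketch that handoff in Python is a collections.deque, whose append and popleft are thread-safe in CPython. This is a stand-in, not necessarily the queue Organum uses; Event and EventQueue are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str          # e.g. "note_on", "note_off", "drawbar", "room"
    value: float = 0.0


class EventQueue:
    """GUI thread pushes, audio thread drains. No explicit locks."""

    def __init__(self):
        self._events = deque()

    def push(self, event):
        # Called from the GUI thread.
        self._events.append(event)

    def drain(self):
        # Called at the start of each render on the audio thread.
        drained = []
        while True:
            try:
                drained.append(self._events.popleft())
            except IndexError:  # queue is empty
                return drained


queue = EventQueue()
queue.push(Event("note_on", 60))
queue.push(Event("note_off", 60))
pending = queue.drain()
```

Events come out in the order they went in, and the audio thread never waits on the GUI.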

Everything in audio path is vectorised numpy. No per-sample Python loops anywhere hot. Biggest performance difference was pre-allocating work buffers in the oscillator. Naive implementation was doing roughly 1,000 numpy array allocations per callback. Pre-allocated arrays reused each call dropped that to near-zero.
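A hedged sketch of the pre-allocation idea: buffers are created once in the constructor and every numpy call writes into them through out=, so rendering a block allocates no new arrays. The class and method names are assumptions, not the actual oscillator.

```python
import numpy as np


class Oscillator:
    def __init__(self, block_size, sample_rate=44_100):
        self.sample_rate = sample_rate  # 44.1 kHz is an assumed default
        self.phase = 0.0
        # Allocated once, reused on every call.
        self._ramp = np.arange(block_size, dtype=np.float64)
        self._work = np.empty(block_size, dtype=np.float64)

    def render_into(self, out, freq_hz, amp):
        """Add amp * sin(...) into out without allocating new arrays."""
        step = 2.0 * np.pi * freq_hz / self.sample_rate
        np.multiply(self._ramp, step, out=self._work)
        np.add(self._work, self.phase, out=self._work)
        np.sin(self._work, out=self._work)
        np.multiply(self._work, amp, out=self._work)
        np.add(out, self._work, out=out)
        # Carry phase into the next block so the waveform stays continuous.
        self.phase = (self.phase + step * len(out)) % (2.0 * np.pi)


osc = Oscillator(4096)
mix = np.zeros(4096)
osc.render_into(mix, 440.0, 0.5)
```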

Harmonic amplitudes are also computed once per block in mixer and shared across all voices, rather than each voice redundantly recalculating them.

What it actually sounds like

Good. Genuinely good. The chiff on reed stops has that breathy quality of real pipes speaking. Low chord through cathedral reverb hits with physical weight. Voix Humaine stop with tremulant active sounds like a bad impression of a human voice, which is exactly what the real pipe does too.

I keep loading Gothic Basilica preset and playing Bach chorales at 2am. Not ideal for anyone sleeping nearby. Very good for me personally.

Code is on GitHub. Run uv sync, then uv run python main.py. If you want to hear what this actually does, pull all stops with numpad 0, set room to Gothic Basilica, play something slow.


Built because Egypt has 6 pipe organs for 100 million people. Stayed because continuous derivatives and lock-free queues are more fun than expected.

Written by TheVibeish Editorial