Live Waveform — 48 kHz input signal
Vowel Targeting Matrix — F1 / F2 acoustic space
Each circle is a canonical American English vowel (Hillenbrand et al.,
Journal of the Acoustical Society of America, 1995).
The cyan dot is your live F1/F2 position.
The lime arrow shows the direction and size of the correction Goeckoh's engine
applies: it moves your formants part of the way toward the nearest
canonical vowel target.
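The arrow described above can be sketched in a few lines. This is a minimal illustration, not Goeckoh's engine: the function names, the `fraction` parameter, and the four target values (rounded, illustrative numbers rather than the published Hillenbrand table) are all assumptions made for the example.

```python
import math

# Illustrative (F1, F2) targets in Hz. These are rounded placeholder values,
# NOT the published Hillenbrand table; substitute real data before use.
VOWEL_TARGETS = {
    "i": (340.0, 2320.0),   # as in "heed"
    "ae": (590.0, 1950.0),  # as in "had"
    "a": (770.0, 1330.0),   # as in "hod"
    "u": (380.0, 1000.0),   # as in "who'd"
}


def nearest_vowel(f1, f2, targets=VOWEL_TARGETS):
    """Return the label of the target closest to (f1, f2), measured in Hz."""
    return min(targets, key=lambda v: math.dist((f1, f2), targets[v]))


def correction_arrow(f1, f2, fraction=0.5, targets=VOWEL_TARGETS):
    """Move (f1, f2) a given fraction of the way toward the nearest target.

    fraction=0.0 leaves the formants unchanged; fraction=1.0 lands on the target.
    Returns (vowel_label, corrected_f1, corrected_f2).
    """
    vowel = nearest_vowel(f1, f2, targets)
    t1, t2 = targets[vowel]
    return vowel, f1 + fraction * (t1 - f1), f2 + fraction * (t2 - f2)


print(correction_arrow(400.0, 2200.0, fraction=0.5))
```

Distance is taken directly in Hz here for simplicity; a perceptual scale such as Bark would weight F1 and F2 differences more evenly.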
Acoustic Metrics — live readout
Frequency Spectrum — formant markers
Corollary Discharge Mechanism
The 200 ms Prediction Window
Every time your motor cortex issues a speech command, it simultaneously sends an
efference copy (a prediction of how the resulting speech should sound)
to your auditory cortex. The auditory cortex has roughly 200 milliseconds
to compare what it actually hears against that prediction. If the heard sound arrives within
this window and matches the target, the motor program is reinforced. If the prediction fails
or the signal arrives late, the learning loop does not close.
In ASD and dysarthria, a child produces a distorted vowel, the brain predicts a distorted
vowel, and the two match, so the broken pattern is reinforced every time. Goeckoh breaks
this cycle by delivering a corrected signal within the window.
0 ms: Motor command (you intend to speak)
<50 ms: Goeckoh delivers corrected signal to earbuds*
~200 ms: Prediction window closes; brain finalises motor learning
250 ms: Learning window closed; no motor update possible
* <50 ms refers to the native on-device processing latency of Goeckoh's
DSP engine (Rust-compiled, running directly on the device's audio hardware). This demo shows
analysis only; the correction is delivered through the full application running natively.
Total perceived latency also includes Bluetooth earbud round-trip, which varies by hardware
(typically 40–150 ms). With low-latency earbuds, the correction arrives well within the 200 ms
prediction window; at the upper end of the Bluetooth range, 50 ms of processing plus 150 ms of
transport reaches the edge of the window, so earbud choice matters.
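The latency budget in this footnote is simple addition, and a short sketch makes the margins explicit. The function names are illustrative, and 50 ms is used as the worst-case processing figure quoted above.

```python
PREDICTION_WINDOW_MS = 200.0  # window length quoted in the text


def total_latency_ms(processing_ms, bluetooth_ms):
    """End-to-end delay: on-device processing plus Bluetooth round-trip."""
    return processing_ms + bluetooth_ms


def margin_ms(processing_ms, bluetooth_ms, window_ms=PREDICTION_WINDOW_MS):
    """Time left before the window closes; zero or negative means no headroom."""
    return window_ms - total_latency_ms(processing_ms, bluetooth_ms)


# Worst-case processing (50 ms) against both ends of the Bluetooth range.
for bluetooth in (40.0, 150.0):
    print(f"Bluetooth {bluetooth:.0f} ms -> margin {margin_ms(50.0, bluetooth):.0f} ms")
```

With 50 ms of processing, the margin runs from 110 ms on the fastest earbuds in the quoted range down to 0 ms on the slowest.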
N1 Suppression: When the correction arrives on time, the auditory cortex
suppresses the N1 evoked potential — an EEG marker indicating the brain accepted the corrected
voice as self-produced. This is the measurable neural signature of the corollary discharge
loop closing correctly. (Niziolek, Nagarajan, Houde,
Journal of Neuroscience, 2013.)
The Neuroscience Behind the Correction
The Inner Dialogue Problem
Before you say a single word, your brain has already heard it.
The moment your motor cortex fires the command to speak, it simultaneously sends a copy of
that command — called an efference copy — to your auditory cortex. Your
auditory cortex uses that copy to generate a precise prediction of what the incoming sound
should be. This predicted signal is what neuroscientists believe underlies your
inner voice — the voice you "hear" in your head when you think in words.
When real sound arrives 100–200 ms later, the brain compares it against the prediction.
If they match: the sound is tagged as self-produced and the comparison signal
(the N1 wave) is suppressed — the brain essentially says "I expected this, move on."
If they don't match: the comparison fires hard, generating a motor-learning error signal
that drives your articulators to self-correct.
This is the loop Goeckoh targets. In children with ASD, dysarthria, or
apraxia, the distorted output is accurately predicted because the brain has
learned to expect the distorted version. N1 is suppressed. No error signal fires.
No learning happens. The child practices the wrong pattern thousands of times.
Goeckoh delivers a corrected signal within the brain's own comparison window — creating a
deliberate, therapeutic mismatch. The brain hears what the word should sound like
at the exact moment it is comparing prediction to reality. This forces the motor learning
loop open again.
The corollary discharge loop — step by step
1. Motor cortex fires
You intend to say "cup." Motor cortex issues a movement sequence to your lips, tongue, jaw, and larynx: the precise articulatory commands for that word.
2. Efference copy dispatched
Simultaneously, a copy of the motor command (the efference copy) is forwarded to the cerebellum and then to auditory cortex. This is the same signal that generates your inner voice.
3. Auditory cortex generates prediction
Using the efference copy, auditory cortex generates a forward model: a prediction of exactly what acoustic signal should arrive when the articulators execute those commands.
4. Real sound arrives (~100–200 ms)
The actual speech signal arrives at the cochlea. Auditory cortex receives it and subtracts the prediction from the incoming signal.
── comparison window ──
5a. Match → N1 suppressed (healthy speech)
Prediction minus reality ≈ 0. The auditory N1 ERP is suppressed. The brain labels the sound "self-generated, expected" and discards it. The word is produced normally.
5b. Mismatch → N1 fires → motor learning (Goeckoh's target)
Prediction minus reality ≠ 0. N1 fires strongly. The cerebellum generates a correction signal. Motor cortex updates its forward model for the next attempt. This is how humans learn to speak correctly, and it is what is broken in dysarthria.
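The comparison in steps 4 through 5b can be illustrated with a toy model. This is a schematic of the "prediction minus reality" logic only, not a neural simulation or Goeckoh code; the tolerance, the learning rate, and the formant values are arbitrary assumptions.

```python
import math


def compare(predicted, heard, tolerance_hz=50.0):
    """Subtract the predicted (F1, F2) from the heard (F1, F2).

    Returns (is_match, error). A match stands in for step 5a (N1 suppressed),
    a mismatch for step 5b (error signal generated).
    """
    error = (heard[0] - predicted[0], heard[1] - predicted[1])
    return math.hypot(*error) <= tolerance_hz, error


def update_prediction(predicted, error, rate=0.5):
    """Step 5b: nudge the forward model toward what was actually heard."""
    return (predicted[0] + rate * error[0], predicted[1] + rate * error[1])


habitual = (450.0, 1800.0)   # hypothetical distorted vowel the brain expects
corrected = (350.0, 2300.0)  # hypothetical corrected feedback

# Hearing the habitual vowel: prediction and reality agree, nothing changes.
print(compare(habitual, habitual))

# Hearing corrected feedback: a mismatch, so the forward model is updated.
is_match, error = compare(habitual, corrected)
print(is_match, update_prediction(habitual, error))
```

The first call shows why an accurately predicted distortion produces no learning signal; the second shows how altered feedback reintroduces an error to learn from.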
The 200 ms window: Goeckoh's entire correction pipeline (LPC-12 analysis,
Bark-space formant mapping, Hillenbrand target interpolation, and LPC resynthesis) runs in
under 180 ms on the device. That latency budget is not arbitrary: it is
engineered to deliver the corrected signal inside the auditory-motor comparison window, so
the brain receives the corrected token as if it were its own natural output.
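To make "Bark-space formant mapping" and "target interpolation" concrete, here is a minimal sketch of one way to do it. The page does not say which Bark formula Goeckoh uses; the Traunmüller (1990) approximation below is an assumption, and the function names are illustrative.

```python
def hz_to_bark(f_hz):
    """Traunmüller (1990) approximation of the Bark scale (assumed, see lead-in)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_to_hz(z):
    """Algebraic inverse of hz_to_bark."""
    return 1960.0 * (z + 0.53) / (26.28 - z)


def interpolate_in_bark(f_hz, target_hz, fraction):
    """Move a formant a fraction of the way to its target, measured in Bark.

    Interpolating in Bark rather than Hz makes equal fractions correspond to
    roughly equal perceptual steps.
    """
    z = hz_to_bark(f_hz)
    z_target = hz_to_bark(target_hz)
    return bark_to_hz(z + fraction * (z_target - z))


# Halfway between 500 Hz and 700 Hz on the Bark scale lies slightly below
# the 600 Hz arithmetic midpoint, because the scale compresses higher frequencies.
print(interpolate_in_bark(500.0, 700.0, 0.5))
```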
N1 Suppression — The Neural Fingerprint
The N1 is an EEG event-related potential (ERP) component that fires ~100 ms after a
sound reaches the ear. In neurotypical speakers, N1 is
suppressed by ~50% for self-produced speech — the brain has already
accounted for it via the efference copy.
In many neurodiverse profiles this suppression is attenuated or absent,
disrupting the brain's ability to evaluate its own output. But the more clinically
significant problem is the inverse: in children who have practiced distorted speech
patterns for years, N1 is suppressed — the brain accurately predicts the
distortion and stops treating it as an error.
Goeckoh creates a controlled, targeted mismatch. N1 fires. The error signal is real.
Motor learning resumes.
Niziolek, Nagarajan & Houde, J. Neurosci. 2013 · PNAS 2024
DIVA Model — Two Feedback Loops
The Directions Into Velocities of Articulators (DIVA) model
(Guenther, Boston University) describes speech motor control as two interacting systems:
Feedforward: Motor programs stored in premotor cortex that execute
familiar words from memory, without monitoring.
Feedback: Auditory and somatosensory loops that continuously compare
output against targets and issue corrections when they diverge.
In dysarthria, feedforward programs are stored incorrectly, and the feedback loop
cannot override them fast enough. Goeckoh operates entirely within the
feedback integration window. It doesn't change the feedforward program directly;
it changes what the feedback loop hears, prompting the system to re-derive the correct
feedforward program through repeated practice.
Guenther et al., Neural Computation 2006 · Tourville & Guenther, Lang. Cogn. Proc. 2011
Live Acoustic Measurements — what each signal means
F0 — Fundamental Frequency (Pitch)
The rate at which your vocal folds open and snap shut. Each full open-and-close cycle
is one period of the glottal wave. An adult male's folds close about 116 times per second, an
adult female's about 205 times, and a child's about 223 times. That repetition rate is what you hear as pitch.
Goeckoh measures F0 by autocorrelation in the 80–500 Hz range and
preserves it exactly. The correction never moves pitch — your speaker identity and
emotional tone live in F0, and Goeckoh does not touch them.
Adult male: 93–135 Hz · Adult female: 162–238 Hz · Child: ~223 Hz
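Autocorrelation pitch tracking over the 80–500 Hz range can be sketched as follows. This is a bare-bones illustration under stated assumptions (a single voiced frame, no windowing, no voicing decision, no peak interpolation), not the production estimator; `estimate_f0` and its parameters are names chosen for the example.

```python
import math


def estimate_f0(frame, sample_rate=48000, f_min=80.0, f_max=500.0):
    """Estimate pitch as the lag that maximizes the frame's autocorrelation.

    Only lags whose period falls between f_max and f_min are searched.
    Every lag is scored over the same number of samples, so longer lags
    are not penalized for having fewer terms.
    """
    min_lag = int(sample_rate / f_max)  # 96 samples at 48 kHz
    max_lag = int(sample_rate / f_min)  # 600 samples at 48 kHz
    window = len(frame) - max_lag
    if window <= 0:
        raise ValueError("frame is too short to cover the lowest pitch")

    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[n] * frame[n + lag] for n in range(window))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# A synthetic 150 Hz tone sampled at 48 kHz should come out close to 150 Hz.
tone = [math.sin(2 * math.pi * 150.0 * n / 48000) for n in range(2048)]
print(round(estimate_f0(tone), 1))
```

Real speech would also need a voiced/unvoiced check and protection against octave errors, both of which this sketch omits.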
F1 — First Formant (Tongue Height)
The lowest resonance peak of your vocal tract. F1 is shaped by jaw openness
and tongue height. Drop your jaw — F1 rises. Raise your tongue toward the
palate — F1 falls. It's the acoustic signature of how open or closed your mouth is.
In ASD and dysarthria, F1 frequently undershoots because jaw excursion is reduced —
articulatory undershoot. The vowel space collapses toward center.
F1 correction is the single highest-impact intervention Goeckoh makes.
Typical range across vowels: 300–900 Hz
F2 — Second Formant (Tongue Advancement)
F2 encodes tongue front/back position. Push your tongue forward and F2
rises — this is how front vowels like /i/ get their "bright" quality. Pull it back and
F2 falls — the rounded /u/ of "boot."
In dysarthria, back vowels are the hardest: lip rounding and tongue
retraction are both reduced at once, pulling F2 toward front-vowel territory. The acoustic
contrast between front and back vowels disappears. This is measurable in milliseconds and
correctable in real time.
Typical range across vowels: 800–2500 Hz
HNR — Harmonics-to-Noise Ratio
HNR is the ratio of periodic harmonic energy (clean vocal-fold
vibration) to aperiodic turbulent noise (breathiness, roughness).
A high HNR means your vocal folds are closing cleanly on every cycle. A low HNR
indicates irregular contact, often a sign of vocal fatigue, pathology, or a state of
high emotional stress.
Goeckoh's CrystallineHeart engine monitors HNR as part of neurological state
estimation — sustained low HNR is one marker for OVERLOAD state, which gates
correction intensity down to prevent sensory fatigue.
Healthy adult on sustained vowel: ≥11 dB · Pathological threshold: ≤7.4 dB
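One common way to turn the harmonic-to-noise idea into a number uses the normalized autocorrelation peak r at the pitch period: HNR = 10·log10(r / (1 − r)). The sketch below applies that formula and the two thresholds quoted above. Whether Goeckoh computes HNR this way is not stated on the page, so treat the method as an assumption; the "borderline" label for values between the thresholds is also introduced here for illustration only.

```python
import math

HEALTHY_MIN_DB = 11.0       # threshold quoted in the text
PATHOLOGICAL_MAX_DB = 7.4   # threshold quoted in the text


def hnr_db(r_peak):
    """HNR in dB from the normalized autocorrelation peak (0 < r_peak < 1).

    r_peak is the fraction of signal energy that repeats from one pitch
    period to the next; the remainder is treated as noise.
    """
    if not 0.0 < r_peak < 1.0:
        raise ValueError("r_peak must lie strictly between 0 and 1")
    return 10.0 * math.log10(r_peak / (1.0 - r_peak))


def classify_hnr(hnr):
    """Label an HNR value using the two thresholds from the text."""
    if hnr >= HEALTHY_MIN_DB:
        return "healthy"
    if hnr <= PATHOLOGICAL_MAX_DB:
        return "pathological"
    return "borderline"  # hypothetical label for the gap between thresholds


for r in (0.99, 0.90, 0.80):
    value = hnr_db(r)
    print(f"r={r:.2f} -> {value:.1f} dB ({classify_hnr(value)})")
```

A peak of 0.5 means harmonic and noise energy are equal, which gives 0 dB.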
Ready to hear the difference?
What you just ran in your browser is a window into the same DSP engine running natively in
Goeckoh — at 10× the resolution, with real-time formant correction delivered to the ear.
Get Goeckoh — $20/month
Full application · Android, iOS, macOS, Windows, Linux