Learn to Read
Speech-driven reading companion that highlights words in real time as you read aloud. Built as a lightweight PWA using the Web Speech API, designed to help young children follow along with stories and build reading confidence.

The Problem
Learning to read is one of the most important milestones in a child’s life, but the gap between listening to a story and reading independently is enormous. Parents reading with their children need a simple way to point to each word as it is read aloud - connecting the spoken word to the written one in real time. Existing reading apps are either expensive, bloated with gamification, or require accounts and subscriptions for what should be a straightforward interaction.
What I Built
A single-page progressive web app that listens via the microphone as a parent (or child) reads a story aloud, and highlights each word as it’s spoken. The effect is immediate and intuitive - words fade from grey to green as you read through the story, creating a visual trail of progress.
Real-Time Word Tracking
The app uses the browser’s native Web Speech API to capture interim speech recognition results and matches them against the known story text. A fuzzy matching algorithm tolerates mispronunciation, slight word skips, and recognition errors while maintaining strict sequential order - so common words like “the” don’t cause it to jump ahead.
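A minimal sketch of that capture-and-match loop, assuming hypothetical helpers (matchSpoken for the sequential matcher, advanceTo for the highlighting) rather than the app's actual code:

```javascript
// Sketch only: capture interim transcripts and feed them to a sequential matcher.
// matchSpoken() and advanceTo() are assumed helpers, not the app's real internals.
const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
const recognition = new SpeechRecognition();
recognition.continuous = true;       // keep listening across pauses in the story
recognition.interimResults = true;   // emit partial transcripts while the reader speaks

recognition.onresult = (event) => {
  // Take the transcript of the most recent (possibly still interim) result.
  const latest = event.results[event.results.length - 1][0].transcript;
  const spokenWords = latest
    .toLowerCase()
    .replace(/[^a-z'\s]/g, ' ')   // strip punctuation before comparison
    .split(/\s+/)
    .filter(Boolean);
  // The matcher decides how far into the story has been read; highlight up to there.
  advanceTo(matchSpoken(spokenWords));
};

recognition.start();
```

Interim transcripts are revised as recognition firms up, which is one reason the matcher has to tolerate repeated words as well as small skips.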
Minimal, Child-Friendly Design
Warm cream background, large serif text, generous line spacing. No accounts, no ads, no distractions. Tap start, read aloud, watch words light up.
Tech Stack
Vanilla JavaScript, Web Speech API, CSS transitions, PWA manifest. No frameworks, no build step, no API keys. Deployed as static files to Cloudflare Pages.
Lessons Learned
Browser Speech APIs Are Inconsistent: The Web Speech API works reliably in Safari but hits network errors in Chrome on macOS. The recognition service sends audio to remote servers, and connection problems fail silently. Lesson: for a production version, a dedicated speech API like Deepgram would provide consistent cross-browser behaviour.
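A rough sketch of surfacing those failures instead of letting them pass silently - showBanner and isReading are placeholder names, not the app's actual code:

```javascript
// Sketch only: report recognition errors and restart after browser-initiated stops.
recognition.onerror = (event) => {
  if (event.error === 'network') {
    // Chrome streams audio to a remote recognition service; if the connection
    // drops, results simply stop arriving unless the error is surfaced.
    showBanner('Speech recognition lost its connection - try reloading, or use Safari.');
  }
};

recognition.onend = () => {
  // Some browsers end recognition after a stretch of silence;
  // restart it while a reading session is still active.
  if (isReading) recognition.start();
};
```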
Sequential Matching Beats Greedy Matching: Early versions used a lookahead window that matched spoken words against any upcoming story word. Common words like “the” appeared multiple times, causing the tracker to jump ahead. Switching to strict sequential matching with a small skip tolerance (for misheard words) solved it completely. Lesson: constrain the algorithm to match the user’s mental model - reading is linear.
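A simplified version of that matcher (the skip tolerance and names are illustrative, carried over from the earlier sketch):

```javascript
// Sketch only: strict sequential matching with a small skip tolerance.
// storyWords is assumed to hold the story's normalised words in order.
const SKIP_TOLERANCE = 2;   // illustrative value, not the tuned one
let nextIndex = 0;          // index of the first word not yet read

function matchSpoken(spokenWords) {
  for (const spoken of spokenWords) {
    // Only look a couple of words past the current position, so a later
    // repeat of a common word like "the" can't drag the tracker forward.
    const windowEnd = Math.min(nextIndex + SKIP_TOLERANCE + 1, storyWords.length);
    for (let i = nextIndex; i < windowEnd; i++) {
      if (storyWords[i] === spoken) {
        nextIndex = i + 1;   // everything up to the match counts as read
        break;
      }
    }
  }
  return nextIndex;
}
```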
Speech Recognition Latency Is The UX Bottleneck: The Web Speech API buffers 300-500ms of audio before returning interim results. For a read-along app, this lag is noticeable. The fix isn’t faster recognition - it’s designing the visual feedback to feel smooth despite the delay. CSS transitions on the word highlights mask the latency effectively.
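As a sketch of that masking, the highlight itself can stay a plain class toggle, with an assumed CSS rule on each word span (something like transition: color 0.4s ease) carrying the grey-to-green fade:

```javascript
// Sketch only: the story is rendered as one <span class="word"> per word,
// and highlighting is a class swap; the CSS transition does the smoothing.
function advanceTo(index) {
  document.querySelectorAll('.word').forEach((span, i) => {
    // Words before the matched index fade from grey to green via the transition.
    span.classList.toggle('read', i < index);
  });
}
```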