Ed Dowding · Portfolio · 2 min read

Learn to Read

Speech-driven reading companion that highlights words in real-time as you read aloud. Built as a lightweight PWA using the Web Speech API, designed to help young children follow along with stories and build reading confidence.

The Problem

Learning to read is one of the most important milestones in a child’s life, but the gap between listening to a story and reading independently is enormous. Parents reading with their children need a simple way to point out each word as it is said - connecting the spoken word to the written one in real time. Existing reading apps are expensive, bloated with gamification, or locked behind accounts and subscriptions for what should be a straightforward interaction.

What I Built

A single-page progressive web app that listens via the microphone as a parent (or child) reads a story aloud, and highlights each word as it’s spoken. The effect is immediate and intuitive - words fade from grey to green as you read through the story, creating a visual trail of progress.

Real-Time Word Tracking

The app uses the browser’s native Web Speech API to capture interim speech recognition results and matches them against the known story text. A fuzzy matching algorithm tolerates mispronunciation, occasional skipped words, and recognition errors while keeping strict sequential order - so common words like “the” don’t cause it to jump ahead.
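
As a rough illustration of what a fuzzy word comparison could look like, here is a minimal sketch. It is not the app's actual code: the function names (`normalizeWord`, `editDistance`, `isFuzzyMatch`) and the tolerance thresholds are assumptions chosen for the example. It normalises both words, then accepts a match when the Levenshtein edit distance is small relative to the length of the expected word.

```javascript
// Hypothetical helpers: names and thresholds are illustrative assumptions.

// Lowercase and strip punctuation so "Dog," and "dog" compare equal.
function normalizeWord(word) {
  return word.toLowerCase().replace(/[^\p{L}\p{N}]/gu, "");
}

// Classic Levenshtein distance, keeping only one previous row in memory.
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Decide whether a recognised word is close enough to the expected story word.
function isFuzzyMatch(spoken, expected) {
  const s = normalizeWord(spoken);
  const e = normalizeWord(expected);
  if (s === "" || e === "") return false;
  if (s === e) return true;
  // Assumed tolerance: short words must match exactly, medium words may
  // differ by one edit, words of eight letters or more by two edits.
  const allowed = e.length < 4 ? 0 : e.length < 8 ? 1 : 2;
  return editDistance(s, e) <= allowed;
}
```

With these thresholds, `isFuzzyMatch("elefant", "elephant")` returns `true`, while `isFuzzyMatch("cat", "cot")` returns `false`, because very short words leave too little room to tell a mishearing from a different word.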

Minimal, Child-Friendly Design

Warm cream background, large serif text, generous line spacing. No accounts, no ads, no distractions. Tap start, read aloud, watch words light up.

Tech Stack

Vanilla JavaScript, Web Speech API, CSS transitions, PWA manifest. No frameworks, no build step, no API keys. Deployed as static files to Cloudflare Pages.

Lessons Learned

Browser Speech APIs Are Inconsistent: The Web Speech API works reliably in Safari but hits network errors in Chrome on macOS. The recognition service sends audio to remote servers, and connection issues fail silently. Lesson: for a production version, a dedicated speech API like Deepgram should provide more consistent cross-browser behaviour.
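
One way to make such failures visible is to handle the recognition object's `error` event alongside its results. The sketch below is an assumption about how that wiring might look, not the app's code: `wireRecognition` and the callback names `onTranscript` and `onProblem` are hypothetical, while `continuous`, `interimResults`, `onresult`, `onerror`, and `start()` are standard members of the Web Speech API's `SpeechRecognition` interface.

```javascript
// Sketch only: `wireRecognition` and its callback names are hypothetical.
// In a browser, `recognition` would be created with something like:
//   new (window.SpeechRecognition || window.webkitSpeechRecognition)()
function wireRecognition(recognition, { onTranscript, onProblem }) {
  recognition.continuous = true;     // keep listening across pauses
  recognition.interimResults = true; // deliver partial results early

  recognition.onresult = (event) => {
    // Only results from event.resultIndex onward changed in this event.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      onTranscript(result[0].transcript, result.isFinal);
    }
  };

  recognition.onerror = (event) => {
    // event.error is a short code such as "network" or "not-allowed".
    onProblem(event.error);
  };

  recognition.start();
}
```

Passing the recognition object in as a parameter keeps the wiring testable with a stand-in object, and gives the page a single place to show a message such as "connection lost" when `onProblem` receives `"network"`.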

Sequential Matching Beats Greedy Matching: Early versions used a lookahead window that matched spoken words against any upcoming story word. Common words like “the” appeared multiple times, causing the tracker to jump ahead. Switching to strict sequential matching with a small skip tolerance (for misheard words) solved it completely. Lesson: constrain the algorithm to match the user’s mental model - reading is linear.
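
The idea of strict sequential matching with a small skip tolerance can be sketched as follows. This is an illustrative reconstruction under assumptions, not the app's implementation: `advanceCursor`, `normalize`, and the default `maxSkip` of 1 are hypothetical, exact comparison stands in for the fuzzy match, and the caller is assumed to pass only words it has not processed before.

```javascript
// Hypothetical sketch: names and the default skip tolerance are assumptions.

// Lowercase and strip punctuation before comparing words.
function normalize(word) {
  return word.toLowerCase().replace(/[^\p{L}\p{N}]/gu, "");
}

// Advance a cursor through storyWords using newly spoken words.
// Each spoken word is compared only with the word at the cursor and the
// next `maxSkip` words, so a distant "the" cannot pull the cursor forward.
function advanceCursor(storyWords, cursor, spokenWords, maxSkip = 1) {
  for (const spoken of spokenWords) {
    const target = normalize(spoken);
    if (target === "") continue;
    const limit = Math.min(cursor + maxSkip, storyWords.length - 1);
    for (let i = cursor; i <= limit; i++) {
      if (normalize(storyWords[i]) === target) {
        cursor = i + 1; // every word before the cursor counts as read
        break;
      }
    }
    // No match within the window: ignore the word and keep the cursor.
  }
  return cursor;
}
```

For the story `"The cat sat on the mat".split(" ")`, calling `advanceCursor(story, 0, ["the", "cap", "sat"])` returns `3`: the misheard "cap" is ignored and "sat" is reached by skipping one word. Calling `advanceCursor(story, 2, ["the"])` returns `2`, because the second "the" sits outside the window - the case where a wider lookahead would have jumped ahead.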

Speech Recognition Latency Is The UX Bottleneck: The Web Speech API buffers 300-500 ms of audio before returning interim results. For a read-along app, this lag is noticeable. The fix isn’t faster recognition - it’s designing the visual feedback to feel smooth despite the delay. CSS transitions on the word highlights mask the latency effectively.
