Speech-Transcript Alignment
A method for aligning a transcript to recorded speech is described, motivated by the need to align audiobooks for use by SLA software. A combination of text-to-speech synthesis, mel-spectrum feature extraction, and dynamic time warping is employed to obtain a word-level alignment for the input speech sample.
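The dynamic time warping step can be illustrated with a minimal sketch. It is not the implementation described here, only one plausible reading of the pipeline under stated assumptions: both the synthesized and the recorded speech have already been reduced to per-frame feature vectors (toy one-dimensional values stand in for mel-spectrum features), frames are compared by Euclidean distance, and the text-to-speech stage is assumed to report the frame at which each word starts in the synthesized audio. The names `dtw_path` and `map_word_starts` are hypothetical.

```python
import math


def dtw_path(ref, rec):
    """Align two sequences of feature vectors with classic DTW.

    Returns (total_cost, path), where path is a list of (ref_frame, rec_frame)
    index pairs running from the first frames to the last frames.
    """
    n, m = len(ref), len(rec)
    # cost[i][j] = cheapest cumulative cost of aligning ref[:i] with rec[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(ref[i - 1], rec[j - 1])  # Euclidean distance (assumed)
            cost[i][j] = d + min(cost[i - 1][j - 1],  # advance both sequences
                                 cost[i - 1][j],      # advance ref only
                                 cost[i][j - 1])      # advance rec only

    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


def map_word_starts(path, ref_word_starts):
    """Transfer word start frames from the reference to the recording.

    ref_word_starts is a list of (word, ref_frame) pairs, assumed to come
    from the text-to-speech stage. Each word is mapped to the first
    recorded frame that the warping path pairs with its reference frame.
    """
    first_match = {}
    for ref_frame, rec_frame in path:
        first_match.setdefault(ref_frame, rec_frame)
    return [(word, first_match[frame]) for word, frame in ref_word_starts]


# Toy data: the recording says the same two "words" more slowly.
synth = [[0.0], [0.0], [1.0], [1.0]]
recorded = [[0.0], [0.0], [0.0], [1.0], [1.0], [1.0]]
total, path = dtw_path(synth, recorded)
word_starts = map_word_starts(path, [("hello", 0), ("world", 2)])
print(word_starts)  # [('hello', 0), ('world', 3)]
```

In this toy case the second word starts at frame 2 of the synthesized signal and is mapped to frame 3 of the slower recording. Multiplying the frame index by the frame hop duration would give a timestamp. The full table is quadratic in the number of frames, so a long recording would likely need a windowed or segment-wise variant.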