Jim Gilsinan IV - Senior Thesis

Yong Jiu Fa Yin: A Simple Mandarin Chinese Tone Recognizer

Advisor: Michael S. Brandstein

Thesis Final Draft

Pitch Tracking

Pitch (fundamental frequency) in the samples below was tracked with the "Speech Filing System" for Win32 from the University College London Department of Phonetics and Linguistics. The sample image directly below shows the four tones of Mandarin (marked 1, 2, 3, and 4) in both waveform and pitch-track view.

er


Code

A simple tone estimator that compares a specified pitch-track file containing any number of syllables with a specified tone pitch template file containing the four tones in order. It normalizes each syllable's pitch track around the mean pitch, expands the tracks to a standard length with linear interpolation, and then compares each pitch track with the template, returning the best match using the specified statistical regression and the specified number of cepstral regions. Currently implemented are mean-squared distance analysis (no cepstral regions) and best-fit linear analysis.
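The pipeline described above (normalize around the mean, stretch to a standard length, compare against each template) can be sketched roughly as follows. This is an illustrative sketch only, not the thesis code: the function names, the standard length of 50 points, the choice to normalize by subtracting the mean (rather than dividing by it), and the hand-made template contours are all assumptions made for the example. Only the mean-squared distance comparison is shown.

```python
STANDARD_LENGTH = 50  # assumed number of points per normalized track


def resample_linear(values, length):
    """Stretch or shrink a sequence to `length` points by linear interpolation."""
    if len(values) == 1:
        return [float(values[0])] * length
    scale = (len(values) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * scale                      # fractional index into the input
        lo = min(int(pos), len(values) - 1)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1.0 - frac) + values[hi] * frac)
    return out


def normalize_track(pitch_hz, length=STANDARD_LENGTH):
    """Center a syllable's pitch track on its mean, then resample it."""
    mean = sum(pitch_hz) / len(pitch_hz)
    centered = [p - mean for p in pitch_hz]  # assumption: subtract the mean
    return resample_linear(centered, length)


def mean_squared_distance(a, b):
    """Mean of squared point-wise differences between equal-length tracks."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def best_tone(track, templates):
    """Return (tone number, distance) for the closest-matching template."""
    normalized = normalize_track(track)
    distances = {
        tone: mean_squared_distance(normalized, normalize_track(contour))
        for tone, contour in templates.items()
    }
    tone = min(distances, key=distances.get)
    return tone, distances[tone]


# Hypothetical template contours in Hz, one per tone, for illustration only.
templates = {
    1: [220, 220, 220, 220, 220],  # first tone: high, level
    2: [180, 190, 205, 220, 240],  # second tone: rising
    3: [200, 180, 170, 185, 210],  # third tone: falling then rising
    4: [250, 230, 205, 180, 160],  # fourth tone: falling
}

rising_track = [150, 155, 162, 170, 181, 190, 200]  # made-up test syllable
tone, distance = best_tone(rising_track, templates)
print("best match: tone", tone)
```

Because both the test track and the templates are centered on their own means, a speaker with a lower overall pitch can still match a template recorded at a higher pitch; only the shape of the contour is compared.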

Make: make, make make_template
Tone Usage: ./tone {test pitch track file} {template pitch track file} {regression} {cepstral regions per syllable}
Regressions:

Template Usage: ./make_template {output file} {sample file} {sample file} [{sample file} ...]

Combined template files


Samples from the Oregon Graduate Institute Multi Language Telephone Speech Corpus

Samples, Male

Provided are wav files of all speech samples. These wav files are modified versions of the original wav files compiled from sessions with native Mandarin-speaking volunteers. Specifically, the noise level has been reduced and some of the more egregious exhales at the ends of syllables have been cut out to increase pitch tracker clarity. All of this was done with Cool Edit Pro.

Also provided are selected waveform and pitch-track jpeg files, as explained above.

Depending on your browser, the wav files might require a right click / save as to play.

Zhongjue Chen

Syllables

Files contain four of the same syllable with each of the tones: First Tone, High; Second Tone, Rising; Third Tone, Falling then Rising; Fourth Tone, Falling.

Phrases

Samples, Female

Tsiyun (Cherry) Fu

Phrases

Copyright © 2000-2001 Jim Gilsinan IV ( gilsinan@post.harvard.edu )
Last Update 30 March 2001