One-second acoustic samples were extracted from the mid-portion of sustained /a/ vowels produced by 50 black and 50 white adult males. Each vowel sample from a black subject was randomly paired with a sample from a white subject. From the tape-recorded samples alone, both expert and naive listeners could determine the race of the speaker with 60% accuracy. The accuracy of race identification was independent of the listener's own race, sex, or listening experience. An acoustic analysis of the samples revealed that, although the measured values fell within ranges reported by previous studies of normal voices, the black speakers had greater frequency perturbation, significantly greater amplitude perturbation, and a significantly lower harmonics-to-noise ratio than did the white speakers. The listeners were most successful in distinguishing voice pairs when the differences in vocal perturbation and additive noise were greatest and were least successful when such differences were minimal or absent. Because there were no significant differences in the mean fundamental frequency or formant structure of the voice samples, it is likely that the listeners relied on differences in spectral noise to discriminate between the black and white speakers.
KEY WORDS: Jitter, shimmer, harmonics-to-noise ratio, vocal quality, voice perception
Submitted on April 1, 1993
Accepted on January 4, 1994
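The abstract names three acoustic measures (frequency perturbation or jitter, amplitude perturbation or shimmer, and the harmonics-to-noise ratio) without stating the formulas used. As an illustration only, the sketch below uses commonly cited cycle-to-cycle definitions, which may differ from those applied in the study. It assumes that per-cycle periods, per-cycle peak amplitudes, and the harmonic and noise energies have already been extracted from the vowel sample. All function names are hypothetical.

```python
import math


def relative_perturbation(values):
    """Mean absolute difference between consecutive cycles, divided by the mean value.

    Assumption: this is the common "local" perturbation definition,
    not necessarily the measure used in the study.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def jitter_percent(periods_s):
    """Frequency perturbation: cycle-to-cycle variation in period, in percent."""
    return 100.0 * relative_perturbation(periods_s)


def shimmer_percent(peak_amplitudes):
    """Amplitude perturbation: cycle-to-cycle variation in peak amplitude, in percent."""
    return 100.0 * relative_perturbation(peak_amplitudes)


def hnr_db(harmonic_energy, noise_energy):
    """Harmonics-to-noise ratio in decibels from precomputed energies."""
    if harmonic_energy <= 0 or noise_energy <= 0:
        raise ValueError("energies must be positive")
    return 10.0 * math.log10(harmonic_energy / noise_energy)
```

Under these definitions, a voice with larger cycle-to-cycle variation yields higher jitter and shimmer percentages, and a voice with more additive noise yields a lower harmonics-to-noise ratio, which is the direction of the group differences reported above.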