Document Type

Voice and Speech Directivity

Publication Date

11-2019

Abstract

Speech directivity describes the angular dependence of acoustic radiation from a talker’s mouth and nostrils, including diffraction about his or her body and chair (if seated). It is an essential physical aspect of communication that affects sounds and signals in acoustical environments, audio, and telecommunication systems. Because high-resolution, spherically comprehensive measurements of live, phonetically balanced speech have been unavailable in the past, the authors have undertaken research to produce and share such data for simulations of acoustical environments, optimizations of microphone placements, speech studies, and other applications. The measurements included three male and three female talkers who repeated phonetically balanced passages in an anechoic chamber at normal speech levels. Each sat on a chair connected to a subject-rotation system, with his or her mouth at the center of a 1.22 m radius, semicircular array of 36 microphones spaced in Δ𝜃 = 5° polar-angle increments. The array did not include a microphone at 𝜃 = 180°. The talker’s mouth axis was initially aligned toward 𝜃 = 90° in the polar angle (0° elevation) and ϕ = 0° in the azimuthal angle. Azimuthal rotations progressed in Δϕ = 5° increments, so the measurements included data from 2,521 unique positions over a sphere. Additional measurements at three positions within the rotating reference frame facilitated signal processing. Several steps mitigated the effects of repeated speech variations, including the calculation of transfer functions, effective coherent output spectra, and other quantities. The results are available under Additional Files and may receive periodic updates.
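
The count of 2,521 unique positions follows from the sampling geometry. The minimal sketch below enumerates the grid and counts distinct directions. It assumes that the 36 microphones span 𝜃 = 0° to 175° (inferred from the 5° spacing and the absence of a microphone at 𝜃 = 180°) and that a full rotation comprises 72 azimuthal steps. The function and constant names are illustrative and are not taken from the authors' processing code.

```python
import math

POLAR_STEP_DEG = 5     # polar-angle increment from the abstract
AZIMUTH_STEP_DEG = 5   # azimuthal rotation increment from the abstract
NUM_MICS = 36          # microphones on the semicircular arc


def unique_directions():
    """Return the set of distinct sampled directions as rounded unit vectors.

    Assumption: microphone i sits at theta = i * 5 degrees (0 to 175 degrees).
    """
    seen = set()
    for i in range(NUM_MICS):
        theta = math.radians(i * POLAR_STEP_DEG)
        for j in range(360 // AZIMUTH_STEP_DEG):
            phi = math.radians(j * AZIMUTH_STEP_DEG)
            # Rounding merges the 72 copies of the pole (theta = 0) into one.
            seen.add((
                round(math.sin(theta) * math.cos(phi), 9),
                round(math.sin(theta) * math.sin(phi), 9),
                round(math.cos(theta), 9),
            ))
    return seen


if __name__ == "__main__":
    print(len(unique_directions()))
```

Under these assumptions, the pole microphone contributes one position regardless of rotation, and each of the remaining 35 microphones contributes 72 positions, giving 1 + 35 × 72 = 2,521.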

Comments

Many additional details of the measurement and signal processing procedures appear in the following reference:

T. W. Leishman, S. D. Bellows, C. M. Pincock, and J. K. Whiting, "High-resolution spherical directivity of live speech from a multiple-capture transfer function method," Journal of the Acoustical Society of America, 149(3), 1507-1523 (2021). https://doi.org/10.1121/10.0003363

Although the raw measurement data support higher resolutions, the files presented here are in 1/3-octave bands from 100 Hz to 10 kHz and are based on degree-10 spherical harmonic expansions, with coefficients obtained from least-squares approximations. The directivities have been symmetrized about the median plane in the spherical-harmonic domain, effectively doubling the number of subject averages. For the files, the continuous expansions were resampled so that the mouth axis aligns with the 𝜃 = 0° and ϕ = 0° angles. The poles are thus rotated by 90°, so that the original zenith falls at (𝜃, ϕ) = (90°, 0°) and the original nadir falls at (𝜃, ϕ) = (90°, 180°).
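
As a rough aid to working with the files, the sketch below tabulates the band count, the number of spherical harmonic coefficients per band, and one possible mapping from the original measurement frame to the realigned file frame. Several points are assumptions rather than details from the source: the band centers use exact base-10 values (the files may be labeled with nominal values such as 125 Hz or 315 Hz), and the axis mapping is only one proper rotation consistent with the stated zenith and nadir locations. Because the directivities are symmetrized about the median plane, the sign chosen for the lateral axis does not affect the values. All function names are hypothetical.

```python
import math


def third_octave_centers(f_min=100.0, f_max=10_000.0):
    """Exact base-10 1/3-octave center frequencies from f_min to f_max (Hz)."""
    n_lo = round(10 * math.log10(f_min))   # band index 20 corresponds to 100 Hz
    n_hi = round(10 * math.log10(f_max))   # band index 40 corresponds to 10 kHz
    return [10 ** (n / 10) for n in range(n_lo, n_hi + 1)]


def sh_coefficient_count(degree):
    """Number of spherical harmonics up to and including the given degree."""
    return (degree + 1) ** 2


def realign(theta_deg, phi_deg):
    """Map a direction from the original frame to the file frame (degrees).

    Assumed mapping: new z = old x (mouth axis), new x = old z (zenith),
    new y = -old y, which keeps the frame right-handed.
    """
    t, p = math.radians(theta_deg), math.radians(phi_deg)
    x = math.sin(t) * math.cos(p)
    y = math.sin(t) * math.sin(p)
    z = math.cos(t)
    x_new, y_new, z_new = z, -y, x
    theta_new = math.degrees(math.acos(max(-1.0, min(1.0, z_new))))
    phi_new = math.degrees(math.atan2(y_new, x_new)) % 360.0
    return theta_new, phi_new


if __name__ == "__main__":
    print(len(third_octave_centers()), sh_coefficient_count(10))
    print(realign(0.0, 0.0))     # original zenith
    print(realign(180.0, 0.0))   # original nadir
```

With 121 coefficients per band and 2,521 sampled positions, the least-squares fit for each band is strongly overdetermined.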

The recommended Common Instrument Format (CIF) file can be opened with the CLF Group CIF viewer, currently available at http://www.clfgroup.org/cif.htm. For other file formats, contact the authors at directivity@byu.edu. The MP4 file animates the directivity balloons rotating through 360° for each 1/3-octave band within the original coordinate system.

As indicated earlier, these data should be helpful for many applications and are presented here for general usage. The authors request that users cite the work as given in the JASA reference above and under Recommended Citation.

Average_Speech.CI2 (384 kB)
CIF directivity file

SpeechDirectivity.csv (341 kB)
Generic directivity file

DirectivityAnimation.mp4 (111892 kB)
Rotating directivity balloons

Average Talker Balloons.png (357 kB)
Several directivity balloons for the average talker

KEMAR Balloons.png (362 kB)
Several directivity balloons for the GRAS KEMAR 45BC head-and-torso simulator
