Spatial Sound Generation and Perception by Amplitude Panning Techniques

Ville Pulkki

Dissertation for the degree of Doctor of Science in Technology to be presented with due permission of the Department of Electrical and Communications Engineering, Helsinki University of Technology, for public examination and debate in Chamber music hall, Sibelius Academy (Pohjoinen Rautatiekatu 9, Helsinki, Finland) on the 3rd of August, 2001, at 12 o'clock noon.

Spatial audio aims to recreate or synthesize spatial attributes when reproducing audio over loudspeakers or headphones. Such spatial attributes include, for example, locations of perceived sound sources and an auditory sense of space. This thesis focuses on new methods of spatial audio for loudspeaker listening and on measuring the quality of spatial audio by subjective and objective tests.

This thesis introduces vector base amplitude panning (VBAP), an amplitude panning method for positioning virtual sources in arbitrary 2-D or 3-D loudspeaker setups. In amplitude panning, the same sound signal is applied to a number of loudspeakers with appropriate non-zero amplitudes. With 2-D setups, VBAP is a reformulation of the existing pair-wise panning method. Unlike earlier solutions, however, it can be generalized to 3-D loudspeaker setups as a triplet-wise panning method. A sound signal is then applied to one, two, or three loudspeakers simultaneously. VBAP has certain advantages over earlier methods for positioning virtual sources in arbitrary layouts. Previous methods either used all loudspeakers to produce virtual sources, which results in some artefacts, or used loudspeaker triplets with a non-generalizable 2-D user interface.
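As an illustration of the pair-wise 2-D case described above, the following minimal sketch expresses the panning direction as a linear combination of the two loudspeaker direction vectors and solves for the gain factors. The function name `vbap_2d_gains`, the use of azimuth angles in degrees, and the normalization of the gains to unit total power are assumptions made for this example; the abstract itself does not specify them.

```python
import math


def vbap_2d_gains(pan_deg, spk1_deg, spk2_deg):
    """Return gains (g1, g2) for one loudspeaker pair in a 2-D setup.

    All angles are azimuths in degrees seen from the listening position
    (assumed convention for this sketch).
    """
    # Unit vectors from the listener towards the panning direction
    # and towards each loudspeaker.
    p = (math.cos(math.radians(pan_deg)), math.sin(math.radians(pan_deg)))
    l1 = (math.cos(math.radians(spk1_deg)), math.sin(math.radians(spk1_deg)))
    l2 = (math.cos(math.radians(spk2_deg)), math.sin(math.radians(spk2_deg)))

    # Solve p = g1 * l1 + g2 * l2 with Cramer's rule.
    det = l1[0] * l2[1] - l1[1] * l2[0]
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker directions do not span the plane")
    g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
    g2 = (l1[0] * p[1] - l1[1] * p[0]) / det

    # A negative gain means the panning direction is outside this pair.
    if g1 < -1e-9 or g2 < -1e-9:
        raise ValueError("panning direction lies outside the loudspeaker pair")
    g1, g2 = abs(g1), abs(g2)  # remove tiny negative rounding residue

    # Assumed normalization: constant total power, g1^2 + g2^2 = 1.
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm


if __name__ == "__main__":
    # Loudspeakers at +30 and -30 degrees, source panned to the centre.
    print(vbap_2d_gains(0.0, 30.0, -30.0))
```

When the panning direction coincides with one loudspeaker, the other gain becomes zero, which matches the statement that the signal may be applied to a single loudspeaker. The triplet-wise 3-D case follows the same idea with three direction vectors and a 3x3 system.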

The virtual sources generated with VBAP are investigated. Human directional hearing is simulated with a binaural auditory model adapted from the literature. The interaural time difference (ITD) and interaural level difference (ILD) cues, which are the main localization cues, are simulated for amplitude-panned virtual sources and for real sources. Psychoacoustic listening tests are conducted to study the subjective quality of virtual sources. Statistically significant phenomena found in the listening test data are explained by the auditory model simulation results. To obtain a generic view of directional quality in arbitrary loudspeaker setups, directional cues are simulated for virtual sources with loudspeaker pairs and triplets in various setups.

The directional qualities of virtual sources generated with VBAP can be stated as follows. The directional coordinates used for this purpose are the angle between the position vector and the median plane (θcc), and the angle between the projection of the position vector onto the median plane and the frontal direction (Φcc). The perceived θcc direction of a virtual source coincides well with the VBAP panning direction when the loudspeaker set is near the median plane. When the loudspeaker set is moved towards the side of the listener, the perceived θcc direction is biased towards the median plane. The perceived Φcc direction of an amplitude-panned virtual source depends on the individual listener and cannot be predicted with any panning law.
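The two directional coordinates defined above can be computed from a Cartesian position vector, as the following minimal sketch shows. The axis convention (x towards the front, y towards the listener's left, z upwards, so that the median plane is the x-z plane), the sign of θcc, and the function name `cone_of_confusion_angles` are assumptions made for this example and are not given in the abstract.

```python
import math


def cone_of_confusion_angles(x, y, z):
    """Return (theta_cc, phi_cc) in degrees for a position vector.

    Assumed axes: x = front, y = left, z = up; median plane = x-z plane.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("position vector must be non-zero")

    # theta_cc: angle between the position vector and the median plane.
    # Positive values point to the listener's left (assumed sign).
    theta_cc = math.degrees(math.asin(y / r))

    # phi_cc: angle between the projection of the vector onto the median
    # plane, (x, 0, z), and the frontal direction (the +x axis).
    # For a vector exactly on the y axis the projection vanishes and
    # phi_cc is undefined; atan2(0, 0) then returns 0.
    phi_cc = math.degrees(math.atan2(z, x))

    return theta_cc, phi_cc


if __name__ == "__main__":
    # A source in the horizontal plane, 45 degrees to the left of front.
    print(cone_of_confusion_angles(1.0, 1.0, 0.0))
```

With these coordinates, every direction with the same θcc lies on one cone around the interaural axis, and Φcc selects the position on that cone.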

