Comments on:

AES Technical Document AESTD1001.1.01-10

Multichannel surround sound systems and operations


Comments on Sections 1-4:

Center channel speaker and basic music surround sound configuration

The document is intentionally biased towards existing international standards, which deal with surround sound primarily in the context of reproducing film sound in the home. Applied to high-resolution surround sound, this accommodation results in sub-optimal performance of a new technology. The goal of the standardized system setup is to transfer much of the cinema experience to the home environment. In a cinema the multiple loudspeakers behind the wide projection screen are essential to carry dialogue and to follow moving images. The single center speaker in conjunction with left and right speakers duplicates this in the home theater setup. The side and rear speakers in the cinema add sound effects, when needed, and enhance the emotional impact. The two side speakers take up this function in the home. Thus, the 3/2 layout of speakers for the home environment preserves most of the features of a movie theater.

The center channel speaker in a home theater setup usually differs in construction from the left and right speakers, because rear projection of the image limits its placement options to either below or above the screen. For a close tie-in with the picture location the speaker has a horizontal profile and is relatively small. The performance of the center speaker is usually compromised in terms of low frequency extension, nonlinear distortion and polar response compared to the other speakers in the surround sound system. Front projection of the picture onto a perforated screen allows the use of a center speaker that is identical to L and R, except for some high frequency response shaping. Front projection, though, is only used in expensive home theater setups and dedicated rooms.

The goal of a high-resolution music surround sound system is, amongst other things, the recreation of a coherent and realistic acoustic space, within which individual instruments and groups of sound sources are placed. To preserve high-resolution sound during playback the level of intermodulation distortion has to be kept exceedingly low. This demands greater commonality between speakers than is necessary for home theater, where the picture dominates. The center speaker in particular should not differ from the left and right front speakers.

In a home music system setup it is preferable not to have a center speaker at all. To obtain the quality that high-resolution audio can provide, the left and right speakers will, of necessity, be relatively large. Placing another speaker of this size in the center will be unacceptable in the majority of domestic rooms. Furthermore, today's high quality two channel systems and recording techniques are very capable of creating solid center sound images. A music surround sound system that is downward compatible with home theater could thus have a 2/2 configuration and not use the center speaker.

The same 2/2 configuration could also be the first upgrade from a two channel sound system, after an identical pair of speakers has been added to the two existing ones. If smaller speakers are added, then some loss in potential reproduction quality must be expected, unless only program material with low level ambient information is reproduced by the side speakers.

The standardized 3/2 configuration is not adequate to fully exploit high-resolution surround sound. A six speaker configuration seems to be the minimum. It would consist of two front speakers (Lf & Rf), two side speakers forward of the listener(s) (Ls & Rs), and two rear speakers (Lr & Rr) behind the listener(s). The front side speakers could be elevated to assist in creating a sense of realistic vertical extension of the sound space. Optimum speaker placement angles will have to be determined, but would be around +/-30, +/-60 and +/-120 degrees. The configuration might be called 4/2, to follow existing convention, or 2/2/2 for greater differentiation.

The 2/2 speaker configuration would be a subset of this and be driven from the Lf, Rf and Lr, Rr outputs with signals derived from the six data channels. This configuration will make the entry into high-resolution audio readily available to a wide audience.
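
As an illustration only, the nominal angles above can be turned into floor-plan coordinates for a given listening distance. The sketch below assumes that all six speakers sit on a circle around the listener, that angles are measured from straight ahead, and that left is negative; the 2.5 m radius and the function name are hypothetical and not part of the AES document or of these comments.

```python
import math

# Nominal azimuths in degrees taken from the text above.
# Assumed convention: 0 = straight ahead, negative = listener's left.
AZIMUTHS_DEG = {
    "Lf": -30.0, "Rf": 30.0,
    "Ls": -60.0, "Rs": 60.0,
    "Lr": -120.0, "Rr": 120.0,
}

# Speakers that remain in the 2/2 subset described above.
SUBSET_2_2 = ("Lf", "Rf", "Lr", "Rr")


def speaker_positions(radius_m):
    """Return {name: (x, y)} in meters, listener at the origin.

    x points to the listener's right, y points straight ahead.
    Placing every speaker at the same radius is an assumption.
    """
    positions = {}
    for name, azimuth_deg in AZIMUTHS_DEG.items():
        a = math.radians(azimuth_deg)
        positions[name] = (radius_m * math.sin(a), radius_m * math.cos(a))
    return positions


if __name__ == "__main__":
    # 2.5 m is an arbitrary example distance.
    for name, (x, y) in speaker_positions(2.5).items():
        marker = " (also in 2/2 subset)" if name in SUBSET_2_2 else ""
        print(f"{name}: x = {x:+.2f} m, y = {y:+.2f} m{marker}")
```

The rear pair ends up behind the listener (negative y) and the side pair forward of the listener (positive y), which matches the description of the 2/2/2 layout.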

Proposal 1:



Comments on Section 5:

Full-range loudspeakers and LFE

It must be clearly stated that six identical, full-range speakers should be used to realize the full potential of the high-resolution surround sound format. Full-range in this context means a frequency response from 20 Hz to 20 kHz, with specified uniformity over a +/-45 degree window, and a range of 60 dB that is free of multitone intermodulation distortion at 100 dB SPL.

Such a requirement is obviously in conflict with the desire to have six small loudspeakers, or to have speakers that are unobtrusive. A common solution is likely to be a set of six small speakers that cover high and mid frequencies down to 80 Hz, and two separate woofers for the 80 Hz to 20 Hz range. The use of a single woofer should be discouraged as it is unlikely to meet the distortion requirements at very low frequencies. A single woofer also causes too uniform an excitation of room resonances and encourages boomy bass reproduction.

The single "subwoofer" of existing 3/2 or 5.1 systems, which is driven from the LFE channel, should be augmented with a second unit for both home theater and music surround.

Proposal 2:



Comments on Section 6:

Reverberation time, Directivity index, Nonlinear distortion attenuation, Transient fidelity, System dynamic range

The suggested parameters and values for reference listening conditions should be revisited for their applicability to high-resolution surround sound. Several of them stand out to me.

1 - Reverberation time

The suggested values for reverberation time are very low and lead to acoustically dead rooms. They certainly do not represent playback conditions in domestic rooms, which have decay times that are larger by about a factor of two.

2 - Directivity index

The suggested value of 8 dB +/-2 dB over the 250 Hz to 16 kHz range is unrealistic. Below 250 Hz the index is likely to be 0 dB, because typical speakers are omni-directional. Dipoles and cardioids have a value of about 5 dB, but are rarely used for low frequencies. The index increases to over 10 dB for most speakers at high frequencies. It would be desirable to have a more constant directivity index, for improved spectral balance of the reverberant sound. This requires that speakers be either omni-directional, dipole or cardioid over the major portion of the frequency range.
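
The 0 dB and 5 dB figures can be checked numerically. The sketch below assumes ideal, frequency-independent, axially symmetric patterns (constant for the omni, cos(theta) for the dipole, (1 + cos(theta))/2 for the cardioid) and the usual definition of directivity index as ten times the log of on-axis intensity over the spherical average. Real speakers only approximate these patterns, and the function names are illustrative.

```python
import math


def directivity_index_db(pattern, steps=20000):
    """Directivity index in dB of an axisymmetric pressure pattern.

    `pattern(theta)` is the pressure response relative to on-axis
    (theta in radians, pattern(0) == 1). The spherical average of
    pattern**2 is found with a midpoint-rule integral over theta.
    """
    d_theta = math.pi / steps
    integral = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        integral += pattern(theta) ** 2 * math.sin(theta) * d_theta
    # Directivity factor Q = 2 / integral for an axisymmetric pattern.
    return 10.0 * math.log10(2.0 / integral)


def omni(theta):
    return 1.0


def dipole(theta):
    return math.cos(theta)


def cardioid(theta):
    return 0.5 * (1.0 + math.cos(theta))


if __name__ == "__main__":
    for name, pattern in (("omni", omni), ("dipole", dipole), ("cardioid", cardioid)):
        print(f"{name}: DI = {directivity_index_db(pattern):.1f} dB")
```

Under these idealizations the omni comes out at essentially 0 dB and both the dipole and the cardioid at about 4.8 dB, well below the 6 dB lower limit of the suggested 8 dB +/-2 dB window.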

3 - Nonlinear distortion attenuation

It is not apparent whether the suggested numbers refer to harmonic distortion or all types of distortion. The very essence of high-resolution audio is the absence of audible nonlinear distortion. Much investigation still remains to be done in this area. I can say from experience that the suggested values are too high, no matter how they are defined.

4 - Transient fidelity

This is a very peculiar looking spec. A decay time of 5 ms at 1 kHz is too large by at least an order of magnitude; it would have to be that much smaller before it might even begin to have some meaning.

5 - System dynamic range

The parameter would more correctly be called "Maximum SPL". The value >112 dB needs to be qualified for distortion level and frequency range.

System dynamic range, defined as the difference between maximum SPL and multitone distortion level, should be specified. For example, 60 dB range at 100 dB SPL for 20 Hz to 20 kHz multitones.
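
To make the arithmetic of this definition concrete, a minimal sketch follows. It assumes that the multitone distortion level is expressed in dB SPL, the same unit as the playback level; the function names and the two measured values in the usage lines are hypothetical examples, not data.

```python
def distortion_ceiling_db_spl(playback_level_db_spl, dynamic_range_db):
    """Highest distortion level (dB SPL) that still leaves the stated range."""
    return playback_level_db_spl - dynamic_range_db


def meets_dynamic_range(playback_level_db_spl, distortion_level_db_spl,
                        required_range_db=60.0):
    """True if playback level minus distortion level is at least the required range."""
    return playback_level_db_spl - distortion_level_db_spl >= required_range_db


if __name__ == "__main__":
    # Example from the text: 60 dB range at 100 dB SPL.
    print(distortion_ceiling_db_spl(100.0, 60.0))   # 40.0
    # Hypothetical measured distortion levels, for illustration only.
    print(meets_dynamic_range(100.0, 38.0))         # True
    print(meets_dynamic_range(100.0, 45.0))         # False
```

In other words, a 60 dB range at 100 dB SPL means that all multitone distortion products between 20 Hz and 20 kHz must stay at or below 40 dB SPL.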

Proposal 3:


Siegfried Linkwitz

2 February, 2002