The determination of:
Optimum Frequency Response Curves in the Bass Range
A study of the audibility of different bass alignments by Ingvar Ohman.
First published in Musik och Ljudteknik, Sweden.
A. History and listening references
The purpose of this study was to determine, with listening tests in an objective situation, what divergence from an optimum frequency response curve in the bass range is inaudible. It was also desired to find some type of subjective description of the "sound" of different divergences from the "optimum curves". This last task resulted in two additional sets of curves. These two sets of curves show how to minimise the subjective distortion from a dynamical and a statical point of view, despite a greater measurable reduction in the bass output.
In the case when you, for one reason or another, want to design a speaker system with a response curve that deviates from the "inaudible curves" (e.g. to reduce box size or to gain efficiency or power capacity), I would suggest a compromise between the dynamical and the statical characteristics. (Except, of course, if you want a colouration from the speaker.)
Later experiments conducted with a large number of people have shown that, to a certain extent, audiophiles prefer the dynamical more than the statical qualities, while the average listener seems to attach more importance to the statical behaviour. Classically aligned vented boxes (flat curve down to fh, then 24dB/octave rolloff thereafter) with a 40 Hz cut off frequency also manage well with the average listener. However, audiophiles and live concert listeners often react to the sharper cut off from the statical optimisation with comments such as: "The bass is boxy, tired, undynamical, unmusical, boomy, unnatural." and so on. On the other hand the audiophiles and live concert listeners have less criticism for the larger cut off range with dynamical optimisation. Some have even said "More air, more distinct, cleaner bass reproduction" about the audibly reduced but dynamically optimised curves. The same curves that ordinary listeners call: "The bass has lost weight, sounds cold, no depth", and so on.
Until now all discussions have been about linear systems, which means those that are undistorted. In reality, however, nonlinear distortions provide a large part of the audible alteration of the reproduced sound from commercially available loudspeakers. (This will be discussed further in the text under the heading "Statically optimised closed boxes".)
Conditions chosen for the original study (1981)
To be able to test the properties of human hearing of the bass range in an objective way, it is necessary to give the person in the test a reference to compare the deviations with. If the experimental results shall be of any value, it is important that the reference is as "true" as possible when creating the conditions in audio reproduction. Therefore, the following testing conditions were demanded:
1. All listening should take place in an echo free (anechoic) environment.
2. The reference reproduction chain should have a flat frequency response down to at least 12 Hz. This includes the entire chain from microphones, via tape recorder and amplifier, to the monitor speakers.
3. The musical/sound material to be used should vary over a wide range from nearly completely statical sounds (e.g. the lowest notes from large organs), to almost completely dynamical bass sounds (e.g. symphonic bass drums, doors closed to tight small rooms). About one hour of test material was collected for the study.
4. The electronic filter box to be used for simulating the different alignments should have complete flexibility to simulate all possible alignments of both vented and closed boxes. An S/N better than 100 dB and distortion less than 0.1% over the 0-20000 Hz frequency range were also demanded. It should be stressed that these specifications are not enough for all audio purposes, but for a study of the bass range only the demands are lower.
(1) First the echo free environment was to be established. Many actual echo free rooms show room influences at frequencies as high as 200 Hz, and things get worse the lower in frequency you go. What I needed was free field conditions down to 10 Hz, which is much lower than a standard echo free room can offer. The only alternative was to use a true free field! This can be found on top of a high house with a steep roof ridge, or high enough up in the air to be able to disregard the ground reflection. The choice was made to use a high tree, 12 metres up. The distance between the loudspeakers and the test person was set to 0.5-1 metres, and the intensity of the ground reflection was measured to be 1/700 of the direct sound.
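To put the measured 1/700 figure into more familiar units, the following minimal sketch converts it to decibels and estimates the largest ripple a single reflection of that strength could add to the direct sound. It assumes that "intensity" means a power ratio and that the reflection adds to the direct sound either fully in phase or fully out of phase; the function name is made up for this illustration.

```python
import math

def reflection_summary(intensity_ratio):
    """Return (level_db, ripple_up_db, ripple_down_db) for one reflection.

    intensity_ratio is reflected power divided by direct power
    (assumption: the 1/700 figure in the text is a power ratio).
    """
    level_db = 10.0 * math.log10(intensity_ratio)
    # Sound pressure scales with the square root of intensity.
    amplitude_ratio = math.sqrt(intensity_ratio)
    # Worst cases: reflection exactly in phase / exactly out of phase.
    ripple_up_db = 20.0 * math.log10(1.0 + amplitude_ratio)
    ripple_down_db = 20.0 * math.log10(1.0 - amplitude_ratio)
    return level_db, ripple_up_db, ripple_down_db

level, up, down = reflection_summary(1.0 / 700.0)
print(f"reflection level: {level:.1f} dB relative to the direct sound")
print(f"worst-case ripple: {up:+.2f} / {down:+.2f} dB")
```

Under these assumptions the reflection lies roughly 28 dB below the direct sound, and its worst-case effect on the response is about a third of a decibel.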
(2) Now the highly specified reference equipment was to be assembled. Of its parts, the microphones were the least problematic. Two measurement microphones were used, with a cut off frequency of 0.2 Hz. The second link was the microphone amplifier. It was built using discrete components and was provided with an AC-coupling of 0.1 Hz. The tape recorder that was chosen was a Technics. At its lowest speed it could reach 8 Hz after a small modification. The range 10-20 Hz was provided with a small lift. The power amplifier was DC-coupled. Finally, the loudspeakers remained to be designed. The demand for sound pressure was moderate, since the listening was to take place at a short distance. A check of the recorded material showed that the range below 25 Hz needed much less power capacity. The choice was for 6.5" bass drivers in 48 litre boxes. To get the required parameters in the easiest possible way, without the use of too much equalisation correction, the cone mass was much increased and the suspension slightly modified. With these modifications the cut off frequency was 10 Hz. The range between 10 and 20 Hz showed a reduction that was complemented by the rise from the tape recorder. The complete reproduction chain showed a response within +/-1 dB between 11 and 160 Hz. An upper range, not of exceptionally high quality, was added in order to fulfill the desire to have a "true" reference, and had a response of +/-2 to 3 dB up to 20 kHz. This speaker was far from optimised for normal music listening. First, it had a flat frequency response in a free field; second, the sensitivity was sensationally low (approximately 74 dB per W!) without a correspondingly high power capacity (it was 100 W). For this experiment, however, the speakers were perfect.
(3) The music/sound program material was to be chosen and recorded. The recordings were made, with few exceptions, in natural acoustical surroundings. Just about any instrument with bass output was recorded, even those with their fundamental tone in the higher bass range (e.g. kettledrums).
(4) The electronic box was fitted with four knobs representing the coordinates of the pole pairs. It also had a three-way switch to remove one pole pair or replace it with a single zero pole (aperiodic response). Using the first, closed boxes could be simulated, and with the single zero pole it was possible to simulate dipole speakers.
B. The listening tests: optimum frequency response curves
The original purpose of the tests was only to obtain limits of "allowed" frequency response deviations in the bass range. It soon became apparent that, even if it was possible to find these outer limits, it was not enough to stay within them! Well, maybe this is not so surprising, since very sharp bends, even within these limits, may obviously produce large group delays. Anyway, it was a surprise how small the audible deviations were. It was necessary to establish a whole set of curves to get a useful result, and this was obtained using seven curves. When you stay on these curves, or between two of them, it is possible to make the alignments "inaudible".
Optimum bass range frequency response curves (statically & dynamically uncoloured). Allowable deviation from any curve, but not outside curves 1-7, is as follows:
50 Hz: +1.0 / -0.5 dB, 40 Hz: +1.5 / -1.0 dB, 30 Hz: +1.5 / -2.0 dB, 20 Hz: +1.5 / -5.0 dB, 15 Hz: +2.0 / -10 dB, 10 Hz: +3.0 / -15 dB, 7 Hz: +5.0 / -25 dB
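The tolerance figures above lend themselves to a simple lookup. The sketch below is only an illustration: the dictionary transcribes the listed values, while the function name and the example measurement are hypothetical. It checks the tolerance band alone and does not test the separate condition that the response must stay inside the envelope formed by curves 1-7, since those curves are not given numerically in the text.

```python
# Tolerances transcribed from the text:
# frequency in Hz -> (upper limit, lower limit) in dB.
TOLERANCE_DB = {
    50: (1.0, -0.5),
    40: (1.5, -1.0),
    30: (1.5, -2.0),
    20: (1.5, -5.0),
    15: (2.0, -10.0),
    10: (3.0, -15.0),
    7: (5.0, -25.0),
}

def out_of_tolerance(deviation_db):
    """Return the tabulated frequencies whose deviation is outside the band.

    deviation_db maps a tabulated frequency (Hz) to the measured response
    minus the chosen optimum curve, in dB.
    """
    failures = []
    for freq, dev in deviation_db.items():
        upper, lower = TOLERANCE_DB[freq]
        if not (lower <= dev <= upper):
            failures.append(freq)
    return sorted(failures, reverse=True)

# Hypothetical measurement, invented for illustration only.
example = {50: 0.3, 40: -1.2, 20: -4.0, 10: 3.5}
print(out_of_tolerance(example))  # 40 Hz is too low, 10 Hz too high
```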
C. The listening tests: semi-optimised frequency response curves
The second step was to examine whether the curves could be cut off further if the variety of the test signals was reduced. It was soon noted that it was possible to divide the music/sounds into two main parts, placing completely different demands on the frequency response curve. On the one hand, the dynamical test signals (e.g. bass drums, stamping on a wooden floor, the closing of doors, pizzicato on the double bass and similar sounds) were seriously distorted by abrupt bends in the response curve. Then again, it did not matter much if the entire bass range was tilted down gently towards lower frequencies:
Dynamically optimised frequency response curves in the bass range. Note: the upper thicker curve represents the sharpest cut off that is acceptable both statically and dynamically.
On the other hand we have the statical signals (e.g. the organ, the bass tuba, etc.). With these signals it was important to keep a correct balance between the bass range as a whole (20-150 Hz) and the upper range (150-20 000 Hz). It proved to be sonically better to cut off rather sharply and equalise the remaining bass with a lift in the curve just above the cutoff frequency.
Statically optimised frequency response curves in the bass range. Note: the upper/left thicker curve represents the sharpest cut off that is acceptable both statically and dynamically.
When the cutoff frequency approached 60 Hz a somewhat resonant and boomy quality appeared, especially on the male voice, if the remaining bass was elevated to compensate for the lost low bass.
Later experiments have shown that this problem, due to nonlinear characteristics, is much worse with closed boxes that have an emphasised response than with vented boxes, where the rise is produced by the vent. This may seem a little strange since the cutoff is much steeper in the vented design, but the explanation comes later in the text, see: "Nonlinear properties of closed boxes".
The semi-optimised (not to be confused with optimum) curves with different amounts of cutoff are, like the optimum curves, presented as sets of curves, here divided into the dynamically and the statically optimised curves. In both cases curve no. 7 (the sharpest cutoff that is acceptable both statically and dynamically) is plotted as reference.
Closed boxes: Optimum curves
Here I have limited the set to three curves, each of which is named after the driver resonance frequency in the box, 14, 20 and 33:
Examples of optimum frequency response curves for closed boxes.
Curve "33" is the one that is closest to optimum curve 7, and is well suited to most rooms.
Curve "20" is adapted for a room of 20 square metres. If the room is sealed and tight, the response is ruler flat down to 0 Hz!
Curve "14" is analogously adapted for a room of 40 square metres.
Both of the last two curves represent an aperiodic low-frequency closed box alignment (Qtc=0.5 and response is -6dB at resonance).
The drawback of closed boxes for these alignments is that they need four times the cone area, or a longer excursion capability (stroke), to be able to produce the same sound pressure with the same distortion as vented boxes (one 8" vented driver is approximately equal to two 12" drivers or one 15" driver mounted in a closed box).
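The statement above that an aperiodic closed box (Qtc=0.5) is -6 dB down at resonance can be checked with the standard second-order high-pass magnitude formula. This is a minimal sketch of an idealised, linear closed box; the function name is chosen for this illustration, and room effects are not included.

```python
import math

def closed_box_response_db(f, f0, q):
    """Magnitude in dB of an ideal second-order high-pass at frequency f.

    f0 is the resonance frequency of the driver in the box (Hz) and
    q is the total system Q (Qtc).
    """
    x = f / f0
    magnitude = x**2 / math.sqrt((1.0 - x**2) ** 2 + (x / q) ** 2)
    return 20.0 * math.log10(magnitude)

# Curves "14" and "20" from the text, both with Qtc = 0.5.
for f0 in (14.0, 20.0):
    at_resonance = closed_box_response_db(f0, f0, 0.5)
    one_octave_up = closed_box_response_db(2.0 * f0, f0, 0.5)
    print(f"f0 = {f0:.0f} Hz: {at_resonance:.2f} dB at f0, "
          f"{one_octave_up:.2f} dB one octave above")
```

At resonance the magnitude of this filter equals Q itself, which is why Q=0.5 gives about -6 dB and Q=0.707 gives about -3 dB.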
Dynamically optimised closed boxes
Here I will present some simple mathematical rules, rather than frequency response curves. It should be sufficient to keep the system Q at 25/fo. If you would like to have a cutoff higher than 50 Hz, the Q should continue to fall below 0.50 according to the formula. For lower fo, Q should not be higher than 0.707, and below 33 Hz the Q should gradually approach 0.50, as in the optimum curves "20" and "14" described above.
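The rule of thumb above can be written as a small helper. This is a sketch under stated assumptions: the function name is hypothetical, and because the text gives no formula for how Q should fall towards 0.50 below 33 Hz, the function simply refuses values of fo in that region rather than guessing.

```python
def dynamic_q(fo_hz, q_max=0.707):
    """System Q suggested by the 25/fo rule, never higher than q_max.

    Only defined here for fo >= 33 Hz; below that the text asks for a
    gradual approach to 0.50 without giving a formula.
    """
    if fo_hz < 33.0:
        raise ValueError("no explicit rule is given for fo below 33 Hz")
    return min(25.0 / fo_hz, q_max)

for fo in (33.0, 40.0, 50.0, 62.5):
    print(f"fo = {fo:5.1f} Hz -> Q = {dynamic_q(fo):.3f}")
```

With this rule the cap of 0.707 applies up to about 35 Hz, Q reaches 0.50 at 50 Hz, and it keeps falling above that.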
Statically optimised closed boxes
Here, too, some mathematics explains it best. If you choose to cut off higher than for the optimum curve 7, a sonic compensation for the lost bass is needed, boosting what is left of the bass range. A suitable choice of compensation is: Q = (1/sqrt(2)) * (fo/33)^0.65. Values of fo higher than 85 Hz are not recommended under any circumstances. Already at 85 Hz we need a compensation boost of 3 dB at 100 Hz. This brings us to what has been implied earlier, the nonlinear properties of closed boxes.
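The compensation formula can be checked against the figure quoted for 85 Hz. The sketch below reads the formula as Q = (1/sqrt(2)) * (fo/33)^0.65 and uses the standard expressions for the peak of an ideal second-order high-pass; both function names are made up for this illustration.

```python
import math

def static_q(fo_hz):
    """Q from the statically optimised closed-box rule in the text."""
    return (1.0 / math.sqrt(2.0)) * (fo_hz / 33.0) ** 0.65

def peak_of_highpass(fo_hz, q):
    """Return (peak_frequency_hz, peak_level_db) of a second-order high-pass.

    A response peak above 0 dB exists only for q > 1/sqrt(2);
    otherwise None is returned.
    """
    if q <= 1.0 / math.sqrt(2.0):
        return None
    peak_freq = fo_hz / math.sqrt(1.0 - 1.0 / (2.0 * q * q))
    peak_gain = q / math.sqrt(1.0 - 1.0 / (4.0 * q * q))
    return peak_freq, 20.0 * math.log10(peak_gain)

q85 = static_q(85.0)
peak_freq, peak_db = peak_of_highpass(85.0, q85)
print(f"fo = 85 Hz -> Q = {q85:.2f}, peak of {peak_db:.1f} dB near {peak_freq:.0f} Hz")
```

For fo = 85 Hz this gives a Q of about 1.3 and a peak of about 3 dB close to 100 Hz, in line with the text. At fo = 33 Hz the rule returns Q = 0.707, which has no peak.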
Nonlinear properties of closed boxes
I mentioned earlier that the ear, as opposed to what you might initially believe, prefers the compensation that can be done with statically optimised vented boxes over the same compensation applied to closed boxes. There are two reasons for this.
Reason 1: The cones of closed boxes have their speed maximum in the frequency range that is lifted in amplitude. Vented boxes, on the other hand, have their speed minimum at the lifted frequencies (if the Helmholtz resonance of the box has been used to create the lift). When designing for statically optimised curves, this results in a problem for the closed box: the amplitude and distortion maxima coincide for the closed box, while the amplitude maximum coincides with the distortion minimum for the vented box.
Reason 2: For thermal reasons, the elevation from a closed box becomes more exaggerated the louder you play. For a vented box, however, the elevation will decrease with increased level, because of thermal effects and losses in the vent.
D. An attempt for a "general" optimisation for vented and closed boxes
Firstly, I would like to say that this part of the report has left the objective scientific world. This is about my own personal tastes now. The following alignments are definitely audible. You can still try to balance the faults in order to find the best possible alignment, even for a less than ideal response curve, if the circumstances preclude a design according to the optimum curves. My personal experience is that a semi-optimised vented box sounds best if you start with a 4th-order Butterworth alignment, and lower the fh frequency by 20-25%. When it comes to closed boxes, I am afraid that I have found that the most classical of all the alignments, the one with Qtc=1/sqrt(2)=0.7071, gives the best compromise between dynamical and statical behaviour. Possibly a slightly lower Q value, such as Q=0.65, may be preferred.
If the possibility arises, I recommend that everyone try to design genuine "optimum curve" speakers. In comparison, the semi-optimised designs (dynamically, statically or something in between), appear to be of lower sonic and musical quality for listeners with greater expectations.
Several times in the text, the so called "optimum curves" have been referred to as "inaudibly coloured" compared to a flat response. I would especially like to point out the following:
1. To achieve an unobjectionable bass response it is obvious that the system must not have any of the potential flaws. Beyond the demand to comply with the optimum curves, the system must have a distortion well below the threshold of audibility at the sound pressure levels to be used. In practice, distortion values of 10-20% are common when playing at the level of a symphony orchestra.
2. From a strict scientific point of view, the result of an experiment is only valid under the conditions that prevailed during the course of the experiment. Here it is primarily the program material I am thinking of. It is possible that some other sounds might further increase the demands on the frequency response curve. Considering the wide range of sounds used for this study, I do believe that only marginal adjustments, if any, would have to be done. Maybe curve 1 and possibly curve 2 could be dispensed with.
3. One must not forget that speakers are usually placed in rooms and not in a free field. This causes the need to consider more properties for loudspeaker design than what the curves show about hearing. Firstly, there are two room-related properties that have to be included in every serious loudspeaker project, and these are:
a. The influence of the floor reflection
Depending on the woofers' height over the floor, the contribution from the floor will be in phase with the direct radiation from the driver up to a higher or lower frequency. Some examples are shown below.
Height 70 cm: in phase up to approximately 250 Hz (1st cancellation at approximately 500 Hz).
Height 30 cm: in phase up to approximately 600 Hz (1st cancellation at approximately 1200 Hz).
Height 10 cm: in phase up to approximately 1800 Hz (1st cancellation at approximately 3600 Hz).
Below this frequency the level will be elevated approximately 2-6 dB compared to higher frequencies. To be more specific, the level is elevated 6 dB when the reflection is in phase with the direct radiation and, on an average, it is elevated 3 dB in the higher range where the reflection appears in random phase.
Both the 6 dB increase at low frequencies and the 3 dB increase at higher frequencies can be less depending on how elastic the reflective surface is (for low frequencies) and how absorbing it is (for high frequencies). Therefore, the interval is 2-6 dB.
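The floor-bounce figures can be explored with a mirror-image model, in which the reflection is treated as a second source placed below the floor. The sketch below is an illustration only: the text does not state the listening geometry, so the ear height of 1 m and the listening distance of 4 m are assumptions, as are the speed of sound and the function names. With those assumed values the first cancellation lands near the frequencies listed above, which suggests, but does not confirm, a similar geometry.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def first_cancellation_hz(woofer_height_m, ear_height_m=1.0, distance_m=4.0):
    """First floor-bounce cancellation using a mirror-image source.

    ear_height_m and distance_m are assumed values, not taken from the text.
    """
    direct = math.hypot(distance_m, ear_height_m - woofer_height_m)
    reflected = math.hypot(distance_m, ear_height_m + woofer_height_m)
    path_difference = reflected - direct
    # Cancellation occurs when the path difference is half a wavelength.
    return SPEED_OF_SOUND / (2.0 * path_difference)

def level_gain_db(coherent):
    """Gain from adding one equally strong, fully reflected copy of the sound."""
    if coherent:
        return 20.0 * math.log10(2.0)  # pressures add: about 6 dB
    return 10.0 * math.log10(2.0)      # powers add: about 3 dB

for h in (0.70, 0.30, 0.10):
    print(f"woofer at {h:.2f} m: first cancellation near "
          f"{first_cancellation_hz(h):.0f} Hz")
print(f"in phase: +{level_gain_db(True):.1f} dB, "
      f"random phase: +{level_gain_db(False):.1f} dB")
```

The two gain values correspond to the ideal case of a perfectly rigid, non-absorbing floor, which is the upper end of the 2-6 dB interval given in the text.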
b. The cavity effect of the room
Real rooms contribute more bass increase than if they had been a "theatre box" open in the front (this is not a "Voice of the Theatre" speaker box, but the place you sit in a theatre). In practice, there are of course many variations between different listening rooms, but it is possible to draw a "typical room curve". From it you can also see some typical floor modifications.
Comments: The influences of the standing waves in the room are not included in these curves for several reasons. Firstly, they belong to the listener's acoustic in the "theatre box". Secondly, there are no "general standing waves"; all rooms are unique. It should also be pointed out that a room-adapted speaker will have an inverted frequency response compared with these curves. You can also see that you do not have to worry about the floor reflection if the woofer is placed low to the ground and the midrange is placed high and the crossover point is placed between the floor knees of the two drivers. You should, however, see to it that the woofer has 2-4 dB less output in its free field response, without the help from the floor.
4. Another important reservation is that you should not interpret the "optimum curves" as being in any way absolute. For example, only the cutoffs which can feasibly be obtained with closed and vented boxes (and dipoles) have been simulated. It may be possible that the ear would accept steeper cutoffs than those shown. The curves shall not be seen as implying the limit of the ear to hear deviations from the response. Instead, they show the tolerance of the ear for different bass alignments which it is possible to obtain with loudspeakers.
5. You should not forget that details like windows, pictures, panels and such can begin to distort (rattle) at high levels. In this case, the response down to at least 10-15 Hz will influence the result.
In a closed room, the optimum response curves 4-7 should be chosen for the most neutral sound, with the greatest safety margin before the onset of audibly distorted sound. For listening outdoors, the optimum response curves 1-3 are valid.
G. About poles of low frequency
From the curves that have been presented, it is not evident how the human ear reacts to infrasonic poles (resonances below approximately 30 Hz). A simple summary of what Q-values can be tolerated at different frequencies follows.
It was shown very clearly that high Q values over 30 Hz were easily audible. Later experiments have shown that the cause does not seem to be group delay or time delay distortion, but primarily an excessive dominance of the frequencies around the pole. The fault is most apparent with transient (dynamic) signals, although it has partly statical reasons. The explanation is that the sonic influence is most pronounced when the pole as a "tone" can "peep out" from the continuous spectrum of a transient.
This explanation also makes it clear why you can accept much higher Q-values from the pole at fh in a vented system (like 3-4) than from a closed box. In the first case, the pole at fh is held back by the effects from the other (fo) pole, which is a highpass filter with significant damping at fh. Closed boxes, however, have only one pole pair, and the amplitude excess around this appears completely unmasked, in spite of much lower Q-values (and lower group delay).
In crossovers of higher order that cross over into another range (filters between two drivers, not filters at either end of the total range), you can quite logically accept even higher Q-values, since the accentuation of the poles in the response curve is eliminated completely when the two filter halves work together. (Thermal problems using passive filters with high Q-values should not be ignored.)
In the infrasonic range (frequencies less than 30 Hz) the ear obviously accepts much higher Q-values. This is valid for both closed and vented boxes.
Especially with closed boxes, distortion and demands for extreme cone size and displacement will accompany high Q-values at very low frequencies.
I will not present a table of allowable Q-values because of the reasons discussed above (the audibility comes from the amplitude characteristic of high Q-values), but instead a table with allowable amplitudes at different frequencies. Do not forget that alignments according to this table are likely to produce unnecessary distortion in a practical speaker. Even if they do not contribute any audible timbre faults, there are good reasons to avoid them.
|Frequency|Allowable amplitude|
|30 Hz|+1.5 dB|
|25 Hz|+2.5 dB|
|20 Hz|+4.0 dB|
|16 Hz|+7.0 dB|
|13 Hz|+10.0 dB|
|10 Hz|+15.0 dB|
|7 Hz|+25.0 dB|
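For frequencies between the tabulated points, one possible way to read the table is to interpolate between neighbouring entries. The sketch below interpolates linearly in dB over a logarithmic frequency axis; that interpolation scheme and the function name are assumptions made for this illustration, since the text only gives the seven tabulated values.

```python
import math

# (frequency in Hz, allowable amplitude in dB), transcribed from the table.
ALLOWED_EXCESS_DB = [
    (7.0, 25.0), (10.0, 15.0), (13.0, 10.0), (16.0, 7.0),
    (20.0, 4.0), (25.0, 2.5), (30.0, 1.5),
]

def allowed_excess_db(freq_hz):
    """Allowable amplitude at freq_hz, interpolated between table entries.

    Interpolation is linear in dB against log10(frequency), which is an
    assumption; the table covers 7-30 Hz only.
    """
    if not (7.0 <= freq_hz <= 30.0):
        raise ValueError("the table covers 7-30 Hz only")
    pairs = zip(ALLOWED_EXCESS_DB, ALLOWED_EXCESS_DB[1:])
    for (f_lo, db_lo), (f_hi, db_hi) in pairs:
        if f_lo <= freq_hz <= f_hi:
            span = math.log10(f_hi) - math.log10(f_lo)
            t = (math.log10(freq_hz) - math.log10(f_lo)) / span
            return db_lo + t * (db_hi - db_lo)

for f in (7.0, 8.5, 20.0, 30.0):
    print(f"{f:4.1f} Hz: up to +{allowed_excess_db(f):.1f} dB")
```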
Translated by Per Arne Almeflo, Sonic Design, with permission from the author.