The MP3 file compression format intrigues us because it achieves such great compression ratios with essentially no loss in apparent audio playback quality. It struck us that the spectrum analyzer could tell us some things about the MP3 algorithm.
We decided to use a computer-generated white-noise signal, recorded in a WAV file, as a deliberately challenging test signal. We then converted this WAV file to 32 individual MP3 files using GoldWave software. Comparing each of these converted files with the source file, using the spectrum analyzer, gave us a pretty good idea of what information the MP3 algorithm discards in the interest of file compression.
White noise is widely used in audio circles as a test signal because it provides a full spectrum of audio energy. Converted into sound, it resembles the noise a TV set makes when it’s tuned to an unoccupied channel. It is often described as being to sound what white sunlight is to visible light: an even mix of all frequencies. This random, full-spectrum character makes it quite useful for drawing comparisons between audio systems.
Since we needed only a short sample of the white-noise signal, we recorded a 30-second burst. Using a short file provides enough material for analysis and reduces the time involved in each file conversion to about 45 seconds.
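For readers who want to build a similar test file, here is a minimal Python sketch. It is an illustration only, not the tool used for this article. It assumes a mono file at CD quality (44.1kHz, 16-bit) with peaks held to one-half of full scale, and every function and constant name in it is hypothetical.

```python
import os
import random
import struct
import tempfile
import wave

SAMPLE_RATE = 44_100   # Hz, CD quality
SAMPLE_WIDTH = 2       # bytes per sample (16-bit)
CHANNELS = 1           # assumption: mono
PEAK = 32767 // 2      # one-half of full scale, to stay clear of clipping


def write_white_noise(path, seconds, seed=0):
    """Write uniformly distributed random samples to a 16-bit PCM WAV file."""
    rng = random.Random(seed)
    n = int(SAMPLE_RATE * seconds)
    samples = [rng.randint(-PEAK, PEAK) for _ in range(n)]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        # WAV stores 16-bit samples as little-endian signed integers
        wav.writeframes(struct.pack(f"<{n}h", *samples))
    return n


def expected_data_bytes(seconds):
    """Size of the raw audio data, not counting the small WAV header."""
    return int(SAMPLE_RATE * seconds) * SAMPLE_WIDTH * CHANNELS


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "white_noise.wav")
    write_white_noise(target, 30)
    print(target, os.path.getsize(target), "bytes")
```

Under these assumptions a 30-second file holds 2,646,000 bytes of audio data, which is in the same range as the size of the reference file used here.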
First, some detail about the WAV reference file: we recorded it at one-half maximum level (to avoid clipping) and at CD quality (44.1kHz sampling; 16-bit depth). The resulting spectrogram is shown below. Compare it with the others in this article.

Note the uniform distribution of average energy from 0 to 22kHz. The resulting 30-second file uses 2.7 megabytes of storage. The dips are characteristic of a white-noise spectrum and they are entirely transient in nature, averaging out over time. In real life, the spectrogram is in constant motion around a constant average level. The “snapshot” character of the illustration cannot convey this motion.
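The way the dips average out can be demonstrated with a small Python sketch. It is an illustration under our own assumptions, not the spectrum analyzer used for this article: it uses a slow, naive DFT on short blocks of random samples, and the block size, block count and function names are all hypothetical choices.

```python
import cmath
import math
import random


def power_spectrum(block):
    """Naive DFT power for bins 1 .. N/2 - 1 (DC and Nyquist bins skipped)."""
    n = len(block)
    powers = []
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(block))
        powers.append(abs(acc) ** 2 / n)
    return powers


def noise_block(rng, n):
    """One block of uniformly distributed white noise."""
    return [rng.uniform(-0.5, 0.5) for _ in range(n)]


def spread_db(powers):
    """Distance in dB between the strongest and weakest frequency bin."""
    return 10 * math.log10(max(powers) / min(powers))


rng = random.Random(1)
N = 64          # samples per block
BLOCKS = 200    # number of snapshots to average

# A single snapshot: peaks and dips are scattered across the spectrum.
snapshot = power_spectrum(noise_block(rng, N))

# The average of many snapshots: the peaks and dips even out.
totals = [0.0] * len(snapshot)
for _ in range(BLOCKS):
    for i, p in enumerate(power_spectrum(noise_block(rng, N))):
        totals[i] += p
averaged = [t / BLOCKS for t in totals]

print(f"single snapshot spread: {spread_db(snapshot):.1f} dB")
print(f"averaged spread:        {spread_db(averaged):.1f} dB")
```

The single snapshot shows a wide gap between its strongest and weakest bins, while the averaged spectrum is nearly flat.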

Our first conversion, at a 16kHz sampling rate and a bitrate of 8kbps (kilobits per second), produced a 30kB file. Note that the conversion cut the 22kHz of the signal to a passband of less than 1kHz, with the entire bandwidth limited to 8kHz. Although the spectrogram doesn’t show it because its vertical scale is logarithmic, the average level of the signal is down about 50 percent, or 6dB below the original level. The playback sound we heard couldn’t possibly compare with the original under these conditions, sounding much like the squawky sound a conch seashell makes when you put it to your ear.
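The 50 percent and 6dB figures say the same thing, provided the percentage refers to signal amplitude rather than power. A short Python check, with hypothetical function names, shows the arithmetic:

```python
import math


def amplitude_ratio_to_db(ratio):
    """Decibels for a ratio of amplitudes (voltage, sample value)."""
    return 20 * math.log10(ratio)


def power_ratio_to_db(ratio):
    """Decibels for a ratio of powers."""
    return 10 * math.log10(ratio)


print(f"half amplitude: {amplitude_ratio_to_db(0.5):.2f} dB")
print(f"half power:     {power_ratio_to_db(0.5):.2f} dB")
```

Half the amplitude works out to roughly 6dB down, while half the power would be only about 3dB down.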

Next, we converted the WAV file to an MP3 file with the next set of parameters that GoldWave offers: 16kHz at twice the bitrate, 16kbps. It resulted in a file of 61kB, double the size of the smaller one. This is the spectrogram for that file:
Note that doubling the bitrate nearly quadrupled the passband to almost 4kHz and, in addition, increased the amplitude to nearly that of the original.
Our third conversion, at 16kHz and 24kbps, resulted in a file size of 91kB.
Note that the passband now exceeds 5kHz at about the same amplitude as the 16kbps spectrogram above.


For the fourth conversion, we increased the bitrate to 32kbps, which increased the file size to 122kB and expanded the passband to nearly 7kHz with no loss in amplitude compared with the original signal. Here’s the spectrogram of that file:
The relationship between passband and bitrate became quite apparent when we increased the bitrate to 40kbps: the passband filled out to the 8kHz bandwidth limit imposed by the 16kHz sampling rate, and the file size grew to 152kB. Higher bitrates (48, 56, 64, 80 and 96kbps) made no improvement in the passband, but the storage requirements increased accordingly. At 96kbps the storage requirement went to 365kB for a 30-second file. So, the indication is that if you use a 16kHz sampling rate with an 8kHz bandwidth, you’re wasting storage space if the bitrate exceeds 40kbps.
Then, we increased the sampling rate to 22,050Hz at 8kbps (30kB file), and something strange took place: when we played the MP3 file in GoldWave and WinAmp, the sound became a random series of MIDI-like chords instead of the “rush” of a noise spectrum.
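The file sizes reported here follow almost directly from the bitrate, if the bitrate is read as kilobits per second: multiply by the duration and divide by eight to get kilobytes. The Python sketch below is a rough estimate under that assumption, with a hypothetical function name; it ignores headers and any other overhead.

```python
def estimated_mp3_size_kb(bitrate_kbps, seconds=30):
    """Constant-bitrate estimate: kilobits per second * seconds / 8 bits per byte."""
    return bitrate_kbps * seconds / 8


# File sizes reported in this article for the 16kHz conversions
reported_kb = {8: 30, 16: 61, 24: 91, 32: 122, 40: 152, 96: 365}

for bitrate, reported in reported_kb.items():
    estimate = estimated_mp3_size_kb(bitrate)
    print(f"{bitrate:>3} kbps: estimated {estimate:5.0f} kB, reported {reported} kB")
```

The estimates land within about two percent of the reported sizes. The small remainder is not accounted for by this simple formula.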
When we wrote to Chris Craig, the developer of GoldWave, about this phenomenon, he wrote back: “That is pretty typical when using very low bitrates [with white-noise]. There just is not enough room [in the algorithm] to store the broad white-noise spectrum at that bitrate.”
When we increased the bitrate to 16kbps (61kB file), the resulting MP3 file playback came out as a noise spectrum limited to about 3kHz, in much the same manner as the earlier conversions. The passband increased proportionately with each increase in bitrate until the 11kHz bandwidth limitation kicked in at 56kbps with a 213kB file size. It follows, then, that a bitrate greater than 56kbps with a 22,050Hz sampling rate merely uses more storage space without any improvement in passband. (See chart below.)
When we increased the sampling rate to 24kHz with a 40kbps bitrate (152kB file), the falloff occurred at about 9kHz. Using a bitrate beyond 64kbps (243kB file) adds no more passband beyond the sampling rate limitation, as shown in the chart.
With a 32kHz sampling rate, the lowest bitrate available is 64kbps, with a 243kB file size. This limits the passband to a point just above 13kHz. Increasing the bitrate beyond 80kbps (304kB file) adds no more passband beyond 16kHz.
Using a 44.1kHz sampling rate and a 32kbps bitrate (122kB file), the cutoff frequency is about 6kHz. Increasing the bitrate beyond 112kbps (425kB file) only increases storage space.
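The bandwidth ceiling at each sampling rate is simply half the sampling rate, the Nyquist limit. A short Python sketch, with a hypothetical function name, lists the ceiling for each rate used in these conversions:

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a sampled signal can carry: half the sampling rate."""
    return sample_rate_hz / 2


SAMPLE_RATES_HZ = [16_000, 22_050, 24_000, 32_000, 44_100]

for rate in SAMPLE_RATES_HZ:
    print(f"{rate:>6} Hz sampling -> {nyquist_hz(rate):>7.0f} Hz bandwidth")
```

For the 22,050Hz rate the exact figure is 11,025Hz; the table lists it as 11,100Hz, slightly above the exact value.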
To summarize, here’s the data in tabular form:
| Sample Rate (Hz) | Bitrate (kbps) | Bandwidth (Hz) | Passband (Hz) | File Size (30 sec) |
|---|---|---|---|---|
| 16,000 | 8 | 8,000 | 1,000 | 30kB |
| 16,000 | 16 | 8,000 | 4,000 | 61kB |
| 16,000 | 24 | 8,000 | 6,000 | 91kB |
| 16,000 | 32 | 8,000 | 7,000 | 122kB |
| 16,000 | 40 | 8,000 | 8,000 | 152kB |
| 16,000 | 48 | 8,000 | 8,000 | 182kB |
| 16,000 | 56 | 8,000 | 8,000 | 213kB |
| 16,000 | 64 | 8,000 | 8,000 | 243kB |
| 16,000 | 80 | 8,000 | 8,000 | 304kB |
| 16,000 | 96 | 8,000 | 8,000 | 365kB |
| 22,050 | 8 | 11,100 | See Text | 30kB |
| 22,050 | 16 | 11,100 | 3,000 | 61kB |
| 22,050 | 24 | 11,100 | 6,000 | 91kB |
| 22,050 | 32 | 11,100 | 9,000 | 122kB |
| 22,050 | 56 | 11,100 | 11,100 | 213kB |
| 22,050 | 40 | 11,100 | 11,100 | 151kB |
| 22,050 | 48 | 11,100 | 11,100 | 182kB |
| 22,050 | 56 | 11,100 | 11,100 | 213kB |
| 24,000 | 40 | 12,000 | 9,000 | 151kB |
| 24,000 | 48 | 12,000 | 10,000 | 182kB |
| 24,000 | 56 | 12,000 | 11,000 | 213kB |
| 24,000 | 64 | 12,000 | 12,000 | 243kB |
| 32,000 | 64 | 16,000 | 13,000 | 243kB |
| 32,000 | 80 | 16,000 | 16,000 | 304kB |
| 44,100 | 32 | 22,050 | 6,000 | 122kB |
| 44,100 | 40 | 22,050 | 10,000 | 151kB |
| 44,100 | 48 | 22,050 | 12,000 | 182kB |
| 44,100 | 56 | 22,050 | 14,000 | 213kB |
| 44,100 | 64 | 22,050 | 16,000 | 243kB |
| 44,100 | 80 | 22,050 | 19,000 | 304kB |
| 44,100 | 96 | 22,050 | 20,000 | 365kB |
| 44,100 | 112 | 22,050 | 22,050 | 425kB |
| 44,100 | WAV | 22,050 | 22,050 | 2676kB |
The last line in the table lists the data for the WAV file, which we converted 32 times into MP3 files for the experiment. Note that the tabulation includes much more data than is referenced in the text. Note also that the 44.1kHz/112kbps MP3 file delivered the same passband as the WAV file with a 6:1 compression ratio.
Note also that there’s no data in the chart beyond a 44,100Hz sampling rate at 112kbps, although GoldWave provides six more MP3 conversion levels (128, 160, 192, 224, 256 and 320kbps) at 44.1kHz and 48kHz sampling rates. Chances are that these settings increase file size with little improvement in passband.
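The compression ratio can be checked from the file sizes in the table. The Python sketch below uses the sizes listed for the 44,100Hz conversions; the function and variable names are hypothetical.

```python
WAV_SIZE_KB = 2676

# File sizes (kB) from the table for the 44,100Hz conversions, keyed by bitrate
MP3_SIZES_KB = {32: 122, 40: 151, 48: 182, 56: 213, 64: 243, 80: 304, 96: 365, 112: 425}


def compression_ratio(original_kb, compressed_kb):
    """How many times smaller the compressed file is than the original."""
    return original_kb / compressed_kb


for bitrate, size_kb in MP3_SIZES_KB.items():
    ratio = compression_ratio(WAV_SIZE_KB, size_kb)
    print(f"bitrate {bitrate:>3}: {ratio:4.1f}:1")
```

At the highest bitrate in the table the ratio comes to about 6.3:1, in line with the 6:1 figure quoted above, and at the lowest bitrate listed for this sampling rate it is close to 22:1.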
In the interest of print space, we chose not to include spectrograms for the files in this article with sampling rates above 16kHz because, for all practical purposes, they appear redundant and virtually identical with those shown here except for the differences in bandwidth imposed by the change in sampling rate. The curve on each had essentially the same characteristics as those included here.
In conclusion, bear in mind that a white-noise spectrum is very difficult to compress, because it lacks the pitch and amplitude changes of program material. As a result, it really challenges the file-compression system. While it points up the passband limitations, it doesn’t take into account the psychoacoustic features of the encoder when it removes the redundant information that we don’t perceive in order to reduce the file size.
Now, we’re curious to see what MP3 compression does to a complex musical recording.
Your comments are always welcome at tsokrebah@juno.com or lyrevaw@juno.com.
For more information on the Tulsa Computer Society click here