Challenging MP3 Files with a White-noise Spectrum

By Jerry Haberkost and Gordon Waverly, PCUGSJ
PC Users Group of South Jersey
From the January 2003 issue of the I/O Port Newsletter

We recently discovered that GoldWave, the superb shareware audio editor (goldwave.com), includes an audio spectrum analyzer. A spectrum analyzer is something neither of us could ever afford for our home labs. Like kids with a new toy, we had to find a way to put it to use and test its mettle.

The MP3 file compression format intrigues us because it achieves such great compression ratios with essentially no loss in apparent audio playback quality. It struck us that the spectrum analyzer could tell us some things about the MP3 algorithm.

We decided to use a computer-generated white-noise signal, recorded in a WAV file, as the challenging audio material. We then converted this WAV file to 32 individual MP3 files using GoldWave. Comparing each converted file with the source file on the spectrum analyzer gave us a pretty good idea of what information the MP3 algorithm discards in the interest of file compression.

What Is White Noise?

A white-noise signal is widely used in audio circles as a test signal because it provides a full spectrum of audio energy. Converted into sound, white noise sounds much like the hiss a TV set makes when it’s tuned to an unoccupied channel. It is often compared to white light, which contains all the colors of the visible spectrum, just as white noise contains all audible frequencies. This random, broadband character makes it quite useful for drawing comparisons between audio systems.

Since we needed only a short sample of the white-noise signal, we recorded a 30-second burst. Using a short file provides enough material for analysis and reduces the time involved in each file conversion to about 45 seconds.
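For readers who would like to build a similar test file without GoldWave, here is a minimal sketch in Python. It is not the method we used; it assumes uniformly distributed random samples, mono audio, and the CD-quality parameters of our reference file. The function name `white_noise_wav` and its arguments are our own invention for illustration.

```python
import io
import random
import struct
import wave


def white_noise_wav(seconds, rate=44100, peak=0.5, seed=0):
    """Return the bytes of a mono, 16-bit WAV file filled with white noise.

    Every sample is drawn independently from a uniform distribution,
    which gives a flat average spectrum. `peak` is the fraction of full
    scale; 0.5 mirrors a recording made at one-half maximum level.
    """
    rng = random.Random(seed)
    count = int(seconds * rate)
    limit = int(peak * 32767)            # 32767 is full scale for 16-bit audio
    samples = [rng.randint(-limit, limit) for _ in range(count)]

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)              # mono (an assumption)
        wav.setsampwidth(2)              # 2 bytes = 16 bits per sample
        wav.setframerate(rate)
        # WAV files store samples as little-endian signed integers.
        wav.writeframes(struct.pack("<%dh" % count, *samples))
    return buffer.getvalue()
```

Calling `white_noise_wav(30)` and writing the result to disk should give a file of roughly 2.6 megabytes: 30 seconds times 44,100 samples times 2 bytes, plus a 44-byte header.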

The Reference File

First, some detail about the WAV reference file: We recorded it at one-half maximum level (to avoid clipping) at CD quality (44.1kHz sampling, 16-bit depth), which resulted in the spectrogram shown below. Compare this with the others in this article.

Note the uniform distribution of average energy from 0 to 22kHz. The resulting 30-second file uses 2.7 megabytes of storage. The dips are characteristic of a white-noise spectrum; they are entirely transient in nature and average out over time. In real life, the spectrogram is in constant motion around a constant average level. The “snapshot” nature of the illustration cannot convey this motion.
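The claim that the dips average out can be checked numerically. The sketch below is not how GoldWave's analyzer works (we don't know its internals); it simply takes a plain, slow discrete Fourier transform of several short frames of random samples and averages the power in four equal frequency bands. The function name `band_powers`, the frame size, and the band count are all arbitrary choices of ours.

```python
import cmath
import math
import random


def band_powers(samples, frame_size=256, bands=4):
    """Average DFT power of `samples` in equal-width frequency bands."""
    frames = len(samples) // frame_size
    if frames == 0:
        raise ValueError("need at least one full frame of samples")
    half = frame_size // 2  # bins from 0 Hz to just below half the sampling rate
    # Precompute the complex exponentials used by the DFT.
    twiddle = [cmath.exp(-2j * math.pi * i / frame_size)
               for i in range(frame_size)]
    totals = [0.0] * half
    for f in range(frames):
        frame = samples[f * frame_size:(f + 1) * frame_size]
        for k in range(half):
            x = sum(frame[n] * twiddle[(k * n) % frame_size]
                    for n in range(frame_size))
            totals[k] += abs(x) ** 2     # accumulate power across frames
    per_band = half // bands
    return [sum(totals[b * per_band:(b + 1) * per_band]) / (per_band * frames)
            for b in range(bands)]


rng = random.Random(1)                   # fixed seed so runs are repeatable
noise = [rng.uniform(-0.5, 0.5) for _ in range(256 * 8)]
powers = band_powers(noise)
print(max(powers) / min(powers))         # should be close to 1 for white noise
```

A single frame shows large bin-to-bin swings, much like the snapshot spectrogram; averaging over more frames pulls the band powers toward a common level.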

The Smallest MP3 File

First, we converted the huge WAV file using the parameters that result in the smallest MP3 file GoldWave affords: a 16kHz (kilohertz) sampling rate at an 8kbps (kilobits per second) bitrate. The resulting file uses only 30 kilobytes of storage, trading off most of the passband and about half of the amplitude of the original. This is the spectrogram for that file:

Note that the conversion cut the 22kHz passband of the original to less than 1kHz, within an overall bandwidth limited to 8kHz by the 16kHz sampling rate. Although the spectrogram doesn’t show it clearly because its vertical scale is logarithmic, the average level of the signal is down about 50 percent, or 6dB below the original level. The playback couldn’t possibly compare with the original under these terms; it sounded much like what you hear when you put a conch seashell to your ear.
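The link between "50 percent" and "6dB" follows from the standard decibel formula for amplitude ratios, 20 times the base-10 logarithm of the ratio. A short sketch, with function names of our own choosing:

```python
import math


def amplitude_ratio_to_db(ratio):
    """Convert an amplitude (voltage-like) ratio to decibels."""
    return 20.0 * math.log10(ratio)


def db_to_amplitude_ratio(db):
    """Convert decibels back to an amplitude ratio."""
    return 10.0 ** (db / 20.0)


print(round(amplitude_ratio_to_db(0.5), 2))   # -6.02
print(round(db_to_amplitude_ratio(-6.0), 3))  # 0.501
```

Because the scale is logarithmic, a halving of amplitude moves the trace only a small distance on the analyzer display, which is why the drop is easy to miss in the spectrogram.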

Next, we converted the WAV file to an MP3 file with the next set of parameters that GoldWave offers: 16kHz at twice the bitrate, 16kbps. It resulted in a file of 61kB, double the size of the smaller one. This is the spectrogram for that file:

Note that doubling the bitrate nearly quadrupled the passband to almost 4kHz and, in addition, increased the amplitude to nearly that of the original.

Our third conversion, at 16kHz and 24kbps, resulted in a file size of 91kB.

Note that the passband now exceeds 5kHz at about the same amplitude as the 16kbps spectrogram above.

For the fourth conversion, we increased the bitrate to 32kbps, which increased the file size to 122kB and expanded the passband to nearly 7kHz with no loss in amplitude compared to the original signal. Here’s the spectrogram of that file:

The relationship between passband and bitrate became quite apparent at this point: when we increased the bitrate to 40kbps, the passband filled out to the 8kHz bandwidth limit imposed by the 16kHz sampling rate. The file size grew to 152kB. Bitrates beyond 40kbps (48, 56, 64, 80 and 96) made no improvement in the passband, but the storage requirements increased accordingly. At 96kbps the storage requirement went to 365kB for a 30-second file. So the indication is that if you use a 16kHz sampling rate, with its 8kHz bandwidth, you’re wasting storage space if the bitrate exceeds 40kbps.
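The file sizes track the bitrate almost exactly, because a constant-bitrate file holds bitrate times duration bits, and eight bits make a byte. The sketch below makes that arithmetic explicit; it assumes a constant bitrate, a 30-second clip, and kilobytes of 1,000 bytes, and the function name is ours.

```python
def mp3_size_kb(bitrate_kbps, seconds=30):
    """Estimated size in kilobytes of a constant-bitrate file.

    kilobits per second * seconds = kilobits; divide by 8 for kilobytes.
    """
    return bitrate_kbps * seconds / 8


# Sizes reported in the article for the 16kHz conversions.
reported_kb = {8: 30, 16: 61, 24: 91, 32: 122, 40: 152, 96: 365}

for bitrate, reported in sorted(reported_kb.items()):
    print(bitrate, mp3_size_kb(bitrate), reported)
```

The estimates (30, 60, 90, 120, 150 and 360kB) land within about 2 percent of the reported sizes. The small excess could come from a clip slightly longer than 30 seconds or from file overhead; we did not check which.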

Then we increased the sampling rate to 22,050Hz at 8kbps (30kB file) and something strange took place: when we played the MP3 file in GoldWave and WinAmp, the sound became a random series of MIDI-like chords instead of the “rush” of a noise spectrum.

When we wrote to Chris Craig, the developer of GoldWave, about this phenomenon, he wrote back: “That is pretty typical when using very low bitrates [with white-noise]. There just is not enough room [in the algorithm] to store the broad white-noise spectrum at that bitrate.”

When we increased the bitrate to 16kbps (61kB file), the resulting MP3 file played back as a noise spectrum limited to about 3kHz, in much the same manner as the earlier conversions. The passband increased with each increase in bitrate until the 11kHz bandwidth limitation kicked in at 56kbps, with a 213kB file size. It follows, then, that a bitrate greater than 56kbps with a 22,050Hz sampling rate merely uses more storage space without any improvement in passband. (See chart below.)

When we increased the sampling rate to 24kHz with a 40kbps bitrate (152kB file), the falloff occurred at about 9kHz. Using a bitrate beyond 64kbps (243kB file) adds no more passband, because the passband has already reached the limit set by the sampling rate, as shown in the chart.

With a 32kHz sampling rate, the lowest bitrate available is 64kbps, with a 243kB file size. This limits the passband to a point just above 13kHz. Increasing the bitrate beyond 80kbps (304kB file) adds no more passband beyond 16kHz.

With a 44.1kHz sampling rate and a 32kbps bitrate (122kB file), the cutoff frequency runs about 6kHz. Increasing the bitrate beyond 112kbps (425kB file) only increases storage space.
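The bandwidth ceiling for each sampling rate is simply half the sampling rate (the Nyquist limit). Once the passband reaches that ceiling, extra bitrate has nothing left to buy. A small sketch, with function names of our own choosing:

```python
def nyquist_bandwidth_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2


def passband_is_full(passband_hz, sample_rate_hz):
    """True when the passband already fills the available bandwidth."""
    return passband_hz >= nyquist_bandwidth_hz(sample_rate_hz)


for rate in (16000, 22050, 24000, 32000, 44100):
    print(rate, nyquist_bandwidth_hz(rate))

# The 16kHz file at 40kbps measured an 8kHz passband: already full.
print(passband_is_full(8000, 16000))   # True
# The 44.1kHz file at 32kbps measured about 6kHz: far from full.
print(passband_is_full(6000, 44100))   # False
```

For 22,050Hz the exact figure is 11,025Hz, which we round to about 11kHz in the text.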

To summarize, here’s the data in tabular form:

Sample Rate (Hz)   Bitrate (kbps)   Bandwidth (Hz)   Passband (Hz)   File Size (30 sec)

16,000             8                8,000            1,000           30kB
16,000             16               8,000            4,000           61kB
16,000             24               8,000            6,000           91kB
16,000             32               8,000            7,000           122kB
16,000             40               8,000            8,000           152kB
16,000             48               8,000            8,000           182kB
16,000             56               8,000            8,000           213kB
16,000             64               8,000            8,000           243kB
16,000             80               8,000            8,000           304kB
16,000             96               8,000            8,000           365kB

22,050             8                11,100           See Text        30kB
22,050             16               11,100           3,000           61kB
22,050             24               11,100           6,000           91kB
22,050             32               11,100           9,000           122kB
22,050             56               11,100           11,100          213kB
22,050             40               11,100           11,100          151kB
22,050             48               11,100           11,100          182kB
22,050             56               11,100           11,100          213kB

24,000             40               12,000           9,000           151kB
24,000             48               12,000           10,000          182kB
24,000             56               12,000           11,000          213kB
24,000             64               12,000           12,000          243kB

32,000             64               16,000           13,000          243kB
32,000             80               16,000           16,000          304kB

44,100             32               22,050           6,000           122kB
44,100             40               22,050           10,000          151kB
44,100             48               22,050           12,000          182kB
44,100             56               22,050           14,000          213kB
44,100             64               22,050           16,000          243kB
44,100             80               22,050           19,000          304kB
44,100             96               22,050           20,000          365kB
44,100             112              22,050           22,050          425kB

44,100             WAV              22,050           22,050          2676kB

The last line in the table lists the data for the WAV file, which we converted 32 times into MP3 files for the experiment. Note that the tabulation includes much more data than is referenced in the text. Note also that the 44.1kHz/112kbps MP3 file delivered the same passband as the WAV file with a 6:1 compression ratio.
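The 6:1 figure comes straight from the file sizes in the table. As a quick check, using a helper name of our own:

```python
def compression_ratio(original_kb, compressed_kb):
    """How many times smaller the compressed file is than the original."""
    return original_kb / compressed_kb


WAV_KB = 2676  # reference WAV file, from the last row of the table

print(round(compression_ratio(WAV_KB, 425), 1))  # 6.3  (44.1kHz at 112kbps)
print(round(compression_ratio(WAV_KB, 30), 1))   # 89.2 (16kHz at 8kbps)
```

The second line is a reminder of the trade involved: the smallest file is about 89 times smaller than the WAV file, but it keeps only about 1kHz of passband.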

Note also that there’s no data in the chart beyond 44,100Hz at 112kbps, although GoldWave provides six more MP3 conversion levels (128, 160, 192, 224, 256 and 320kbps) at 44.1kHz and 48kHz sampling rates. Chances are that these parameters increase file size with little improvement in passband.

In the interest of print space, we chose not to include spectrograms for the files with sampling rates above 16kHz. For all practical purposes they are redundant: they look virtually identical to those shown here except for the differences in bandwidth imposed by the change in sampling rate. The curve on each had essentially the same characteristics as those included here.

In conclusion, bear in mind that a white-noise spectrum is very difficult to compress, because it lacks the pitch and amplitude changes of program material. As a result, it really challenges the file-compression system. While it points up the passband limitations, it doesn’t take into account the psychoacoustic features of the encoder when it removes the redundant information that we don’t perceive in order to reduce the file size.

Now, we’re curious to see what MP3 compression does to a complex musical recording.

Your comments are always welcome at tsokrebah@juno.com or lyrevaw@juno.com.







Tulsa Computer Society 1/02/2003
Don Singleton, President