Pro Audio Files

Your Ears Are Colorblind, and Other Analogies for MP3 Compression

If your social media newsfeed is anything like mine, then it is regularly filled with videos, articles, and memes related to audio, intermixed with common viral content.

Several posts floating around recently caught my attention. Some were explicitly related to audio, while others were not, but they got me thinking about audio anyway.

Visual Analogies for MP3 Compression

To begin with, there is a popular meme that shows a picture of Scarlett Johansson in various stages of make-up and Photoshop editing. This picture is meant to provide a visual analogy to the several stages of music production – from mixing to mastering. At the bottom is a pixelated version of the “Mastering” photo. This degraded version represents “MP3 compression.”

Around the same time I saw the picture of Scarlett Johansson, I also saw a different post of a TED Talk on MP3 compression.

The speaker in the video discusses the concept of audio compression by using a similar visual analogy: Pablo Picasso’s Guernica. Audio compression is demonstrated by shrinking the painting to minuscule dimensions.

Besides the visual picture, the speaker provides an explanation of his perspective, implying that MP3 compression removes the energy and soul of a musical recording.

Beyond removing intangible elements of a recording, the visual analogy more specifically implies that MP3 compression removes data uniformly from a recording.

The speaker goes on to explain how we perceive the missing data: “psychoacoustics helps you restore what has been blanked out.”

A similar claim and analogy is provided by Andrew Scheps in this video.

In this video, MP3 data compression is illustrated using a text example where specific letters are removed. The claim is that our brain can still interpret the meaning of the words even with the missing characters because our “brain fills in missing information.”

Are These Analogies Appropriate?

However, none of these analogies properly explains MP3 compression. The claims are overly simplistic and insufficient, to the point of being misleading and incorrect.

MP3 compression does not uniformly remove information or uniformly degrade quality from a song (with the exception of the very lowest bit-rates that uniformly remove high frequencies).

MP3 compression does not remove information that our brain can obviously notice and fill in.

Rather, MP3 compression selectively removes information that our brain would never receive anyway.

MP3 compression works because the cochlea in our auditory system has limitations, not because our brain fills in the gaps.
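
One such limitation can be sketched in a few lines of Python. This is only an illustration, not how an MP3 encoder is actually implemented. It uses Terhardt's widely quoted approximation of the threshold of hearing in quiet, and it assumes each component's level is already known in dB SPL (a real encoder cannot know the playback volume). The function names and the tone levels are made up for this example. Real encoders also do not simply delete components; they spend fewer bits on what is unlikely to be heard.

```python
import math

def threshold_in_quiet_db(freq_hz):
    """Approximate threshold of hearing in quiet (dB SPL), after Terhardt."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

def audible_components(components):
    """Keep only (freq_hz, level_db) pairs that sit above the threshold."""
    return [(f, level) for f, level in components
            if level > threshold_in_quiet_db(f)]

# Three equally quiet tones (hypothetical levels, assumed to be in dB SPL)
tones = [(100.0, 15.0), (1000.0, 15.0), (16000.0, 15.0)]
print(audible_components(tones))  # only the 1 kHz tone survives
```

In this toy case, the 100 Hz and 16 kHz tones fall below the approximate threshold, so discarding them would not change what the listener receives.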

Cochlear Mechanics

When sound enters our ears, it passes through several parts of our auditory system, including the cochlea, before it ever reaches our brain.

When sound vibrations pass through our cochlea, the vibrations travel through fluid. Lower frequencies travel further through the fluid, and therefore travel further through the cochlea. Higher frequencies are absorbed sooner in fluid, and do not travel as far through the cochlea.

Hair cell detectors located along the cochlea are associated with various frequencies, from low to high. Because lower frequencies travel further through the cochlea, there is the potential for loud low frequencies to stimulate the hair cell detectors for low frequencies, as well as the hair cell detectors for high frequencies.

Therefore, loud lower frequencies have the potential to mask quiet higher frequencies. This masking means the brain will not receive the additional information from the higher frequencies because it is lost due to the lower frequency information.
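
The asymmetry described above can be illustrated with a toy model. This is a rough sketch, not the psychoacoustic model from the MP3 standard. The frequency-to-Bark conversion is the common Zwicker and Terhardt approximation, but the slopes and the offset are round numbers assumed purely for illustration, and the function names are hypothetical.

```python
import math

# Illustrative constants, NOT taken from the MP3 standard (assumptions)
LOWER_SLOPE_DB_PER_BARK = 27.0  # threshold falls off quickly below the masker
UPPER_SLOPE_DB_PER_BARK = 10.0  # and more slowly above it
MASKER_OFFSET_DB = 10.0         # threshold sits this far under the masker level

def hz_to_bark(freq_hz):
    """Approximate critical-band rate (Bark), Zwicker and Terhardt formula."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))

def masking_threshold_db(masker_hz, masker_db, probe_hz):
    """Level below which a probe tone is hidden by a single masker tone."""
    distance = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    if distance >= 0:
        slope = UPPER_SLOPE_DB_PER_BARK  # probe is above the masker
    else:
        slope = LOWER_SLOPE_DB_PER_BARK  # probe is below the masker
    return masker_db - MASKER_OFFSET_DB - slope * abs(distance)

def is_masked(masker_hz, masker_db, probe_hz, probe_db):
    """True if the probe tone falls under the masker's threshold."""
    return probe_db < masking_threshold_db(masker_hz, masker_db, probe_hz)

# A loud 500 Hz tone against a quiet tone just above and just below it
print(is_masked(500, 80, 700, 40))  # True: the higher tone is masked
print(is_masked(500, 80, 300, 40))  # False: the lower tone is still heard
```

With these assumed numbers, the same quiet tone is hidden when it sits above the loud tone in frequency, but not when it sits below it.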

The conceptual question then becomes, “if there is information in an audio signal that will never reach the brain for analysis, then is the extra information worth sending through the auditory system to be lost?”

Your Ears are Colorblind

In my opinion, a better visual analogy for MP3 compression is to consider color blindness. Typically, people who are colorblind can see some colors, but cannot perceive a difference between certain colors.

Color blindness does not occur because a person’s brain fails to interpret different colors.

Color blindness occurs if light-sensitive cells in the retina fail to respond appropriately to variations in wavelengths of light that enable people to see separate colors.

Consider what would happen for someone who is colorblind if an image contains colors that cannot be differentiated.

There is a viral post I recently saw about Bob Ross painting a picture for a colorblind fan.

In this example, the painter chose to limit himself to only using colors that could be perceived by the viewer. Colors are intentionally excluded that cannot be perceived.

MP3 compression is very similar. It is a process to limit the information in an audio signal to only what can be perceived by the listener.

Conclusion

If you want a visual analogy for MP3 compression, it is a beautiful painting created specifically for your perceptual system.

Eric Tarr

Eric Tarr is a musician, audio engineer, and producer based in Columbus, Ohio. He is currently a Professor of Audio Engineering Technology at Belmont University in Nashville, TN.
  • Joseph Sannicandro

    Please check out Jonathan Sterne’s MP3: The Meaning of a Format, a cultural study of the 90-year history of audio compression.

  • Migari

    I think there are some major caveats here

    First of all, there is an assumption that low frequencies are always there. What will it sound like if the bass gives out, isn’t there to begin with or is suboptimal in any way to keep up the masking effect it potentially has?

    With all masking done, will the music be believable in every detail if you amplify it to PA levels? How can you know that the lossy compression won’t cause playback artefacts noticeable at this loudness that the lossless original won’t display?

    Also, how can you be sure how the subconscious experience plays out in an audience listening for many hours, if there’s little or no research into that? It’s my experience in music that it is the small nuances that matter. Whatever some may say, quality in content as well as in performance works.

    Another aspect is, what if the consumer is a DJ modulating the song in a DJ program, changing the pitch as well as stretching the timing and filtering it and mixing it with other sounds. Will a lossy file work under that circumstance?

    I also disagree that we know everything about how hearing works or how people experience sound, which I take as implied here. There are too many assumptions about the level of knowledge out there.

  • Justin C.

    Very apt analogy Eric!

    I have a similar analogy you might enjoy:

    I like to think of a good audio compression codec as a bit like a film camera capturing 24 frames per second.

    Sure, we’re “throwing away” some data, just like a film camera can’t take a snapshot of every single one of the (infinite) subdivisions of time.

    Yet, try as we might, we just can’t perceive 24 frames shown to us in a second as anything other than a moving image.

    No amount of practice or appreciation will allow our eyes and brains to see it as a series of still images.

    Much is the same with a good audio codec.