Digital Audio: Aliasing Explained
The world of digital audio is opening up to many consumers as PCs become more powerful at lower cost. Programs like Max/MSP (with Max for Live also available) allow anyone with the slightest interest in digital audio to get down to manipulating sound at the sample level. Following my colleague Matthew Weiss’ article on the basics of digital audio, I thought I’d provide a more detailed explanation of aliasing, something that is important to understand when dealing with any digital audio.
Eight Bits is a Byte, Four Bits is a Nibble
As stated in Matthew’s article, all digital audio is mathematically represented; you don’t need to understand binary to appreciate that the audio being digitized is being turned into numbers for the computer. The Analog-Digital Converter (ADC) is essentially receiving a graph. It may help to think of it as a camera taking pictures rapidly, with each picture showing a tiny portion of the wave, ready to be reconstructed at a later date. A video camera may take 25 pictures a second, but an example for audio would be 44,100 pictures (samples) a second; this is your sampling rate. The information each sample contains is the voltage of the wave at the point the sample is taken, a little like a co-ordinate on a map. So for a wave going up you may get 0.001 V, followed by the next sample at 0.002 V. All of the samples are then put together to create the digital representation of the waves we’re used to seeing in our DAWs. Aliasing is caused by the sample rate not being high enough to accurately recreate the wave.
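To make the "pictures per second" idea concrete, here is a minimal Python sketch of what a converter is conceptually doing. It assumes an idealised 440 Hz sine wave with a 1 V peak; the function name `sample_sine` and those values are my own choices for illustration, not anything a real ADC exposes.

```python
import math

SAMPLE_RATE = 44_100  # samples per second (the CD audio rate)

def sample_sine(freq_hz, amplitude_v, num_samples, sample_rate=SAMPLE_RATE):
    """Return the voltage of an ideal sine wave at each sampling instant."""
    return [
        amplitude_v * math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(num_samples)
    ]

# Assumed input: a 440 Hz tone with a 1 V peak, first five samples only.
samples = sample_sine(freq_hz=440, amplitude_v=1.0, num_samples=5)

for n, volts in enumerate(samples):
    microseconds = n / SAMPLE_RATE * 1e6
    print(f"sample {n}: t = {microseconds:7.2f} us, {volts:+.4f} V")
```

Each printed line is one "picture": a time and the voltage measured at that time. The values climb from sample to sample because the wave is on its way up.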
How to Lose Friends and Alias People
Imagine you have been presented with the values of all the samples taken from a wave you know nothing about, and are asked to use some graph paper to draw out a single cycle of the wave, using those samples. With a nice high sample rate you would hope to see something like the picture below. A lot of samples would allow you to accurately reproduce the wave. It would look (or more accurately sound) very much like the original analogue wave.
If you read up on aliasing you will no doubt come across the Nyquist theorem and the Nyquist frequency. The theorem gives the very minimum sampling rate you can use to accurately recreate a sound: at least twice the highest frequency in it, so that every cycle gets at least two samples. With a point at the peak and trough of the wave you could just about recreate the correct wave (though you would need to interpolate to get the curved edges and not a triangle wave). The Nyquist frequency is half of the sampling frequency, and is the highest frequency that can be represented in the digital audio data.
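The two-samples-per-cycle limit can be checked with a few lines of Python. This is only a sketch, assuming a 40 kHz sampling rate and an ideal wave sitting exactly at the Nyquist frequency; `nyquist_frequency` is a helper name I have made up for the example.

```python
import math

def nyquist_frequency(sample_rate_hz):
    """Highest frequency that can be represented at this sampling rate."""
    return sample_rate_hz / 2

fs = 40_000                  # assumed sampling rate in Hz
f = nyquist_frequency(fs)    # exactly two samples per cycle at this frequency

# With cosine phase, the two samples per cycle land on the peak and the trough.
peaks = [round(math.cos(2 * math.pi * f * n / fs), 6) for n in range(6)]

# With sine phase, the same two samples per cycle land on the zero crossings.
zeros = [round(math.sin(2 * math.pi * f * n / fs), 6) for n in range(6)]

print("Nyquist frequency:", f, "Hz")
print("cosine phase:", peaks)
print("sine phase:  ", zeros)
```

The second list is why I say you could only "just about" recreate the wave: right at the limit, whether the samples catch the peaks or miss them entirely depends on where in the cycle they happen to fall.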
Effects of Aliasing
If the worst occurs, and a signal of more than half the sampling frequency is sampled, then a frequency shift occurs. Take 40 kHz as a sampling frequency, which puts the Nyquist frequency at 20 kHz. If 30 kHz were to be sampled there would not be the necessary 2 samples per wave, and the sampled result would be a wave of lower frequency (a good example of this can be found here with wave C). What happens is a mirroring effect about the Nyquist frequency: with 30 kHz being 10 kHz above the Nyquist, the resulting wave would be 10 kHz (20 – 10). If a 60 kHz wave were sampled instead, the converter could not tell it apart from a wave one full sampling frequency lower, so the resulting wave would be 20 kHz (60 – 40). As we’re dealing with waves outside of our hearing range, you can never be certain just what’s actually in the room around you when you’re recording; if aliasing were to happen, though, it could easily ruin a good take.
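The frequency shift can be worked out with a small folding calculation. The Python below is a sketch under the usual idealised assumptions (a pure tone, a perfect sampler, no filtering); `alias_frequency` is a name I have invented for the example.

```python
import math

def alias_frequency(f_hz, fs_hz):
    """Frequency at which a pure tone of f_hz shows up after sampling at fs_hz."""
    folded = f_hz % fs_hz      # a sampler cannot tell f apart from f + k * fs
    nyquist = fs_hz / 2
    # Anything left above the Nyquist frequency is mirrored back below it.
    return fs_hz - folded if folded > nyquist else folded

fs = 40_000
for f in (15_000, 30_000, 60_000):
    print(f, "Hz is recorded as", alias_frequency(f, fs), "Hz")

# Cross-check: a 30 kHz cosine and a 10 kHz cosine give the same samples at 40 kHz.
high = [math.cos(2 * math.pi * 30_000 * n / fs) for n in range(8)]
low = [math.cos(2 * math.pi * 10_000 * n / fs) for n in range(8)]
samples_match = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(high, low))
print("30 kHz and 10 kHz samples identical:", samples_match)
```

The cross-check at the end is the important part: once the samples are taken, there is no way to tell which of the two waves was really in the room.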
As none of us notice mysterious harmonies and frequencies appearing in our digital audio, it’s safe to assume aliasing isn’t too much of an issue for us to worry about. In some ways this is true, certainly when it comes to digital recording. Any digital recorder will have an ‘Anti-Aliasing Filter’, which does what the name says: it removes the frequencies that would alias before they reach the converter. CD audio has a sampling rate of 44.1 kHz, which means that audio up to 22.05 kHz is safe to be sampled. In this case it’s likely that a filter would be present to start rolling off all audio from 20 kHz onward, cutting everything off by 22 kHz. This way no unwanted audio should make it to the converter. At this point you can get into a debate about inaudible sounds and whether we hear them in different ways, which is where 96 kHz or higher sampling rates could become useful, as theoretically they would allow higher frequencies to be sampled. As to whether manufacturers build their AA filters so that the cutoff rises with the sampling rate, I don’t know. 96 kHz would, however, ensure that even a 20 kHz wave gets at least 4 samples per cycle, which would technically increase the quality (but remember that it also increases your CPU workload and your file sizes).
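The trade-off between samples per cycle and file size is simple arithmetic, sketched below in Python. The 16-bit depth and two channels are my assumptions for the sake of the example (they match CD audio), and the figures are for uncompressed audio only.

```python
def samples_per_cycle(freq_hz, sample_rate_hz):
    """How many samples fall within one cycle of a wave at freq_hz."""
    return sample_rate_hz / freq_hz

def pcm_bytes_per_minute(sample_rate_hz, bit_depth=16, channels=2):
    """Uncompressed size of one minute of audio (assumed 16-bit stereo)."""
    return sample_rate_hz * (bit_depth // 8) * channels * 60

for rate in (44_100, 96_000):
    cycles = samples_per_cycle(20_000, rate)
    megabytes = pcm_bytes_per_minute(rate) / 1_000_000
    print(f"{rate} Hz: {cycles:.3f} samples per 20 kHz cycle, "
          f"{megabytes:.2f} MB per minute")
```

Moving from 44.1 kHz to 96 kHz roughly doubles both numbers: more samples for every cycle, and more data to store and process.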
As I mentioned in the introduction there are programs (like Max/MSP) that grant us a much more intimate access to individual samples in a file, and as such it’s worth understanding exactly what aliasing is.