Basic Principles of Audio Processing

From Librivox wiki

This is a series of short articles written by a sound engineer with many years' experience. The idea is to explain in plain language how to make a quality sound file.

Digital recording, a brief overview

Recording to a computer these days is cheap and relatively easy. In 1997 recording software cost me 150 pounds. These days there are much better free and open source applications that cover everything one can do in a studio. Audacity is one of the best. I'll try to refer to Audacity as much as I can so that what I write can be tried by all those who have the inclination.

Digital and tape, what's the difference?

Tape

  • Magnetic tape records sound as a continuously variable magnetic field along the length of the tape; anyone past their teens will be familiar with cassettes and possibly eight track cartridges. Magnetic tape has advantages and disadvantages compared with digital.
  • Tape recordings degrade over time.
  • Even the best tape has inbuilt hiss.
  • Tape has to be moved extremely accurately both in position and speed, requiring very high quality hardware. Just one part out of adjustment can ruin a recording.
  • Tape distorts the sound; admittedly it distorts the sound in a musically pleasing way, so much so that some well-known artists have long preferred to record to tape.
  • Tape forgives high recording levels; the nature of magnetism is that when tape is over-saturated it makes music sound nicer, so much so that in studios it is deliberately over-driven to achieve this richly pleasing effect.

Digital

Where the waves on tape are stored as field strength, digital recordings are stored as a long series of numbers, which is what computers excel at. In fact storing and shifting numbers is the only thing a computer can do, but it can do it very fast, so much so that the numbers can become coloured spots on a screen or voltages on a speaker. Once you start shifting the numbers quickly enough, the pictures move and the speaker sings: video and audio.

How it works

The ever changing voltage caused by a sound vibration from microphone or mixer is presented to a tiny measuring circuit called an Analogue to Digital converter. At predetermined intervals the circuit measures the microphone voltage and assigns a number as its value. At CD quality this voltage is measured 44,100 times per second. You may have heard mention of 16 bit and 24 bit and 32 bit sound; this refers to the accuracy to which the measurements are taken. A 16 bit binary number can hold 65,536 different values, 0 to 65,535 in old money. CD is 16 bit 44.1k, therefore roughly every 44th of a millisecond a measurement is taken and stored as a number between 0 and 65,535.

To play back that recording you do quite the opposite: roughly every 44th of a millisecond a number is taken from memory and presented to another circuit, the Digital to Analogue converter, which then produces a voltage on its output in proportion to the number. The ever changing numbers produce an ever changing voltage which drives the speaker, and you've got your sound vibrations back.
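The measure-and-store cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for the example, the "microphone" is a pure sine wave, and the numbers run from 0 to 65,535 to match the description here (real audio files usually store signed values instead).

```python
import math

SAMPLE_RATE = 44_100      # CD quality: measurements per second
BIT_DEPTH = 16            # CD quality: bits per measurement
LEVELS = 2 ** BIT_DEPTH   # 65,536 possible values, 0..65,535


def to_number(voltage):
    """Stand-in for the AD converter: voltage in -1.0..1.0 -> integer 0..65,535."""
    voltage = max(-1.0, min(1.0, voltage))
    return round((voltage + 1.0) / 2.0 * (LEVELS - 1))


def to_voltage(number):
    """Stand-in for the DA converter: integer 0..65,535 -> voltage in -1.0..1.0."""
    return number / (LEVELS - 1) * 2.0 - 1.0


def record(frequency_hz, seconds):
    """Measure a sine-wave 'microphone voltage' at regular intervals."""
    count = round(SAMPLE_RATE * seconds)
    return [
        to_number(math.sin(2 * math.pi * frequency_hz * n / SAMPLE_RATE))
        for n in range(count)
    ]


samples = record(440.0, 0.01)                    # 10 ms of a 440 Hz tone
played_back = [to_voltage(s) for s in samples]   # the voltages sent to the speaker
print(len(samples), "measurements, first few:", samples[:4])
```

Each value in `played_back` differs from the original voltage by at most half of one step, which is the "accuracy" that the bit depth buys.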

Digital recordings when looked at really closely do not look like a smooth curvy wave; they look like a series of steps. But the steps are so small and so short that you don't hear them: the result smooths out to an uncannily accurate reproduction, which is why even the smallest, cheapest MP3 player sounds much better than the best cassette player.

Simple really, when someone sensible takes all the waffle away, isn't it?

Benefits and Risks of Digital Recording

Digital recordings are very accurate; the accuracy is determined only by the quality of the AD and DA converters.

But there is a risk: if the signal going in exceeds the measuring capacity of the converter, the converter can't possibly produce a higher number than 65k or a lower one than zero, so the tops and bottoms of the wave are simply chopped off. Digital does not forgive overdrive. Digital distortion will make you throw off your headphones; it is about as pleasant as the sound made by a scallywag dragging a sharp key along the side of your brand new car.
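A small Python sketch shows what happens to the numbers when the input is too hot. The converter here is hypothetical, with an assumed measuring range of -1.0 to +1.0 volts, and the test signal is one cycle of a sine wave at two different gains.

```python
import math

MAX_NUMBER = 65_535   # top of the 16 bit range
MIN_NUMBER = 0        # bottom of the range


def convert(voltage):
    """Hypothetical AD converter with a measuring range of -1.0..+1.0 volts."""
    number = round((voltage + 1.0) / 2.0 * MAX_NUMBER)
    # Anything outside the range is pinned to the nearest limit: this is clipping.
    return max(MIN_NUMBER, min(MAX_NUMBER, number))


def sine_cycle(gain, points=16):
    """One cycle of a sine wave, scaled by the given gain."""
    return [gain * math.sin(2 * math.pi * n / points) for n in range(points)]


safe = [convert(v) for v in sine_cycle(0.5)]   # well inside the range
hot = [convert(v) for v in sine_cycle(2.0)]    # twice the range: overdriven

clipped = sum(1 for n in hot if n in (MIN_NUMBER, MAX_NUMBER))
print("overdriven measurements stuck at a limit:", clipped, "of", len(hot))
```

Once a measurement is pinned at a limit, the original shape of the wave is gone from the recording, which is why the damage cannot simply be turned down afterwards.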

Consequently when making a recording it is imperative to see to it that the signal never reaches 0dB DFS (Digital Full Scale).

On digital equipment zero decibels is the measure of the highest level. All other values are expressed as minus decibel numbers, all the way down to about minus 96dB in the case of 16 bit or CD quality.
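The minus 96dB figure follows from the number of values a 16 bit measurement can hold. A rough check, assuming the usual 20 × log10 rule for voltage ratios (the function name is just for illustration):

```python
import math


def dynamic_range_db(bits):
    """Ratio between the largest and smallest step of a converter, in decibels."""
    return 20 * math.log10(2 ** bits)


for bits in (16, 24):
    print(bits, "bit:", round(dynamic_range_db(bits), 1), "dB")
```

Each extra bit doubles the number of values and so adds roughly 6dB of range.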

When I record to digital tape, I record with a maximum peak value of -12dB; in film soundtracks the value is more like -20dB. This leaves room for unexpected peaks to remain undistorted. Peaks can be compressed later, but clipping (going over the maximum level) can't be fixed easily.
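To get a feel for how much room a -12dB peak leaves, here is a minimal sketch that converts between decibels below full scale and the fraction of full scale actually used. The helper names are illustrative, and samples are assumed to be expressed as fractions of full scale between -1.0 and 1.0.

```python
import math


def db_to_amplitude(db):
    """Fraction of full scale that corresponds to a level in dB (0 dB = 1.0)."""
    return 10 ** (db / 20)


def peak_db(samples):
    """Highest peak of a recording in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    return -math.inf if peak == 0 else 20 * math.log10(peak)


print("-12 dB peak uses", round(db_to_amplitude(-12) * 100, 1), "% of full scale")
print("-20 dB peak uses", round(db_to_amplitude(-20) * 100, 1), "% of full scale")
print("peak of a short take:", round(peak_db([0.1, -0.25, 0.5, -0.3]), 2), "dB")
```

A peak at -12dB only reaches about a quarter of the available range, so an unexpected shout has to be roughly four times larger before it clips.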

In the days of vinyl records, the instruments were recorded to 24 track tape, resulting in tape hiss. When it came time to mix down, the tape signals were passed through processors and effects units and the mixing desk, picking up electronic noise along the way. The mix then went down to a stereo master tape, picking up more tape hiss. The master was then taken to the pressing plant, where it was passed through yet more processors in the mastering process, picking up yet more electronic noise, until it was cut to a master pressing disc, which was then used to press records, which ended up on your turntable. The fact that the resultant record sounded extremely clean and nice explains why a decent analogue studio costs a million pounds, but an equivalent Pro Tools setup costs fourteen thousand.

Decibels (dB) an explanation

Audio, whether floating through the air or as an electrical voltage, is measured in decibels; a decibel is one tenth of a bel. When processing sound it is useful to understand decibels, which are not like other measurements. For a start the decibel is logarithmic, because human hearing is logarithmic.

If I give you a ten pound load to carry, you would rate it as some value of heaviness. Then if I gave you another ten pound load to carry, it would feel twice as heavy. Hearing is not like this. Ten watts of audio power playing in a room could be measured at some position as 90dB, twenty watts would register a level of 93dB at the same spot, not 180dB.

That is the rule of thumb. For every 3dB up, double the actual power; for every 3dB down divide by two. This measuring in air is called SPL (sound pressure level) and is a consistent measurement: 90dB SPL is the same volume wherever it occurs.

  • 85dB SPL is the recommended limit for long term industrial exposure without protection
  • a jackhammer is 110dB SPL
  • a jumbo jet on take off is 120dB SPL
  • Motorhead once played a gig above 120dB SPL. (Long live King Lemmy!)
  • 140dB SPL can cause immediate and permanent hearing damage.
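The 3dB rule of thumb can be checked with the standard formula for power ratios, 10 × log10 of the ratio. A minimal sketch using the ten watt example from above (the function name is illustrative):

```python
import math


def power_change_db(new_watts, old_watts):
    """Change in level, in decibels, when the power goes from old to new."""
    return 10 * math.log10(new_watts / old_watts)


level_at_10w = 90.0   # dB SPL measured at some spot with ten watts playing
level_at_20w = level_at_10w + power_change_db(20, 10)
level_at_40w = level_at_10w + power_change_db(40, 10)

print("20 watts:", round(level_at_20w, 1), "dB SPL")
print("40 watts:", round(level_at_40w, 1), "dB SPL")
```

Doubling the power adds about 3dB each time, so four times the power is only about 6dB more, nowhere near four times the number on the meter.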

When talking about signal levels, decibels are used in different contexts. In an analogue channel of a mixer, dB is a relative measurement where 0dB represents the upper limit of signal strength for that channel but with some leeway above it.

In digital recordings the decibel is referred to as dB DFS (Digital Full Scale, more commonly written dBFS), where 0dB DFS is the absolute upper limit and all values are measured in minus values. There is no such thing as a positive dB in Digital Full Scale.

When talking about “line level” electrical signals that pass audio between equipment, two voltage references are in use: 0dBu is defined as 775 millivolts and 0dBV as 1 volt. Perversely, pro equipment has its 0dB point at a line level of +4dBu and consumer equipment has a line level of -10dBV.
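For the curious, the two line levels can be turned into actual voltages. This sketch assumes the conventional reference voltages of 0.775 volts for dBu and 1 volt for dBV; the function names are made up for the example.

```python
DBU_REFERENCE_VOLTS = 0.775   # 0 dBu, conventional reference
DBV_REFERENCE_VOLTS = 1.0     # 0 dBV, conventional reference


def dbu_to_volts(dbu):
    """Voltage of a signal whose level is given in dBu."""
    return DBU_REFERENCE_VOLTS * 10 ** (dbu / 20)


def dbv_to_volts(dbv):
    """Voltage of a signal whose level is given in dBV."""
    return DBV_REFERENCE_VOLTS * 10 ** (dbv / 20)


pro_volts = dbu_to_volts(4)          # professional line level, +4 dBu
consumer_volts = dbv_to_volts(-10)   # consumer line level, -10 dBV

print("pro line level:", round(pro_volts, 3), "volts")
print("consumer line level:", round(consumer_volts, 3), "volts")
```

The pro level works out at nearly four times the consumer voltage, which is why plugging consumer gear into pro inputs tends to sound quiet.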

Confused? I am, but the important bit is the part about logarithmic hearing. Just remember that part and you'll have a feel for what decibels measure.