Basic Principles of Audio Processing

From Librivox wiki
Revision as of 21:26, 8 April 2010 by RuthieG (talk | contribs) (Continuation of page)

WORK IN PROGRESS - RuthieG

This is a series of short articles written by a sound engineer with many years' experience. The idea is to explain in plain language how to make a quality sound file.

Digital recording, a brief overview

Recording to a computer these days is cheap and relatively easy. In 1997 recording software cost me 150 pounds. These days there are much better free and open source applications that cover everything one can do in a studio. Audacity is one of the best. I'll try to refer to Audacity as much as I can so that what I write can be tried by anyone who has the inclination.

Digital and tape, what's the difference?

Tape

  • Magnetic tape records sound as a continuously variable magnetic field along the length of the tape; anyone past their teens will be familiar with cassettes and possibly eight-track cartridges. Magnetic tape has advantages and disadvantages compared with digital.
  • Tape recordings degrade over time.
  • Even the best tape has inbuilt hiss.
  • Tape has to be moved extremely accurately both in position and speed, requiring very high quality hardware. Just one part out of adjustment can ruin a recording.
  • Tape distorts the sound; admittedly it distorts the sound in a musically pleasing way such that Pink Floyd, Elton John and Kate Bush have never recorded in digital studios.
  • Tape forgives high recording levels; the nature of magnetism is that when tape is over-saturated it makes music sound nicer, so much so that in studios it is deliberately over-driven to achieve this richly pleasing effect.

Digital

Where the waves on tape are stored as field strength, digital recordings are stored as a long series of numbers, which is what computers excel at. In fact storing and shifting numbers is the only thing a computer can do, but they can do it very fast, so much so that the numbers can be coloured spots on a screen or voltages on a speaker. Once you start shifting the numbers quick enough, the pictures move and the speaker sings: video and audio.

How it works

The ever changing voltage caused by a sound vibration from microphone or mixer is presented to a tiny measuring circuit called an Analogue to Digital converter. At predetermined intervals the circuit measures the microphone voltage and assigns a number as its value. At CD quality this voltage is measured 44,100 times per second. You may have heard mention of 16 bit and 24 bit and 32 bit sound; this refers to the accuracy to which the measurements are taken. A 16 bit binary number can take 65,536 different values, or about 65,000 in old money. CD is 16 bit 44.1k, therefore roughly every 44th of a millisecond a measurement is taken and stored as a number between 0 and about 65,000.
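
The measuring process just described can be sketched in a few lines of Python. This is only an illustration, not how a real converter is built: the helper name `sample_and_quantise`, the 1 kHz test tone and the mapping of the voltage onto whole numbers from 0 to 65,535 (following the simplified description above) are all assumptions made for the example.

```python
import math

SAMPLE_RATE = 44_100      # CD-quality measurements per second
BIT_DEPTH = 16            # bits used for each measurement
LEVELS = 2 ** BIT_DEPTH   # 65,536 possible values (0 to 65,535)


def sample_and_quantise(frequency_hz, n_samples):
    """Measure a sine wave at regular intervals and store each value as a whole number.

    Hypothetical helper: the 'voltage' swings between -1.0 and +1.0
    and is mapped onto the range 0..65,535.
    """
    samples = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE  # time of this measurement, in seconds
        voltage = math.sin(2 * math.pi * frequency_hz * t)
        number = round((voltage + 1.0) / 2.0 * (LEVELS - 1))
        samples.append(number)
    return samples


interval_ms = 1000 / SAMPLE_RATE  # gap between measurements, in milliseconds
first_samples = sample_and_quantise(1000, 5)
print(f"{interval_ms:.4f} ms between measurements")
print(first_samples)
```

Playing back is the same loop in reverse: each stored number is turned back into a voltage at the same fixed interval.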

To play back that recording you do quite the opposite: roughly every 44th of a millisecond a number is taken from memory and presented to another circuit, the Digital to Analogue converter, which then produces a voltage on its output in proportion to the number. The ever changing numbers produce an ever changing voltage which drives the speaker and you've got your sound vibrations back.

Digital recordings, when looked at really closely, do not look like a smooth curvy wave; they look like a series of steps. But the steps are so small and so short that you don't hear them: the result smooths out to an uncannily accurate reproduction, which is why even the smallest, cheapest MP3 player sounds much better than the best cassette player.

Simple really, when someone sensible takes all the waffle away, isn't it?

Benefits and risks of digital recording

Digital recordings are very accurate; the accuracy is determined only by the quality of the DA and AD converters.

But there is a risk: if the signal going in exceeds the measuring capacity of the converter, the converter can't possibly produce a number higher than 65k or lower than zero, so the tops and bottoms of the wave are simply chopped off. Digital does not forgive overdrive. Digital distortion will make you throw off your headphones; it is about as pleasant as the sound made by a scallywag dragging a sharp key along the side of your brand new car.
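
To see what running out of numbers looks like, here is a minimal sketch in Python. The function `convert` and the input values are made up for illustration; a real converter works in hardware, but the flattening effect is the same idea.

```python
MAX_VALUE = 65_535  # highest number a 16-bit measurement can hold
MIN_VALUE = 0       # lowest number


def convert(measurement):
    """Hypothetical converter: anything outside the range is flattened to the limit."""
    return max(MIN_VALUE, min(MAX_VALUE, measurement))


# One peak of a wave that is too loud for the converter (made-up values)
too_loud = [30_000, 60_000, 70_000, 80_000, 70_000, 60_000, 30_000]
recorded = [convert(m) for m in too_loud]
print(recorded)  # [30000, 60000, 65535, 65535, 65535, 60000, 30000]
```

The rounded top of the peak has become a flat line, and the information about how high it really went is gone for good.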

Consequently when making a recording it is imperative to see to it that the signal never reaches 0dB DFS (Digital Full Scale, usually written dBFS).

On digital equipment zero decibels marks the highest level; all other values are expressed as minus decibel numbers, all the way down to about minus 96dB in the case of 16 bit or CD quality.
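
The minus 96dB figure can be checked with a little arithmetic. The sketch below assumes the usual rule that a ratio of signal levels is converted to decibels as 20 times the base-10 logarithm; the function name `dynamic_range_db` is just a label for this example.

```python
import math


def dynamic_range_db(bits):
    """Distance in dB between the largest and smallest step a given bit depth can store."""
    return 20 * math.log10(2 ** bits)


print(round(dynamic_range_db(16), 1))  # 96.3
print(round(dynamic_range_db(24), 1))  # 144.5
```

Each extra bit adds about 6dB, which is why 24 bit recording reaches so much further down than 16 bit.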

When I record to digital tape, I record with a maximum peak value of -12dB; in film soundtracks the value is more like -20dB. This leaves room for unexpected peaks to remain undistorted. Peaks can be compressed later, but clipping (going over the maximum level) can't be fixed easily.
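
If you want to check how much room a take has left, the peak level can be worked out from the sample values. This is a rough sketch: it assumes the samples are expressed as fractions of full scale (1.0 being the maximum), and the name `peak_db` and the sample values are invented for the example.

```python
import math


def peak_db(samples):
    """Peak level in dB relative to full scale (full scale = 1.0).

    Assumes at least one sample is non-zero.
    """
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak)


take = [0.02, -0.11, 0.25, -0.19, 0.07]  # made-up sample values
level = peak_db(take)
print(f"peak {level:.1f} dB, headroom {-level:.1f} dB")
```

A peak at a quarter of full scale works out at about -12dB, so this take has about 12dB of room before clipping.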

In the days of vinyl records, the instruments were recorded to 24 track tape resulting in tape hiss. When it came time to mixdown, the tape signals were passed through processors and effects units and the mixing desk, picking up electronic noise along the way. The mix then went down to a stereo master tape picking up more tape hiss. The master was then taken to the pressing plant where it was passed through yet more processors in the mastering process picking up yet more electronic noise until it was cut to a master pressing disc, which was then used to press records, which ended up on your turntable. The fact that the resultant record sounded extremely clean and nice explains why a decent analogue studio costs a million pounds, but an equivalent Pro Tools set up costs fourteen thousand.

Decibels (dB) an explanation

Audio, whether floating through the air or as an electrical voltage, is measured in decibels; a decibel is one tenth of a bel. When processing sound it is useful to understand decibels, which are not like other measurements. For a start the decibel is logarithmic, because human hearing is roughly logarithmic.

If I gave you a ten pound load to carry, you would rate it as some value of heaviness. Then if I gave you another ten pound load to carry, it would feel twice as heavy. Hearing is not like this. Ten watts of audio power playing in a room could be measured at some position as 90dB; twenty watts would register a level of 93dB at the same spot, not 180dB.

That is the rule of thumb. For every 3dB up, double the actual power; for every 3dB down divide by two. This measuring in air is called SPL (sound pressure level) and is a consistent measurement: 90dB SPL is the same volume wherever it occurs.
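
The rule of thumb comes straight from the definition: a change in power is ten times the base-10 logarithm of the power ratio. Here is a small Python sketch; the function name `power_change_db` is made up for the example.

```python
import math


def power_change_db(power_before_watts, power_after_watts):
    """Change in level, in dB, when the power goes from one value to another."""
    return 10 * math.log10(power_after_watts / power_before_watts)


print(round(power_change_db(10, 20), 2))   # doubling the power: 3.01
print(round(power_change_db(10, 5), 2))    # halving the power: -3.01
print(round(power_change_db(10, 100), 2))  # ten times the power: 10.0
```

So the 3dB in the rule of thumb is really 3.01dB, which is close enough for anybody.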

  • 85dB SPL is the recommended limit for long term industrial exposure without protection
  • a jackhammer is 110dB SPL
  • a jumbo jet on take off is 120dB SPL
  • Motorhead once played a gig above 120dB SPL. (Long live King Lemmy!)
  • 140dB SPL can cause immediate and permanent hearing damage.

When talking about signal levels, decibels are used in different contexts. In an analogue channel of a mixer, dB is a relative measurement where 0dB represents the upper limit of signal strength for that channel but with some leeway above it.

In digital recordings the decibel is referred to as dB DFS (Digital Full Scale, usually written dBFS), where 0dB DFS is the absolute upper limit and all other values are minus values; there is no such thing as a positive dB in Digital Full Scale.

When talking about “line level” electrical signals that pass audio between equipment, two voltage references are in use: 0dBu is defined as 775 millivolts and 0dBV as 1 volt. Perversely, pro equipment has its 0dB point at a line level of +4dBu and consumer equipment has a line level of -10dBV.
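
To put actual voltages on those line levels, here is a short Python sketch. It assumes the conventional references, 0dBu at 0.775 volts and 0dBV at 1 volt, with the pro level quoted in dBu and the consumer level in dBV; the function names are made up for the example.

```python
DBU_REFERENCE_VOLTS = 0.775  # 0 dBu, the 775 millivolt reference
DBV_REFERENCE_VOLTS = 1.0    # 0 dBV


def dbu_to_volts(dbu):
    """Convert a level in dBu to volts."""
    return DBU_REFERENCE_VOLTS * 10 ** (dbu / 20)


def dbv_to_volts(dbv):
    """Convert a level in dBV to volts."""
    return DBV_REFERENCE_VOLTS * 10 ** (dbv / 20)


print(round(dbu_to_volts(4), 3))    # pro line level: 1.228
print(round(dbv_to_volts(-10), 3))  # consumer line level: 0.316
```

Pro gear therefore runs at roughly four times the voltage of consumer gear, which is why plugging one into the other often sounds too quiet or too loud.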

Confused? I am, but the important bit is the part about logarithmic hearing, just remember that part and you'll have a feel for what decibels measure.

Digital recording: How to do it from scratch

To make decent recordings for LibriVox is easy. If you know that LibriVox exists you already have the most expensive bit of kit you'll need.

All you need to add to that is:

  • Recording software (You can't go wrong with Audacity.)
  • A microphone (not expensive, if it sounds OK then it is)
  • For some mics, a small mixer to act as a pre amp and gain control for the microphone. (Look up Behringer mixers, best value in the business)
  • A pop shield

Microphones

There are condenser mics and dynamic mics.

If you have a condenser microphone it needs power: either it has a battery compartment or it doesn't. If it doesn't, you need a mixer with a 48 volt phantom power button. This sends the mic power up the signal cable.

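If you have a dynamic mic you don't need phantom power. A word of warning here: a dynamic mic on a properly wired balanced cable will normally ignore phantom power, but on an unbalanced or faulty cable, and with some ribbon mics, switching it on can mean you instantly need a new mic. Leave it off unless your mic needs it.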

There are three types of connection for mics:

  • USB
  • 6mm or 3.5mm Jack (unbalanced)
  • XLR (balanced)

The USB type has a flat connector that plugs into the rectangular USB port on your computer. XLR is a substantial plug with three pins. 6mm Jack (strictly 6.35mm, or a quarter of an inch) is the round silver plug that you would plug into an electric guitar; 3.5mm Jack is the small round plug that fits into the sound card on your computer.

The reason for the different types is simply that 3 core XLR is a method of getting interference in the mic cable to cancel itself out. 2 core Jack is less expensive. The USB mic is particularly useful for LibriVox purposes, as it is Plug and Play and suffers less from background noise.

Mixer

Note: it is entirely possible that you won't need a mixer. I know of at least one laptop which could record a nice clean signal through the mic socket but in my experience the mic socket in my desktop machine was filthy with noise. USB mics also don't require a mixer.

Your mixer doesn't have to mix anything; the smallest Behringer has one mic channel and costs peanuts. What it does, though, is give you a beautiful clean pre-amplifier, gain controls and a set of EQ controls to set the right tonal balance. No microphone has a flat frequency response and if yours does not suit your tastes you can adjust this with the EQ.

To record to the built-in sound interface on a PC, you'll need an output cable which will have two 6mm mono jacks at one end and one 3.5mm stereo jack at the other. The small jack will go into the line socket on your sound card; the 6mm monos will go into the unbalanced outputs from your mixer. Your mixer may only have RCA connectors for unbalanced output, in which case get a cable to suit those. (RCA connectors are also known as phonos and are to be found on the back of CD players and on Playstations: one red, one white; the yellow on a Playstation is video.)

Pop shield

This is not essential but if you're close to the mic it will prevent popping on Bs and Ps. Say “Buh” and “Puh” to the palm of your hand and feel the blast of air. That blast is not sound, it is wind, and wind plays havoc with microphones; that's why TV and film recordists have a big hairy dog on a stick. The hairy dog is hollow with the mic floating in the middle. Sound can get through the hairy dog's furry coat but wind can't.

A pop shield is a disc of thin fabric suspended four inches in front of the mic; you can make one with a pop sock and a wire coathanger. Hey, maybe that's why they're called pop socks: they stop your mic popping.