Storyteller's Recording Guide
By LibriVox member: Gregg Margarite
Many competent LibriVox readers have little experience telling a story to a microphone. They just want to read, but the recording part keeps getting in the way. When they try to learn about recording they’re faced with a dizzying number of unfamiliar subjects. What they require is a simple explanation of only the stuff they need to make good LibriVox recordings. This might be it...
This guide doesn’t provide step-by-step instructions in the use of any particular piece of hardware or software. It provides a basic understanding of concepts and techniques used in recording for LibriVox. It will help you figure out how to get started by giving you a rough understanding of what you’re doing. When it comes to recording, everyone is in a slightly different setting. The only way to find out what works best for you is to understand a little about recording techniques.
LibriVox recordings are MP3 files. This format shrinks the size of audio files, which makes things like iPods and LibriVox practical. But in doing so the fidelity of the recording is diminished. This reduction in quality is actually a good thing for LibriVox readers because poor recording techniques are often inaudible in low-resolution files like MP3s. This means your recording skills only need to reach the level of a good MP3 file. And that’s what this guide will try to help you do: make passable MP3 recordings of a single speaker. Do not use this advice to make high fidelity multi-track recordings.
Reading Volume and Acoustics
The first thing to do is decide if you are going to read loudly or softly. This determines your distance from the microphone, which in turn determines your acoustic needs.
If you perform impressions of different character voices it’s likely some of your voices require projection to accomplish. Softer impressions like foreign accents, the venerable old man, the street punk, and the fast talking detective can be done in any setting, but the gruff city editor, the gum-chewing hooker, and the sinister carnival barker often need to be done loudly. This means you have to get further from the microphone when you read to prevent distortion in your recordings.
The further you get from the mike the more your room geometry comes into play. Room geometry describes the acoustic effects caused by the shape and contents of a room on sound waves. The further your mouth is from the microphone the more opportunity that mike has to pick up copies of your spoken words after they’ve bounced off a wall or object. These reverberations are responsible for the narrator-in-a-box sound that most readers want to avoid. If your reading style demands distance from the mike then you need to reduce the effects of the room in which you record. If you are this type of reader please see Modifying Room Geometry below.
On the other hand if you can read softly and get close to the microphone you can largely ignore room geometry. This is because you’re speaking so softly there isn’t enough energy in your voice to create audible reflections. Now it doesn’t matter what kind of box the narrator inhabits because you can’t hear the box anymore.
Storytelling can be one of the most intimate activities of life. Unlike telling a story to an audience, recorded literature doesn’t require readers to trade subtlety for projection. Each of us has an un-projected voice of intimacy. That’s the voice you use at 3 AM lying next to someone in bed. It’s the voice you use to talk to yourself in public. And it’s the voice announcers use to make movie trailers. You speak without pushing, and the resonant frequencies of your natural voice come out. That’s the voice you want to record.
The problem with recording your natural voice is that it’s very, very quiet. Luckily you plan to be on top of the microphone, but there is one problem. Most folks aren’t used to speaking without projecting and some phonemes require percussive (P’s Popping) or sibilant (Hiss) sounds. These and other air-driven noises will distort your recording when you’re that close to the mike so you’re going to need to practice a little. It doesn’t take much; it’s mostly learning to control and direct breathing.
Put your hand close to your lips and read aloud. You will be able to feel the air-driven sounds and work on minimizing them. You can also try to record from an indirect angle, or cover the mike with windscreens or pop filters to break up air-driven sounds, but these won’t solve all phoneme issues.
Once you’re comfortable using your natural voice make a test recording then play it back through an equalizer (tone controller) that lets you manipulate the frequencies of your voice. Find your most resonant frequencies and accentuate them until your voice is as warm as possible. Remember these frequencies and their settings. The profile you develop here will be applied to future recordings. This is your “sound”.
Modifying Room Geometry
If your reading style requires you to have any real distance between your mouth and the microphone then you’re going to be able to hear reflections of your voice as it bounces off the walls and objects in your recording room. This is great if you’re a doo-wop band in a tiled men’s room, but it’s not so great for LibriVox recordings. So minimizing unwanted reverberations is the goal.
When you speak you disturb the air by generating waves of sound. These waves work much the same way as ripples in a pond. They can be broken up or reflected by objects in the water, or they can be echoed back to the source by the banks of the pond. The shape and contents of the pond cause waves to be absorbed and reflected at different rates and strengths. Sometimes waves cancel each other out, other times they reinforce one another and become stronger. Throwing a pebble into a pond gets complicated fast.
However, there are some general methods for reducing reverberations that don’t require exacting calculation of your room geometry. The first is to introduce materials that absorb the energy of the sound waves. The second is diffusion which alters the angle of reflections.
We’ll discuss absorption first. Every substance in your recording room has different coefficients for absorption, reflection, and transmission. A pillow does more absorbing than reflecting. A glass window does more reflecting than absorbing. You probably already have a general feel for the absorption/reflection properties of common materials. Most folks know rugs and curtains make a room quieter. They also know that tiles make rooms sound brighter and more reverberant.
So look around your recording room and identify reflective surfaces. Try covering those surfaces with blankets or towels. If windows or doors can be opened without introducing background noise then open them. An opening to the outside should produce no reflections. If you can’t open the window get heavy curtains.
If you have a large open bookshelf, face it when you record. Books and the airspace between them and the wall make good absorbers. Try to get some absorbing material on the wall behind you as well. You can use open-cell foam, felt, carpet, draperies, cellulose, fibrous mineral wool, porous ceiling tiles, fiberglass insulation, etc.
Many recording studios try to eliminate corners because they reinforce certain reflections. If possible fill your corners with absorbing materials as well. Also keep your airspaces as voluminous as possible. For example, if you’re going to hang a curtain along a wall, three inches of airspace between the curtain and the wall is better than two inches of airspace.
Now let’s discuss diffusion. When a sound wave bounces off a surface the shape of the surface contributes to the angle of reflection. Flat glass for example usually sends the reflection straight back to the source, while the spray-on popcorn often used on ceilings creates a surface that scatters reflections so they interfere with one another and lose motive force.
Many people have seen garage bands that put egg cartons on the wall in an effort to diffuse the sound. The convoluted structure of egg cartons is good for this purpose, but most egg cartons are made of poor acoustic materials and present more of a fire hazard than a diffusion surface. But you get the idea. Anything you can do to break up and misdirect the sound waves will help.
Parallel walls tend to reinforce sound waves. While you probably can’t rebuild your recording room walls, you can hang or lean things against them and arrange furniture in a way that minimizes the parallel aspect. The less it resembles a box the harder it will be to hear the box in your recordings.
Although it’s unlikely, it’s possible for you to be too successful. If you eliminate enough reflections your room will sound lifeless. Dead rooms often make performers uncomfortable so don’t build an anechoic chamber. Leave enough reflection to approach a kind of sonic balance. If necessary make test recordings and fine-tune your treatments accordingly.
We have barely touched on the subject of acoustics but this might just be enough to make your LibriVox recordings sound the way you want. The most important issue neglected here is that of frequency. This refers to how many waves a sound produces each second. A high pitched sound produces many waves whose crests are close together. A low pitched sound produces fewer waves whose crests are further apart.
Absorbing material can be good for one frequency but not another. And reflectors can bounce some frequencies better than others. However, given the goals and circumstances it probably isn’t necessary for you to address this level of complexity for your LibriVox recordings.
There are many things you can do to your voice to make it sound different. There are catalogs full of black boxes and software plug-ins that apply all kinds of effects to your recordings. However most folks find the less processing they do the higher the quality of their recordings, so be very judicious about signal processing.
Some signal processors fix one thing but create problems elsewhere. This causes a second processor to be applied and so on until their interactions turn your recording into sonic mud. Use only those processors that serve your purpose as a storyteller. Don’t process out the intimacy. And don’t let fooling with the gear keep you from recording.
There’s a signal processor for just about every circumstance you can think of: preamps capture more character, compressors turn you up when you’re quiet, limiters turn you down when you’re loud, noise gates shut out background sounds, de-essers turn down the frequencies that make your “S” sounds hiss, microphone modelers try to make your cheap mike sound like an expensive one, and noise suppressors try to remove, among other things, the noise created by using processors.
These processors can be very helpful but the only one you generally want to use is equalization (EQ). In the Reading Volume and Acoustics section of this guide you analyzed your voice to find the EQ settings that most complement your intonation. You want to apply those EQ settings to your recordings.
Normally high quality recordings are captured without processing so that processing decisions can be made after the fact. But the needs of LibriVox recordings are simple enough to allow consideration of otherwise un-recommended practices. You are only recording one track of one voice that has consistent tonal settings. This means you can record a pre-processed signal without too much risk.
At the Microphone
If we assume most LibriVoxers just want to record books, we can also assume they don’t have professional resources or budgets. If we concentrate on only those things critical to good recordings surely one of the most important is the ability to hear your voice as you read. And better still to hear your voice after it has been processed so you know what’s ending up on the recording.
If you’ve tried to plug a microphone and headphones into a PC sound card then you’ve probably discovered latency. This refers to the time it takes for your voice to be recorded and sent back to your headphones. Latency causes a delay in your headphones that makes it difficult to read since you can be three or four words ahead of what’s in your ears. If you have the right hardware you can solve the latency problem but many LibriVoxers do not.
One of the less expensive ways of solving this problem is to purchase a mixer. This is a device used to gather and modify many audio inputs, although you will only be using one input for your voice. You plug the microphone into the mixer, and the mixer into the PC sound card. Mixers usually contain separate bass, treble, and midrange controls for setting EQ, and you can use the headphone jack in the mixer to hear your voice after EQ has been applied but before it has had a chance to be delayed by latency. An entry level mixer with EQ should cost around $50 to $60 US. You will find them at musical supply web sites.
There are also a myriad of other outboard audio input devices built to use USB or FireWire interfaces, and they have varying features depending on your pocketbook. Many of these devices will help with latency issues, and some have EQ and other effects built-in. Or they give you the ability to record without processing, but hear a processed signal as you read. If you are interested you should research these devices on musical supply web sites as well.
As you record learn to “work” the mike. This involves moving in when you get quieter and backing off when you get loud. You must know how to control your breathing, and use the naturally resonant qualities of your voice to draw the listener in. One of the first guys to learn to work a mike was Bing Crosby. His sound is warm and inviting, yours can be as well… in its own way.
Many readers find it useful to use combination mike/headphone devices. Usually an adequate condenser microphone is attached to a short flexible boom mounted on one of the earpieces. This keeps the microphone at a constant distance from your mouth and frees you to turn your head without affecting the recording. These combination devices cost around $50 to $60 US.
At the Computer
There are dozens of software programs that record digital audio. Many LibriVoxers use a free program called Audacity; however many of your computers probably came with some form of recording software.
When selecting software for LibriVox recording you want to keep your eye out for two features. The first feature is the ability to perform “non-destructive editing”. This means the software allows you to manage many sound clips along a continuum. And each clip’s properties can be changed without destroying the original clip.
For example, you need to record the following two sentences:
“Call me Ishmael. Some years ago--never mind how long precisely--having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world.”
You start your recorder and say:
“Call me Ishmael. Some years ago--never mind how long precisely--having little or no money in my purse, and nothing particular to interest me on shore, I thought I would snail a boot…oh…damnit! [Exhalation noise]”
If your software has non-destructive editing you will be able to grab the edge of the recorded clip and shorten it to exclude: “I thought I would snail a boot…oh…damnit! [Exhalation noise]”
You set yourself at the new ending of the clip and turn on your recorder:
“I thought I would sail about a little and see the watery part of the world.”
Now you have the ability to manipulate the edges of both clips. You can also move the clips closer together or further apart so the edit between them sounds natural.
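For the curious, here is a minimal sketch of the idea behind non-destructive editing. It is not how Audacity or any other program actually stores its clips; the Clip class, its field names, and the use of words as stand-ins for audio (with the sentences shortened) are all assumptions made for illustration. The point is that a clip is only a window onto the original take, so dragging its edge changes what you hear without erasing anything.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clip:
    """A window onto a recorded take; the take itself is never altered."""
    source: list                 # the full original take (words stand in for audio)
    start: int = 0               # first item that will be heard
    end: Optional[int] = None    # one past the last item that will be heard

    def __post_init__(self):
        if self.end is None:
            self.end = len(self.source)

    def trim_end(self, new_end):
        """Drag the right edge of the clip; the source stays intact."""
        self.end = max(self.start, min(new_end, len(self.source)))

    def audible(self):
        """Return only the part of the take inside the clip's edges."""
        return self.source[self.start:self.end]


# First take, including the flubbed ending.
take_one = Clip("Call me Ishmael. I thought I would snail a boot".split())
take_one.trim_end(3)             # keep only "Call me Ishmael."

# Second take, recorded from the new ending of the first clip.
take_two = Clip("I thought I would sail about a little".split())

timeline = take_one.audible() + take_two.audible()
print(" ".join(timeline))
print(len(take_one.source))      # the flubbed words are still stored
```

Because the flub is hidden rather than deleted, you could drag the edge back out later if you trimmed too much.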
The second feature you want to look for in recording software is the ability to use “plug-ins”. These are little programs that work in conjunction with your recording software to process signals. If after you record you want to apply a compressor or de-esser to the recording you will need to plug in these components.
You don’t need either of these features to successfully make LibriVox recordings but they often make the process easier and increase quality.
Some programs come with a few processors included. The most common is called Normalization. When you normalize a recording you find the loudest part, determine how much it can be turned up without distorting, and turn up the entire recording that amount. This overall boost maintains the dynamics between quiet sections and loud sections. If you want to turn up quiet sections and leave loud sections alone you will need a compressor.
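As a rough sketch of what normalization does (not the code any particular program uses), the steps above can be written in a few lines of Python. The function name normalize_peak, the example sample values, and the assumption that 1.0 is the loudest level that can be stored without distortion are all illustrative.

```python
def normalize_peak(samples, ceiling=1.0):
    """Scale every sample by the same gain so the loudest one reaches `ceiling`.

    `samples` is a list of floats; +/-1.0 is assumed to be the loudest
    level that can be stored without distortion.
    """
    # Step 1: find the loudest part of the recording.
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)     # silence: nothing to turn up
    # Step 2: work out how far it can be turned up without distorting.
    gain = ceiling / peak
    # Step 3: turn up the entire recording by that amount.
    return [s * gain for s in samples]


quiet_take = [0.05, -0.25, 0.5, -0.1]
louder_take = normalize_peak(quiet_take)
print(louder_take)
```

Every sample is multiplied by the same gain, so the ratio between quiet and loud passages is unchanged. That is why normalization alone will not even out a reading.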
Another common problem LibriVoxers encounter is electrical current noise. This usually is a humming sound underneath your recording. This is because the electricity in your house has a frequency which is being picked up as sound by your recording equipment. If you can, try reversing the polarity of your devices (flip the electrical plug over), or turn off other devices on the same circuit like fluorescent lights and refrigerators. If all else fails you can purchase a power conditioner to negate the hum.
If you have the drive space, consider recording stereo .wav files at CD Quality (44.1 KHz – 16 bits – 1411.2 Kbps). This will allow some editing and processors to be more precise. When you’re done you can export the recording to a smaller MP3 file which you upload to LibriVox.
When you make a digital recording two of the parameters that determine the quality of the recording are the sample rate and the bit rate. LibriVox recordings use a sample rate of 44.1 kilohertz and a bit rate of 128 thousand bits per second. You don’t need to know much about these to make proper LibriVox recordings; just set your recording software or device to use a 44.1 KHz sample rate and a 128 Kbps bit rate. Brief descriptions of both parameters follow to round out your understanding.
Digital recorders work by making many very short recordings of your voice. These short recordings are called samples. When you make a recording with a sample rate of 44.1 KHz you are making forty four thousand one hundred distinct recordings of your voice every second. Like frames in a movie the recordings play back so fast you can’t perceive the individual nature of each frame or sample.
The higher the sample rate the better the quality of the recording. For example, if you record at 8 KHz (8,000 samples per second) you will sound as if you’re on the telephone. 44.1 KHz is the sample rate used for audio CDs as well as LibriVox recordings.
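To put numbers on the two sample rates mentioned above, here is a small calculation. It is plain arithmetic; the function names are made up for this illustration.

```python
def sample_count(sample_rate_hz, seconds):
    """Number of samples captured in `seconds` of audio."""
    return sample_rate_hz * seconds


def microseconds_between_samples(sample_rate_hz):
    """Time from one sample to the next, in millionths of a second."""
    return 1_000_000 / sample_rate_hz


for label, rate in (("telephone-like", 8_000), ("CD / LibriVox", 44_100)):
    per_minute = sample_count(rate, 60)
    gap = round(microseconds_between_samples(rate), 1)
    print(label, per_minute, gap)
```

One minute at 44.1 KHz works out to 2,646,000 samples taken roughly 22.7 microseconds apart, against 480,000 samples taken 125 microseconds apart at 8 KHz.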
Every sample you record needs to be transferred to memory, and stored. The bit rate parameter determines how many bits (zeros and ones) will be transferred per second. This also affects quality, but more importantly it affects the size of the file. LibriVox recordings convert your 44,100 samples into not more than 128,000 bits transferred per second. At 128 Kbps the size of your file will not exceed 960,000 bytes for each minute of the recording, or about a megabyte a minute. For comparison purposes an audio CD has a bit rate of 1,411 Kbps or about 10,584,000 bytes for each minute of the recording.
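The file-size figures above can be checked with a little arithmetic. This sketch assumes 8 bits per byte and uses 1,411,200 bits per second (the 1411.2 Kbps figure) for the audio CD; the function name is illustrative.

```python
BITS_PER_BYTE = 8
SECONDS_PER_MINUTE = 60


def bytes_per_minute(bit_rate_bps):
    """Bytes needed to hold one minute of audio at the given bit rate."""
    return bit_rate_bps * SECONDS_PER_MINUTE // BITS_PER_BYTE


mp3_bytes = bytes_per_minute(128_000)      # LibriVox MP3
cd_bytes = bytes_per_minute(1_411_200)     # audio CD

print(mp3_bytes)                           # 960000
print(cd_bytes)                            # 10584000
print(round(cd_bytes / mp3_bytes, 1))      # 11.0
```

The last line shows the CD figure is about eleven times the MP3 figure.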
Most folks would like to have CD quality recordings, but they don’t want them badly enough to wait over ten times as long to download them. Using CD quality bit rates means downloading “The Adventures of Tom Sawyer” from LibriVox would result in a 4,302 megabyte download, instead of the 390 megabytes it is at 128 Kbps, or the 195 megabytes at 64 Kbps. It’s a trade-off, but it’s a good one.
There is a third parameter you might encounter called bit depth. This determines how many bits will be used to describe each sample. LibriVox recordings use 16 bits. This is the same as audio CD quality.
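Sample rate, bit depth, and the number of channels together determine the bit rate of an uncompressed recording such as a .wav file. The sketch below is simple multiplication (the function name is illustrative) and reproduces the CD figure quoted earlier in this guide. It says nothing about how MP3 encoding arrives at 128 Kbps, which involves compression.

```python
def uncompressed_bit_rate(sample_rate_hz, bit_depth, channels):
    """Bits per second for uncompressed audio."""
    return sample_rate_hz * bit_depth * channels


cd_stereo = uncompressed_bit_rate(44_100, 16, 2)
cd_mono = uncompressed_bit_rate(44_100, 16, 1)

print(cd_stereo)   # 1411200, i.e. 1411.2 Kbps
print(cd_mono)     # 705600
```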
Finally, make sure the MP3 file you post is monaural. Mono recordings take up less space than stereo recordings. Since most LibriVox files contain a single voice mono is sufficient, and the smaller file size further speeds the download process.
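Your recording software will normally convert stereo to mono for you when you export. As an illustration only, one common approach is to average the left and right channels. The function name and the sample values below are assumptions, and real programs may do this differently.

```python
def downmix_to_mono(stereo_frames):
    """Average each (left, right) pair into a single mono sample."""
    return [(left + right) / 2 for left, right in stereo_frames]


# Three stereo frames of a single voice picked up almost equally on both sides.
stereo = [(0.5, 0.5), (-0.25, -0.75), (0.0, 1.0)]
mono = downmix_to_mono(stereo)

print(mono)                      # [0.5, -0.5, 0.5]
print(len(stereo) * 2, len(mono))
```

The stereo version carries six sample values where the mono version carries three, so there is half as much to store or encode.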
The author of this guide has received a number of questions from LibriVoxers regarding recording techniques, and also reading style. The answers to those questions were compiled into this guide.
The topic of recording techniques is somewhat objective with widely accepted ideas of right and wrong. On the other hand the topic of reading style is somewhat subjective and doesn’t deal with right or wrong; it deals with good and bad. This is not really the author’s turf, but enough questions have been posed that covering this thorny issue seems appropriate.
You have your own set of unique reading tools. Discovering and improving the use of those tools is your goal. Suggestions from other readers can be helpful, but they can also slow you down and create doubt. Let nothing impede your progress, including your inhibitions. All reading advice should be taken with a grain of salt, and your tongue firmly planted in your cheek.
For example, I sometimes select stories by author having no idea of the content. These stories are recorded cold so the reader and the listener find out what’s happening at the same time. Experience has shown this is bad advice for many readers. What works for one doesn’t always work for another. That having been said…
Be conversational - Imagine this is your story and you’re telling it to a friend. Perhaps you’re on the corner with your peeps, or maybe you’re in transit talking to a seatmate. Sell the story. Use inflections to make points. Be personally invested. Own it, and drive it home as if these events actually happened to you. They’re not your words, but it’s your interpretation that makes them work.
Play it broad - In-person conversation includes body language and subtle intonations your microphone can’t capture. You need to make up for this with an exaggerated performance. You have to be larger-than-life when you record in order for the finished product to sound merely life-sized. But don’t take it too far. Chewing the scenery and lighting fire to the drapes will make you sound like Jerry Lewis. Back off until you’re just below a William Shatner.
Simulate thought - Unscripted storytellers have rhythms influenced by indecision. They pause to formulate sentences. They stammer when groping for words. They feign surprise badly, and so on. Hesitations, stumbles and sprung rhythms make storytelling sound realistic. What’s not there is almost as important as what is. Leave holes of thought in appropriate places.
Speed follows action - If the action in the story speeds up, so should you. Don’t read a fight scene at your normal gait. Imagine you’re on the radio describing a live event like a prize fight or ball game. Announcers must keep up with the action or it passes them by. Listeners expect a blow-by-blow to be more chaotic. Speed adds tension. If they’re with you, increasing the speed will excite them. But as with all this reading advice, don’t overdo it; you’re not narrating The Adventures of Rocky and Bullwinkle.
Don’t fake it - Look up the pronunciation and definition of words you don’t know. Learn a little about the author of your piece, and the time in which she/he/it lived. Understand the general use and purpose of tools employed by characters whether that be a cotton gin or a warp drive. Appreciate context. Consider the symbolism, the formula, and the goals of the piece. In other words, do your homework. To sound authentic be authentic.
You are what you is - Maybe you have an accent, maybe you say po-tate-oh and I say po-tot-oh. Maybe your “TH” sounds lisp a little, or maybe the tenor of your voice sounds thin to you. Do not assume these things are handicaps. If your storytelling is interesting enough none of that stuff will matter. Use what you have to your advantage. Don’t try to hide anything.
Have fun - Relax. If you’re not having fun when you read listeners will hear it in your delivery. Besides, what are you doing this for anyway?
You record a story, play it back, and decide it could be better. The previous sentence is true of all readers regardless of skill. There will always be room for improvement in your recordings no matter how many takes you do. You should strive for perfection but realize it can never be achieved. Learn when to let go.
This could be construed as permission to post sub-standard recordings. It is not.
It assumes a LibriVoxer willing to slog through this guide leans towards, shall we say, Type A behavior. Just make sure your level of professionalism ranks below your conception of perfection. Recordings that meet your level of professionalism should be posted, even though they’re not perfect. No one will point and laugh, and if they do… @*#% them!
Epilogue - Denial of Omniscience
The subjects mentioned here are presented in a way that simplifies some very complex concepts. There is an assumption the average reader of this guide doesn’t have a particular interest in acoustics or recording techniques beyond the desire to make good recordings for LibriVox. To make this guide easily digestible to those people some subjects have been abridged. In other words, corners have been cut, surfaces have been skimmed, and issues have been pared down, dodged, fudged, or entirely ignored.