|
Editor's Note: Welcome to "The Bitstream," Mix's new
monthly column focused on computer-based audio technologies. Penned
by noted media consultant Oliver Masciarotte, "The Bitstream"
will cover a wide variety of topics ranging from DVD, the Internet,
new computer platforms, storage and distribution issues, to networking,
software and more, all with a futuristic, yet practical slant. If
you have ideas, insights or input for future topics, contact us
at mixeditorial@intertec.com. We'll be listening.
It's no surprise to most of us that, by adopting digital acquisition,
processing and distribution, we've settled for audio that is audibly
degraded relative to its analog antecedents. Emerging acquisition
standards, such as high-resolution PCM and DSD, have brought back
the easy listening of analog to digital audio. While discriminating
engineers sweat the details during production, distributing the
finished product is another matter entirely.
Distribution, until recently, dictated degraded quality. Take
Justin the Freshman. He's quite happy listening to MP3's "near-CD
quality," just as his forefathers were quite satisfied with
their hideous 8-track tapes. Since today's distribution hot button
is the net, something's gotta go when Justin's lucky to have
56 kbps. Whether it's telecom, broadcast or optical delivery,
there's usually not much bandwidth available for our audio data.
We can't all afford xDSL just yet, but we can use some discrimination
when the client asks us to "make it fit."
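To put a rough number on "make it fit," here is a back-of-the-envelope sketch. It assumes Justin's 56k modem delivers at most 56,000 bits per second (real-world throughput is usually lower) and that the source is CD-standard 44.1/16 stereo PCM. The function names are made up for illustration.

```python
# Rough sketch: how much must CD-standard PCM shrink to stream over a modem?
# Assumption: the 56k link delivers at most 56,000 bits per second.

def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Raw (uncompressed) PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

def required_ratio(source_bps, channel_bps):
    """Compression ratio needed for real-time delivery over the channel."""
    return source_bps / channel_bps

cd_bps = pcm_bit_rate(44_100, 16, 2)   # 44.1 kHz, 16 bits, stereo
modem_bps = 56_000                     # assumed best-case modem throughput
ratio = required_ratio(cd_bps, modem_bps)

print(f"CD PCM: {cd_bps / 1000:.1f} kbps")
print(f"Reduction needed for real-time streaming: about {ratio:.1f}:1")
```

Under those assumptions the codec has to throw away roughly 24 of every 25 bits, which is why the choice of codec matters so much at the low end.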
The search for ways to stuff audio through pitifully puny pipes
started with short-word-length PCM, ADPCM and IMA, and advanced
to the current quagmire of "standards": MPEG-1 Layer
III (MP3), RealSystem G2 and QuickTime 3 (QT3), all of which
share the ability to take PCM source files and squeeze out the
inherently redundant data, resulting in a smaller file that usually
sounds acceptable. "Sounds acceptable?" Now that's being
polite. But, all is not lost. Of late, corporate brain trusts
have been coming up with new, widely deployed codecs that actually
sound good!
Restricted carrying capacity certainly isn't new. The Analog Era
gave birth to several bandwidth-challenged
standards you may recognize: NTSC and PAL television, AM/FM radio
and pre-digital Plain Ol' Telephone Service (POTS). Those
standards relied on peculiarities of the human perceptual system
to deliver just enough information to convey a message over a
channel with less than full bandwidth.
Current distribution channels for digital audio, from satellite
TV to CD-ROM, are largely supporting lossy codecs for the same
reason. Some examples:
- DVD-V, along with LPCM, has optional support for
DTS, MPEG and AC-3. DVD-V was the first distribution format
with support for 96/24. An audiophile somewhere was persistent
enough to improve upon the "perfect sound forever"
of 44.1/16. Certainly a "professional" audio engineer
wouldn't have suggested such a thing. Most are perfectly happy
with the crappy audio quality that we hear every day from gear
with "pro" labels. Thank the Gods that someone in
the pro world listens to acoustic music once in a while, otherwise
DVD-A and SACD wouldn't have been proposed.
- Cinema sound is brought to you via Dolby Digital
(AC-3), Digital Theater Systems, Inc. (DTS) or Sony's 8-channel
Sony Dynamic Digital Sound (SDDS), which uses a professional
version of the same Audio Transform Acoustic Coding (ATRAC)
algorithm used by MiniDisc.
- Digital radio services, known as Digital Audio
Broadcast (DAB), use various schemes, including Lucent's Perceptual
Audio Coder (PAC), ISO/MPEG Layer II, MPEG-4 or Musicam (Masking
pattern adapted Universal Subband Integrated Coding And Multiplexing),
the codec of choice for good ol' ISDN phone patches.
- North American digital television uses,
you guessed it, AC-3. No wonder Dolby Labs has been busy
hiring in its licensing division.
- The hairball of standards that is the Web has
its own collection. Microsoft's Windows Media format (WMA),
the standard for that company's extensive Active Streaming Format
(ASF), and Version 2 of the QDesign codec (QDMC) used in the
fabulous, open source QuickTime 4 are examples of audio mechanisms
for the most popular computer operating systems. The SDMI distribution
system for secure music purchases supports WMA, MP3, MPEG-2
AAC and Lucent's Enhanced PAC (ePAC). Liquid Audio's open, multiformat
approach doesn't play OS favorites. It supports AAC, AC-3 and
MP3.
All of the codecs mentioned above are classed as perceptual coders.
Though invented in the late 1980s at Bell Laboratories, these
algorithms have continued to morph just as Bell Labs has morphed
into Lucent after the breakup. Engineers at Dolby, QDesign, Lucent
and the Fraunhofer Institute for Integrated Circuits (FhG), among
others, have thrown every trick in their multidisciplinary book
at the development of modern versions. The algorithms these companies
developed, as in the analog days of yore, provide bit rate reduction
without compromising perceived fidelity. The sidebar shows some
features of the low-data-rate king, QDMC v2, and the high-rate
winner, ePAC, along with AAC. ePAC is a highly refined dark horse
with the power of Lucent behind it. MP3 is included for reference
only as its old-school performance is considerably poorer than
the other three.
| Codec | Stereo or Multichan. | Sample & Bit Rate | Comments* |
|---|---|---|---|
| AAC | multichannel | 96 k SR, 8-128 kbps/ch | royalties |
| ePAC | multichannel | 44.1 k SR, 8-256 kbps/ch | HTTP & Real G2 streaming, IIPP |
| QDMC v2 | multichannel | 48 k SR, 8-128 kbps/ch | HTTP & RTSP streaming, IIPP |
| WMA | stereo | 48 k SR, 5-160 kbps/ch | HTTP & MMS streaming, IIPP |
| MP3 | stereo | ? SR, 32-320 kbps/ch | royalties |
* (Comments) - IIPP stands for integrated intellectual property
protection mechanisms such as watermarking and encryption. Also,
WMA is not a codec but is a framework for several specialized
codecs.
How do perceptual encoders work? Let's look at AAC, as it's not
cloaked in secrecy like QDMC and ePAC. Both ePAC and MPEG-2 Advanced
Audio Coding incorporate algorithms from Lucent's PAC, dating
from 1992. AAC is one of the codecs of the MPEG-2 standard and
is a subset of MPEG-4, the new unified family of ISO standards
for delivery of rich media. Major contributors included the FhG,
AT&T, Dolby and Sony.
As with other offerings, AAC employs perceptual subband/transform
coding, whereby the signal is first transformed from the time
domain into the frequency domain using a variable window or block
length. The encoder then applies a psychoacoustic model to estimate
whether, in any particular band of frequencies, the signal strength
is above or below the perceptual threshold relative to the adjacent
bands. If the signal is above the masking threshold, a spectral
coefficient or value is generated to represent the signal in that
band. Masking threshold means the amplitude threshold below which
a spectral component will be "hidden" or masked by louder
components at frequencies nearby. It's a brain thing; just go with
it. Once all the valid coefficients are determined, AAC applies
additional mechanisms to enhance coding efficiency, including:
joint coding, which removes monaural redundancy in a stereo signal;
temporal noise shaping prediction, which distributes quantization
noise over time; and lossless Huffman coding.
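For the curious, the transform-then-threshold idea can be sketched in a few lines of Python. This is a toy, not AAC: it swaps in a naive DFT for the real filter bank, and the masking rule (a bin is dropped if it sits more than 20 dB below its loudest neighbor within two bins) is a made-up stand-in for a genuine psychoacoustic model. The mid/side helper shows one common form of joint stereo coding. Every function name and parameter here is hypothetical.

```python
import cmath
import math

def dft_magnitudes(block):
    """Magnitude spectrum (bins 0..N/2) of a real block via a naive DFT."""
    n = len(block)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(block[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc) / n)
    return mags

def masking_threshold(mags, k, spread=2, drop_db=20.0):
    """Toy rule (an assumption, not a real psychoacoustic model): the
    threshold for bin k is drop_db below its loudest neighbor within
    `spread` bins on either side."""
    lo, hi = max(0, k - spread), min(len(mags), k + spread + 1)
    loudest = max(mags[j] for j in range(lo, hi) if j != k)
    return loudest * 10 ** (-drop_db / 20)

def keep_audible(mags, floor=1e-9):
    """Keep only bins above an absolute floor and above their masking threshold."""
    return {k: m for k, m in enumerate(mags)
            if m > floor and m >= masking_threshold(mags, k)}

def mid_side(left, right):
    """Mid/side joint stereo: identical channels leave an all-zero side signal."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

# One 64-sample block: a loud tone in bin 8 plus a tone 40 dB quieter in bin 9.
n = 64
block = [math.sin(2 * math.pi * 8 * t / n) + 0.01 * math.sin(2 * math.pi * 9 * t / n)
         for t in range(n)]
mags = dft_magnitudes(block)
kept = keep_audible(mags)
print("bins worth coding:", sorted(kept))
```

In this toy run the quiet tone next door to the loud one falls below the threshold and gets no coefficient at all. A real encoder would then quantize and Huffman-code the surviving values.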
You can be sure that lossy codecs won't be going away anytime
soon, based on the proliferation of terrestrial (and satellite)
distribution of rich media, the emergence of solid-state personal
stereos and record labels, large and small, betting on
a hybrid revenue model of onsite advertising with the enticement
of free track downloads. So, when a project comes knocking that
allows your input, do what you're paid for. Use your ears first,
then make an educated choice from the new codec menu. To help
you get a feel for which one is appropriate, I've posted sample
files on my site for your evaluation. Stop by www.seneschal.net
and follow the Papers & Articles link to take a listen (see
below).
Though I've written for Mix as far back as 1987, "The Bitstream"
is my first column for this magazine. I'll try to provoke some
thought about the technological foundation for our industry and
answer questions you may or should have. The conflux of audio
and computers forms the basis for discussion, and there's
plenty out there to cover. I'll also, on occasion, wander into other
subject areas while eluding my editors…
Bio: Oliver Masciarotte is an audio craftsmensch and new-media
consultant. He recently returned from Japan where he worked to
bring up a new, integrated DVD-A/V installation for Prime Mix/Tokyo
and the HMC.
Take a listen, then you decide
Related Links - Players & Servers
Players & Broadcasters -
The MP3Car site, despite the name, has a good listing of commercial and
home-brew mobile and fixed hardware players and a crappy listing
of player software.
Servers
Icecast has a streaming audio server for Intel, being developed under
the GNU open source umbrella...
Liquid Audio started the whole secure download trend. These days,
they're long in the tooth and short of viable offerings.
Microsoft has its Windows Media Player. It's cross-platform, but their
server is anything but.
Nullsoft's SHOUTcast Server is available for the following platforms:
Windows 9x/NT/2000, FreeBSD, Linux, and Solaris.
For open source audio-related projects, such as Darwin servers
and LAME encoders, see Seneschal's Linux, Open Source & Unix Audio links.
Links - Music Downloads & Streams
Angry Coffee is a "technical information resource for novice
to expert users who want to learn about and develop audio on
the Internet."
Bearshare, an enhanced Gnutella file sharing tool for Win.
Gnutella, a true peer-to-peer network, lives on.
imesh, another enhanced P2P application for Win.
Listen.com has multiformat music for download.
Mouse Jam has QuickTime-enhanced interactive music.
MP3.com has, guess what? Beware, they will spam you to distraction!
MP3'Tech has information about the MP3 standard and upcoming
audio compression techniques, tests, MPEG source codes, etc.
Napster, now in a coma, combined a chat client with an MP3 player, tossing
in other features to make an interesting product. Interesting
enough to clog campus networks and send Intellectual Property
zealots into a tizzy...
Links - Various
Flavor Software provides engineering services for building MPEG
1 through 4 products and services.
Replace the for-pay CDDB with the open source freedb, which works for most well-coded players and rippers.
The ISO's Moving Picture Experts Group home page has a comprehensive FAQ and other resources.
Music Enforcer checks "all servers on all services for illegal material - giving you enough identifying information to either submit a ban on the users from the service, or to hold them responsible in a court of law."
The Xiphophorus company is a distributed group of Free and Open Source programmers working on OggSquish, a group of related multimedia and signal processing projects. | |