Mix Magazine

An unedited manuscript for an article published in the July 1998 Mix Magazine.

DVD-Audio Introduction

True high fidelity audio, 24 bits at 192 kHz, is finally within the reach of both professionals and the average consumer. This article discusses some of the features and issues regarding DVD-A.

OK, let's get one thing straight. DVD, the savior of bandwidth and capacity–hungry media folk everywhere, has a problem. “Yeah, yeah, I know. Competing standards, titans of manufacturing battling for supremacy, all that stuff.” Well, that's not the real problem. The standards will shortly sort themselves out and the hardware and content providers will capitulate when they see their precious shekels evaporating from lack of consumer acceptance. No, those diversions pale in comparison to the real problem. The name…‘DVD.’ There's the rub. Does every TLA (three letter acronym) have to mean something? Quite a while ago, the parties involved decided to go with just the letters, no deep meaning attached. So, what does DVD stand for? Nothing, damn it!

Sorry, I had to get that off my pale and underdeveloped chest. As for audio developments relating to DVD, there's been quite a flurry of flailing limbs and firing synapses. The audio standard is sooo close to ratification, I can smell it. Walk with me, taking a deep whiff, and let's just see what this format, er, is made of…

The final draft from the DVD Digital Audio Working Group 4 (WG-4) for DVD-Audio was hammered out by the 40 principal and more than 60 voting members, significantly more than the 10 companies represented in the original DVD group. The WG-4's finished “Part 4” document will be a superset of Part 3, the existing DVD-Video specification. Though anything defined in the draft proposal may change in the final document, it's more likely that we'll see something quite like what follows.

That draft proposal calls for scalable multichannel audio encoded as linear PCM at equal or higher sample rates than are currently available. Whereas DVD-V specifies that 2 channels of 96 kHz sampled audio are permissible, DVD-A allows for up to 6 channels. With DVD-V, the audio community was given a standard distribution medium for recording at 96 kHz sample rates. If only there were a standard for recording… It may come as a surprise to find that, for DVD-A, stereo audio is supported not only at 44.1 and 88.2 kHz but also at the 4x rates of 176.4 and 192 kHz. The utility of these ultrahigh rates is questionable, but you can't fault the standards boys for not thinking ahead.

The DVD-A family can be thought of as having two branches; let's call them A-Discs and AV-Discs. A-Discs, as the name implies, contain only audio. AV-Discs go one step further and optionally include a subset of DVD-V for stills, real-time text, full screen video and other MPEG eye candy. A DVD-V disc, where the picture is king, must reserve room for the video objects or files. DVD-Audio considers motion pictures, along with other visual content, to be an optional enhancement to a chiefly aural experience. The underlying reason for dividing the family has to do with the structure of DVD. While DVD-V has a Video Manager to oversee the audio and video content, DVD-A has an additional Audio Manager to control the additional audio data. So, if a disc has video titles for the Video Manager to handle, it's an AV-Disc. DVD-V will retain one visual advantage over DVD-A: the interesting, if little used, multiple viewing angle feature is not part of the AV-Disc's repertoire.

For either disc type, here's the rundown on allowable audio data types that can live in the 9.6 Megabit/second stream:

                                  Channel Group 1       Channel Group 2
Channels (6 max. unless noted)    1 to 4                0 to 3
Sample Rate (kHz)                 44.1                  44.1
                                  48                    48
                                  88.2                  88.2 or 44.1
                                  96                    96 or 48
                                  176.4 (2 ch. max.)
                                  192 (2 ch. max.)
Word Length (# of bits)           16                    16
                                  20                    16 or 20
                                  24                    16, 20 or 24

As with a good ol’ CD, linear PCM is mandatory. But here, it's a mix and match scheme that allows the producer to allocate the sample rate and word length according to taste. Notice the inclusion of ‘channel groups,’ a method of signal partitioning that can be thought of as Group 1/front and Group 2/rear. More on that later.
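
To get a feel for why mixing and matching matters, a little arithmetic helps. The sketch below is only an illustration, not something from the draft: it multiplies channels, sample rate and word length for each channel group and compares the total against the 9.6 Megabit/second ceiling. The function names are invented for this example, and any packing or header overhead is ignored.

```python
MAX_STREAM_BPS = 9_600_000  # the 9.6 Megabit/second ceiling quoted above


def group_bitrate(channels, sample_rate_hz, word_length_bits):
    """Raw linear PCM bit rate for one channel group, in bits per second."""
    return channels * sample_rate_hz * word_length_bits


def fits_in_stream(group1, group2=(0, 0, 0)):
    """Return (total_bps, fits) for two (channels, rate, bits) tuples."""
    total = group_bitrate(*group1) + group_bitrate(*group2)
    return total, total <= MAX_STREAM_BPS


if __name__ == "__main__":
    examples = {
        "2 ch, 192 kHz, 24 bit": ((2, 192_000, 24), (0, 0, 0)),
        "6 ch, 96 kHz, 24 bit": ((3, 96_000, 24), (3, 96_000, 24)),
        "3 ch 96/24 front + 2 ch 48/20 rear": ((3, 96_000, 24), (2, 48_000, 20)),
    }
    for label, (g1, g2) in examples.items():
        total, ok = fits_in_stream(g1, g2)
        print(f"{label}: {total / 1e6:.3f} Mbit/s, fits: {ok}")
```

By this rough arithmetic, six channels of 96 kHz, 24 bit PCM would overshoot the ceiling, while dropping the rear group to a lower rate and word length brings the total back under it.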

So how much program can you fit on an A-Disc? Here's another table with some examples for a DVD-5:

48 kHz, 20 bit, 2 channel
88.2 kHz, 24 bit, 2 channel
88.2 kHz, 24 bit, 3 channel (LCR) + 44.1 kHz, 20 bit, 2 channel surrounds
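
For a rough idea of playing time, the configurations above can be run through some simple arithmetic. The sketch below assumes a nominal DVD-5 capacity of 4.7 billion bytes and ignores file system, navigation and packing overhead, so real-world figures would come in lower. The function names are made up for illustration, and the results are back-of-the-envelope upper bounds, not numbers taken from the draft.

```python
DVD5_BYTES = 4_700_000_000  # assumed nominal single layer capacity


def stream_bps(*groups):
    """Sum the raw PCM bit rate over (channels, rate_hz, bits) groups."""
    return sum(channels * rate_hz * bits for channels, rate_hz, bits in groups)


def playing_minutes(*groups, capacity_bytes=DVD5_BYTES):
    """Upper bound on playing time in minutes, with no overhead counted."""
    return capacity_bytes * 8 / stream_bps(*groups) / 60


configs = {
    "48 kHz, 20 bit, 2 channel": [(2, 48_000, 20)],
    "88.2 kHz, 24 bit, 2 channel": [(2, 88_200, 24)],
    "88.2/24 LCR + 44.1/20 surrounds": [(3, 88_200, 24), (2, 44_100, 20)],
}
for label, groups in configs.items():
    print(f"{label}: about {playing_minutes(*groups):.0f} minutes")
```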

 

Of course, you'd get about double these values on a dual layer DVD-9. A dark secret of the dual layer technology is increased jitter over the single layer approach. Jitter averages are dependent on the material used to sputter the semitransparent layer, with gold beating silicon, silicon nitride or silicon carbide for best jitter performance. Silicon, however, is both less expensive and performs better in the bonding step, which is the final hurdle to cost effective DVD-9 and DVD-18 production.

The option of including stills, text, menus and motion isn't limited to the discs themselves. Players will also be made available with a variety of features, from inexpensive, stripped down versions that just make analog audio to fancy versions with component video and digital audio outputs. A universal variety will play both. Already, January CES in Lost Wages saw the introduction from high end hi-fi manufacturers of new, second generation DVD-V players specified as having low jitter digital audio outputs. Look for more good sounding players in the consumer channels this fall.

For those of you who've been keeping up with DVD, you know that ‘authoring’ is part of the creation process. This is the production stage where the title's level of interactivity and the roles of both the Audio and Video Managers are defined. Authoring tools will give the programmer the ability to detail the behavior of a disc in a wide range of players from simple Walkman style commands with no need for a visual menu, to more complex behaviors such as constructing different, seemingly random music compositions each time a disc is played. As a player's memory registers can be programmed to keep track of a user's “walk” through a title, the disc could alter its presentation based on each user's selection order. A simple example would be authoring an offering so that, if a user skips an opening menu and immediately starts playing music, then interactivity is assumed to be undesirable and the disc reverts to simple linear play. If, however, the user initially starts the title by exploring the opening menu, then deeper levels of complexity such as optional visuals become accessible.
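
The skip-the-menu example can be pictured as a tiny state machine. What follows is a toy sketch, not the DVD-A command set: the `ToyPlayer` class, its single register and the action names are all hypothetical, chosen only to show how a remembered “walk” could steer the presentation.

```python
class ToyPlayer:
    """Hypothetical player with one memory register (not the real spec)."""

    def __init__(self):
        self.register = {"visited_menu": False}
        self.mode = None  # decided on the first 'play' action

    def handle(self, action):
        """Process one user action: 'menu' or 'play'."""
        if action == "menu":
            self.register["visited_menu"] = True
            return "show menu"
        if action == "play":
            if self.mode is None:
                # The first play request fixes the mode for the session.
                visited = self.register["visited_menu"]
                self.mode = "interactive" if visited else "linear"
            return f"play ({self.mode})"
        raise ValueError(f"unknown action: {action}")


impatient = ToyPlayer()
print(impatient.handle("play"))   # went straight to the music

explorer = ToyPlayer()
explorer.handle("menu")
print(explorer.handle("play"))    # poked around the menu first
```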

Another aspect of authoring is defining how a given player “deals” with a limited number of playback channels. An explicit ‘downmixing’ procedure, along with the channel group assignment, is stored in the header of each track. This smart content allows the author to control individual channel amplitude and phase, though not dynamically. Downmix is accomplished by applying an individual scaling coefficient to each track for left and right, then summing the separate left and right aggregate tracks to produce a final stereo mix. By manipulating the scaling coefficients and channel group assignments, the author could suppress or accentuate certain tracks to interactively hide or reveal particular material. Since there's room for 16 individual mixdown tables within each title set, there are lots of creative possibilities.
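
In code, the downmix described above boils down to a weighted sum. The sketch below is a minimal illustration under stated assumptions: the channel names, the coefficient values and the dictionary layout are invented here and do not reproduce the table format in the draft. A negative coefficient stands in for a phase flip.

```python
def downmix(tracks, table):
    """Fold multichannel audio down to stereo.

    tracks: dict of channel name -> list of samples (equal lengths)
    table:  dict of channel name -> (left_coeff, right_coeff)
    """
    if not tracks:
        return [], []
    length = len(next(iter(tracks.values())))
    left = [0.0] * length
    right = [0.0] * length
    for name, samples in tracks.items():
        l_coeff, r_coeff = table[name]
        for i, sample in enumerate(samples):
            left[i] += l_coeff * sample   # scaled contribution to left
            right[i] += r_coeff * sample  # scaled contribution to right
    return left, right


# Hypothetical three channel source and one made-up mixdown table.
tracks = {"L": [1.0, 0.0], "R": [0.0, 1.0], "C": [0.5, 0.5]}
table = {"L": (1.0, 0.0), "R": (0.0, 1.0), "C": (0.5, 0.5)}
print(downmix(tracks, table))
```

Setting a channel's coefficients to zero in one table and to something audible in another is how an author could hide or reveal material with nothing more than a table swap.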

One area of keen interest for the engineering community is the inclusion in the draft spec of optional formats such as DTS, Dolby Digital, lossless compression, MPEG-2 BC and DSD or Direct Stream Digital. Accommodation of these disparate formats is yet to be determined by the WG-4, but it's here that a good portion of the battle for Joe Consumer's hard earned dollars will be fought. All of the participants have a need to see their technology receive the official blessings. DTS has done quite well with their technology, positioning it as a high fidelity alternative to the seemingly excessive bandwidth required by multichannel PCM. Dolby Digital, with its maximum data rate limited to less than half that of DTS, is not what one would think of as ultrafidelity. It does work great in the arenas for which it was designed: motion pictures, DVD-V and DTV, where it must coexist with bandwidth hungry motion images. As for MPEG-2 audio, especially the difficult Backward Compatible multichannel implementation, there seems to be little hope of its widespread use with the official announcement that MPEG audio is not required for PAL formatted DVD-V titles and PCM or Dolby Digital will work just fine, thank you. It seems that MPEG-2 BC encoder technology has to play some catch up as well.

Sony, as the only company in the mix to combine technology development with a major music label, holds a fascinating hand with DSD. For those professionals spending too much time making a living to have heard about such things, DSD uses an oversampled delta sigma modulated A-to-D converter to generate a 2.8224 MHz 1 bit signal, a rate chosen as a simple multiple (64 times) of the lowest common high fidelity PCM sampling rate, 44.1 kHz. Sony and Philips NV have banded together yet again to promulgate the Super Audio CD, a competing dual layer, hybrid DVD technology using a Red Book compliant base with the second, semitransparent layer containing DSD data. Not to take all this sitting down, Dolby Labs is reported to be negotiating with hi-fi manufacturer Meridian to sub-license a lossless system invented by British engineers Peter Craven and the late, great Michael Gerzon. There are several other provisions being considered for lossless compression, including one referred to as ‘bit shifting.’ Rather than encoding an entire data word, only bit cells with data are stored and null data is removed. So, if only 19 bits of the 24 bit AES payload contain data, then only those bits are carried along.
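
Here is one plausible reading of the ‘bit shifting’ idea, offered as a sketch and not as the scheme under consideration by the WG-4. It assumes the unused bits sit at the low-order end of each word and that a shift count is stored once per block; `pack_block` and `unpack_block` are names made up for this example.

```python
def pack_block(samples, word_length=24):
    """Drop low-order bits that are zero in every sample of the block."""
    combined = 0
    for sample in samples:
        combined |= sample
    if combined == 0:
        shift = word_length  # silent block: no bits carry data
    else:
        # Position of the lowest set bit = count of shared trailing zeros.
        shift = (combined & -combined).bit_length() - 1
    return shift, [sample >> shift for sample in samples]


def unpack_block(shift, packed):
    """Restore the original words by shifting the stored bits back up."""
    return [value << shift for value in packed]


# 19 bits of real data riding in 24 bit words: the low 5 bits are empty.
block = [0b1010101010101010101 << 5, 0b11 << 5, -(7 << 5)]
shift, packed = pack_block(block)
print(shift, unpack_block(shift, packed) == block)
```

Since only all-zero bits are thrown away, the round trip is exact, which is what earns the ‘lossless’ label.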

Looking at production tools, there are several ways of recording the higher sample rates but only industry leaders SADiE and Sonic Solutions (now Sonic Studio LLC - Ed.) have full editing systems capable of handling 88.2/96. At present, only Sonic has a complete production solution that can not only handle 176.4/192 and simultaneous dual sampling rates but can also perform authoring and real-time 5.1 Dolby Digital encoding/decoding chores to boot.

The inclusion of 88.2 in the spec means that masters recorded at that rate can be converted to 44.1 distribution with simple decimation rather than either the two step 96 to 88.2/88.2 to 44.1 or ultracomplex single step conversions that 96 k requires. High definition pioneers Pacific Microsonics have long promoted 88.2 as the best origination sample rate and now hope to see it used for 96 k release as well. Says Michael Ritter, founder and VP, “There are a number of high precision tools we'll be releasing for both our Model One (converter) and as plugins for Sonic. We're really excited that HDCD technology will have a positive impact on the new WG-4 release format.”
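
The difference between ‘simple decimation’ and the 96 k headache shows up as soon as the rates are reduced to lowest terms. The snippet below is just that arithmetic, using Python's standard `fractions` module; the helper name is invented for this example and no actual filtering or resampling is performed.

```python
from fractions import Fraction


def conversion_ratio(source_hz, target_hz):
    """Exact output/input sample rate ratio, reduced to lowest terms."""
    return Fraction(target_hz, source_hz)


for source_hz in (88_200, 96_000, 176_400, 192_000):
    ratio = conversion_ratio(source_hz, 44_100)
    kind = "integer decimation" if ratio.numerator == 1 else "fractional resampling"
    print(f"{source_hz} Hz -> 44100 Hz: ratio {ratio} ({kind})")
```

A ratio of 1/2 means that, after low pass filtering, every second sample is kept. A ratio like 147/320 means the converter has to interpolate up by 147 and come down by 320, or do something mathematically equivalent.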

Prior to finalization of the 1.0 spec, a separate DVD committee will come to terms with the record labels' hot button of the week: some form of intellectual property protection. Anti-piracy proposals currently floating around range from sublime to ridiculous. But, the resulting copy management scheme hopefully will be more robust than the regional restrictions built into Part 3 to prevent motion picture titles from flooding foreign markets prior to local theatrical release. Already, for the cost of a few DVD titles, current DVD-V players can be hot rodded by savvy overseas technicians to bypass the built-in regionalization lockouts. Regional codes, along with parental blocks, won't be included in the DVD-A spec. What to do about adult DVD-A titles?

Not to sit contentedly munching Belgian bonbons, the Working Group is beginning the task of building the Part 5 “VAN” spec, a bridge format between DVD-V and DVD-A. VAN discs, while a member of the DVD-V family, will be playable on universal DVD-A players if so authored. What the details shape up to be is still a mystery but, like the (bring up the reverb) inexorable march of time, I'm sure we'll be ready for it!