AUSTIN, TX - Cold War-era nuclear submarines may seem an unlikely source of inspiration for a conference telephone design. But with the assistance of some declassified documents, plus Metric Halo's SpectraFoo analysis software, LifeSize Communications has set a new benchmark in sound quality with a phone that combines a microphone array with beam-forming technology to bring high-definition audio to the boardroom.
Research and development engineer Wil Oxford used Metric Halo's SpectraFoo to test and measure the high-definition audio of LifeSize Communications' revolutionary VoIP conference phone.
When LifeSize co-founders Craig Malloy and Michael Kenoyer invited Wil Oxford to join the research and development team, he knew little about telephony. "I came from a higher-end audio background," said Oxford, who left Apple Computer and started working on his own three years ago. "At Apple Computer, I used Foo to help me improve the audio quality on the slot-loading iMac design as well as the microphones used by Apple's award-winning speech recognition software. I had access to the world's most sophisticated audio tools at Apple and I kept coming back to Foo."
Oxford began designing the new phone with LifeSize, intrigued by the idea of creating the next innovation in telephony. "As I got further and further into the design of the phone, I used SpectraFoo every day to measure the performance of the system, such as THD, frequency response, signal-to-noise ratio, and dynamic range," he said. The team ended up with what Oxford considers the best conference phone to date. "From a quality perspective we're head and shoulders above everybody else and Foo makes a big difference. People say SpectraFoo is a professional tool. Well, we're doing professional audio on this phone. I've used all the traditional tools of the trade that are out there, but I just keep coming back to Foo because it's just so easy for me to use."
Designed primarily for VoIP (voice over internet protocol) operation, the LifeSize phone incorporates a circular array of 16 microphones around the periphery of an 11-inch diameter disk and offers a 16 kHz frequency response with negligible distortion over an IP connection. "All of the mics are omnis and they're on all the time. We take the signal from those and we create virtual beams with digital signal processing that can sweep around the room. We can track eight different targets in a room at the same time. It's the same technology that's used in towed passive sonar arrays for a nuclear submarine," explained Oxford.
A typical Los Angeles-class sub has an onboard mainframe to process the data from the hydrophone arrays in real time for threat detection and navigation. "Under the hood we use over a gigaflop of processing to do the things that we need to do," said Oxford of the LifeSize phone. A gigaflop is one billion floating-point operations per second.
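For readers curious how "virtual beams" can be formed from always-on omnidirectional microphones, here is a minimal sketch of a textbook delay-and-sum beamformer. It is not LifeSize's method, which the article does not describe. The sketch assumes a far-field source, a sound speed of 343 m/s, and 16 microphones evenly spaced on a circle 11 inches across; the function names (`mic_positions`, `arrival_delays`, `beam_gain`) are hypothetical and chosen only for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0        # m/s, assumed room-temperature air
NUM_MICS = 16                 # from the article
RADIUS_M = 11 * 0.0254 / 2    # 11-inch diameter disk, converted to metres


def mic_positions(num_mics=NUM_MICS, radius=RADIUS_M):
    """(x, y) coordinates of mics evenly spaced around a circle (assumed layout)."""
    return [
        (radius * math.cos(2 * math.pi * k / num_mics),
         radius * math.sin(2 * math.pi * k / num_mics))
        for k in range(num_mics)
    ]


def arrival_delays(azimuth_rad, positions):
    """Relative arrival time at each mic for a far-field plane wave from azimuth_rad."""
    ux, uy = math.cos(azimuth_rad), math.sin(azimuth_rad)
    # A mic that sits closer to the source hears the wavefront earlier.
    return [-(x * ux + y * uy) / SPEED_OF_SOUND for x, y in positions]


def beam_gain(steer_rad, source_rad, freq_hz, positions):
    """Normalized response of a beam steered to steer_rad for a tone from source_rad."""
    steer = arrival_delays(steer_rad, positions)
    actual = arrival_delays(source_rad, positions)
    # Undo the steering delays, then sum: signals from the steered
    # direction line up in phase, signals from elsewhere partly cancel.
    total = sum(
        cmath.exp(-2j * math.pi * freq_hz * (a - s))
        for a, s in zip(actual, steer)
    )
    return abs(total) / len(positions)


positions = mic_positions()
on_axis = beam_gain(0.0, 0.0, 2000.0, positions)
off_axis = beam_gain(0.0, math.pi / 2, 2000.0, positions)
print(f"on-axis gain:  {on_axis:.3f}")
print(f"off-axis gain: {off_axis:.3f}")
```

Because the steering is done entirely in software, several beams can be computed from the same 16 signals at once by calling `beam_gain` (or its time-domain equivalent) with different steering angles, which is one plausible reading of how multiple talkers could be tracked simultaneously.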
Oxford said that he had a breakthrough about a month into the design process while digging through the University of Texas engineering library stacks. "I realized there were some interesting techniques being used in the radar signal processing world that were equally applicable. That allowed us to drop our processing requirements by a factor of ten. At that point I knew we were onto something. That was the genesis of our circular array. I realized that it was a good mic pod to add onto a hi-def videoconferencing system, but it would also be a very good standalone speakerphone. The market for conference phones is easily ten times that of the market for a high-end videoconference system."
Oxford, who is also a recording engineer, has even put the "CD-quality" phone to perhaps the ultimate test. "We put one of our phones about 25 feet away from an orchestra and a choir and recorded a CD of the John Rutter 'Requiem.' We put the phone down in the middle of the hall and formed beams on all the interesting things, such as the soprano solo, the harp, and the organ. We recorded them as separate tracks, then a friend of mine mixed them down."
Most VoIP systems are capable of at least 8 kHz bandwidth, which is quite a bit different from the standard 4 kHz telephone bandwidth. The LifeSize phone has 16 kHz bandwidth. "We found that, as we go up in frequency response, what you get is not necessarily just a sense of better quality but it's much more intelligible. The differences between the fricatives and the plosives, and the ability to communicate more effectively, scales very well with the bandwidth," said Oxford. The phone also acts as a bridge when calling people on multiple formats. "You can have three people call on IP and one on PSTN on one number and it works. And all the people who call in on IP get the full 16 kHz bandwidth."
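To put the bandwidth figures above in perspective, the Nyquist criterion says a digital system must sample at no less than twice the highest frequency it carries. The article does not state the phone's actual sampling rate, so the short sketch below only computes that theoretical lower bound; `min_sample_rate_hz` is a hypothetical helper name.

```python
def min_sample_rate_hz(bandwidth_hz):
    """Nyquist lower bound: sample at no less than twice the audio bandwidth."""
    return 2 * bandwidth_hz


# Bandwidth figures as quoted in the article.
systems = [
    ("standard telephone", 4000),
    ("VoIP baseline", 8000),
    ("LifeSize phone", 16000),
]

for label, bandwidth in systems:
    rate = min_sample_rate_hz(bandwidth)
    print(f"{label}: {bandwidth} Hz bandwidth needs at least {rate} Hz sampling")
```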