At Bell Labs
in the late 1950s, Flanagan continued to
research formant-based vocoders. His first technical assistant
was Bernie Watson—a name that couldn't help appealing
to Flanagan's puckish sense of humor.
"He often found a reason to say, 'Mr. Watson, come here, I want to see
you,' " says David.
After just three years, Flanagan was made a department head (although
throughout his career he has spent about 50 percent of his
time doing his own research). He and his team worked on an
artificial larynx, eventually getting more than 30 000 of
the devices out to people who needed them.
In the 1960s, digital computers began changing how research was done,
and speech research was no exception. Flanagan and his group
started using an IBM 650 mainframe computer, intending to
simulate a telephone transmission system that would enable
them to test future advances without building hardware. That
sounds commonplace today, but when Flanagan began this work,
such simple components as analog-to-digital (A-to-D) converters
did not exist. There was no way to get real-time analog signals
into a digital computer.
Instead, Flanagan and his group would use a photograph of an oscillogram
and measure the various amplitudes of the signal. Those numbers
were then recorded on punch cards—with a huge stack representing
a few seconds of speech—and fed into the computer. The
output would be a plot of processed waveforms.
This direction of research, Flanagan says with excitement in his vibrant
blue eyes, "opened up the field of signal processing, which
had not even existed before 1965 or thereabouts—and now
it's an IEEE society," the IEEE Signal Processing Society.
Making these physical measurements of waveforms to feed into the
computer gave Flanagan other ideas. Instead of measuring the
waveforms as generated by an oscilloscope, why not measure
the motion of the vocal cords themselves? Working with high-speed
motion pictures taken of the vocal cords via a dental mirror,
he measured the vibration and area of those cords and used
these data to compute spectral characteristics of the vocal
source.
The analysis supported new computer simulations of the interaction between
vocal cords and vocal tracts, using physiological factors
to form natural speech synthesis, an early example of what
we now call model-based coding. "People don't use this [type
of simulation] as a basis for a voice synthesizer yet," he
says, but he thinks they eventually will. "It's a frontier
challenge."
By the 1970s, Flanagan was still working on efficient transmission
of speech, but now it was in a digital world. Again, he turned
conventional methodologies upside down—or at least sideways.
Speech was being sent digitally through the telephone transmission
system by a method called pulse-code modulation, or PCM. A
PCM encoder samples signals at regular intervals and represents
the various amplitudes with binary numbers. Flanagan developed
a version of PCM that, instead of recording the amplitudes
themselves, encodes differences between successive samples
for transmission or storage, adapting the algorithm depending
on characteristics of the input signal.
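To make the idea concrete, here is a minimal sketch of differential coding with an adaptive step size. It is not Flanagan's algorithm or any standardized codec: the function names, the 4-bit code width, the step limits, and the grow and shrink factors (1.5 and 0.75) are all assumptions chosen for illustration. It shows only the two ingredients described above: sending the difference from a running prediction rather than the amplitude itself, and adjusting the quantizer to the signal.

```python
import math

# Illustrative limits on the quantizer step size (assumed values).
MIN_STEP, MAX_STEP = 1.0, 2048.0


def _adapt(step, code, bits):
    """Grow the step after a saturated code, shrink it after a small one."""
    top = 2 ** (bits - 1) - 1          # largest positive code, e.g. 7 for 4 bits
    if abs(code) >= top:
        step *= 1.5                    # signal is changing fast: coarser steps
    elif abs(code) <= 1:
        step *= 0.75                   # signal is changing slowly: finer steps
    return min(MAX_STEP, max(MIN_STEP, step))


def adpcm_encode(samples, bits=4, step=16.0):
    """Encode each sample as a small signed code for its difference
    from the value the decoder will have reconstructed so far."""
    lo, hi = -2 ** (bits - 1), 2 ** (bits - 1) - 1
    predicted, codes = 0.0, []
    for s in samples:
        code = max(lo, min(hi, int(round((s - predicted) / step))))
        codes.append(code)
        predicted += code * step       # mirror the decoder's state exactly
        step = _adapt(step, code, bits)
    return codes


def adpcm_decode(codes, bits=4, step=16.0):
    """Rebuild the waveform by accumulating the scaled differences."""
    predicted, out = 0.0, []
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = _adapt(step, code, bits)
    return out


def bit_rate(sample_rate_hz, bits_per_sample):
    """Raw channel rate in bits per second."""
    return sample_rate_hz * bits_per_sample


def demo_signal(n=400, amplitude=1000, period=40):
    """A test tone standing in for speech (hypothetical input)."""
    return [round(amplitude * math.sin(2 * math.pi * i / period)) for i in range(n)]


if __name__ == "__main__":
    tone = demo_signal()
    rebuilt = adpcm_decode(adpcm_encode(tone))
    error = sum(abs(a - b) for a, b in zip(tone, rebuilt)) / len(tone)
    print("mean absolute error:", error)
    print("8-bit PCM at 8 kHz:", bit_rate(8000, 8), "b/s")
    print("4-bit codes at 8 kHz:", bit_rate(8000, 4), "b/s")
```

Because the encoder updates its prediction with the same quantized value the decoder will see, the two stay in step without any side information. Assuming the 8 kHz sampling rate customary for telephone speech, halving the code width from 8 bits to 4 is what takes a channel from 64 kb/s to 32 kb/s.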
This adaptive differential PCM immediately doubled the efficiency
of conventional digital telephone transmission. It enabled
digital telephone channels that had required 64 kilobits per
second of bandwidth to run at 32 kb/s. A further advance in
this direction—coding in sub-bands—reduced the rate
to 16 kb/s and led to the first Audix voice mail system, a
product that eventually became a huge business for AT&T.
Along the way to that success, Flanagan was awarded an early
patent for packet transmission of speech, one form of which
we now call voice over IP (though the patent expired long
before the technology was commercialized).
In the 1980s, the research team was working on efficient speech coding
for cellphones. Lawrence Rabiner, now a professor of electrical
and computer engineering at Rutgers University, New Brunswick,
N.J., and the University of California at Santa Barbara, was
then a Bell Labs researcher under Flanagan. He recalls that
Flanagan immediately began asking how music could be coded
in a similarly efficient way. He assigned researchers to work
on the challenge, and members of that team later developed
the ubiquitous MPEG-1 Layer 3 audio coding format, known as
MP3.
"Every time we solved one challenge, he was way ahead of us with
the next challenge," says Rabiner. "He felt that his job was
to produce a steady stream of out-of-the-box thinking."
Flanagan climbed steadily up the ranks at Bell Labs, eventually becoming
director of the Information Principles Research Laboratory.
But while taking on management functions, he always continued
his own work [see photo, "In the Lab"].
"I liked the fact that you could influence the
directions of work you considered important," he says with
a hint of a Mississippi drawl, "but I've always tried to balance
my own work interests with facilitating what other people
are doing."
Among his other projects, he pushed ahead with work in automatic
speech recognition, even though it didn't have much support
from higher-ups. Eventually, he used a Data General Nova 16-bit
minicomputer to build a telephone reservation system for air
travel.
"The ultimate fruit of that didn't come until 1992," Cox told IEEE
Spectrum, "when Voice Recognition Call Processing was
put into the telephone network. Jim had promised that there
would be tangible results and persisted in spite of the skepticism
and resistance he met. And it paid off hugely for AT&T."
In his spare time, Flanagan helped his brother, also an electrical
engineer, run the cotton farm and a cattle ranch back in Mississippi.
Balancing his two worlds, he had the ideal career, Flanagan
says, except for six months in 1973.