What does the mortgage crisis have to do with
microprocessor architecture? It turns out that
calculating prices for those financially dubious
mortgage-backed securities is a division-intensive
process, and division has long been the weak link in a
microprocessor’s arithmetic operations. With Intel’s
new crop of 45‑nanometer processors, code-named
Penryn, the company is making the first substantial
upgrade in its processors’ divider since the original
Pentium came out in 1993. The speedup doubles the number
of bits calculated with each tick of the processor’s
clock and will make a substantial difference to
financial and scientific computing. And because Intel
powers so much of the computer market, the development
could tempt programmers to retreat from the less
accurate but faster software tricks they’ve used as a
substitute for division.
“Divide had become the long pole in the tent,” says
Steve Fischer, the lead architect for Penryn. “We tried
at least to chop the pole in half. It’s still long
compared to some functions. But it’s a lot better.”
The new divider is a variation on the old one, known
as SRT Radix-4. SRT (for Sweeney, Robertson, and Tocher)
is basically a souped‑up version of long division that
generates two bits of the answer with each step. The new
Radix‑16 divider works fundamentally the same way but
computes four bits in each step. Getting those other
two bits was no simple task. “I would say that the
divider really pushed the edge in terms of the max clock
frequency performance for Penryn,” says Fischer. Of all
the processor’s new architectural tricks, the divider
was most dependent on the chip’s 45-nm features and
redesigned transistors [see “The
High-k Solution,” IEEE Spectrum, October
2007].
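The effect of producing more quotient bits per step can be seen in a minimal sketch. The Python function below is not SRT itself: a real SRT divider selects each quotient digit from a redundant digit set by looking at only a few leading bits, whereas this hypothetical `digit_recurrence_divide` uses plain trial subtraction on integers. It only illustrates that retiring four bits per step instead of two halves the number of steps.

```python
def digit_recurrence_divide(dividend, divisor, quotient_bits, bits_per_step):
    """Long division in radix 2**bits_per_step (illustrative, not SRT).

    Returns (quotient, remainder, steps). All names here are hypothetical.
    """
    if divisor <= 0 or not 0 <= dividend < (1 << quotient_bits):
        raise ValueError("need divisor > 0 and 0 <= dividend < 2**quotient_bits")
    if quotient_bits % bits_per_step != 0:
        raise ValueError("quotient_bits must be a multiple of bits_per_step")

    radix = 1 << bits_per_step
    quotient = remainder = steps = 0
    # Walk the dividend from its most significant digit group downward.
    for shift in range(quotient_bits - bits_per_step, -1, -bits_per_step):
        # "Bring down" the next group of dividend bits.
        remainder = remainder * radix + ((dividend >> shift) & (radix - 1))
        # Pick the quotient digit by trial subtraction (hardware would
        # use a small selection table instead of this loop).
        digit = 0
        while remainder >= divisor:
            remainder -= divisor
            digit += 1
        quotient = quotient * radix + digit
        steps += 1
    return quotient, remainder, steps


# Radix-4 retires 2 bits per step, radix-16 retires 4 bits per step.
radix4 = digit_recurrence_divide(1000003, 97, 24, 2)
radix16 = digit_recurrence_divide(1000003, 97, 24, 4)
print(radix4[2], radix16[2])
```

For a 24-bit quotient the radix-4 loop runs twice as many steps as the radix-16 loop, and both return the same quotient and remainder.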
The new divider is a nod to the importance of
scientific and financial calculations, which require
precise manipulation of large floating-point
numbers—a standard number format that includes a sign,
an exponent, and a fraction, all in a 32-, 64-, or
80‑bit package. “Sometimes software has avoided the
use of the divide in [favor of] a look-up table or some
approximation. Scientific work can’t rely on that,” says
Fischer.
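As a rough illustration of that sign, exponent, and fraction layout, the sketch below takes apart a 64-bit value. It assumes the IEEE 754 double-precision encoding (1 sign bit, 11 exponent bits, 52 fraction bits), which the article does not name; the helper `decompose_double` is hypothetical.

```python
import struct


def decompose_double(value):
    """Split a 64-bit float into (sign, biased_exponent, fraction) fields.

    Assumes the IEEE 754 binary64 layout: 1 + 11 + 52 bits.
    """
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    sign = bits >> 63
    biased_exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, biased_exponent, fraction


print(decompose_double(1.0))
print(decompose_double(-2.0))
```

The exponent is stored with a bias of 1023, so 1.0 carries a stored exponent of 1023 and an all-zero fraction.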
Peter Markstein, a retired computer-arithmetic expert
who worked on the floating-point units for the Intel and
HP Itanium architecture and the IBM Power
architecture, thinks the new division rate might
influence how software is written. “People who use the
Intel architecture will, I think, be more inclined to
use division and not look for ways to avoid it,” he
says. Because they’ll be taking fewer inexact
shortcuts, computer simulations and other scientific
programs could come up with better answers. (The Power
and Itanium architectures do division with software that
relies on a circuit called a fused multiply adder.)
Penryn’s floating-point divider pulls it ahead of the
division scheme used in processors made by its main
rival, Advanced Micro Devices, for 32-bit quotients.
But Penryn only matches AMD’s divider for 64-bit
numbers, according to Chuck Moore, chief engineer for
AMD’s next generation of processors. Since its Athlon
chip debuted in 1999, AMD has been using a technique
called convergence. Unlike SRT, which calculates quotient
bits at a steady pace, convergence operates at an
accelerating pace, says Debjit Das Sarma, principal
member of the technical staff at AMD. Though convergence
takes more clock cycles than SRT Radix-16 to get to 32
bits, it takes fewer cycles to go from 32 bits to 64 and
would take fewer still to go to 80 or beyond.
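The accelerating pace can be seen in a small sketch. The article does not say which convergence method AMD uses, so the code below assumes Newton-Raphson reciprocal refinement, one common member of that family, and uses exact fractions rather than hardware arithmetic. The function name and the seed value are hypothetical.

```python
from fractions import Fraction
import math


def reciprocal_by_convergence(divisor, seed, iterations):
    """Refine an estimate of 1/divisor with Newton-Raphson steps.

    Returns (estimate, bits_history), where bits_history records roughly
    how many bits of the estimate are correct after each iteration.
    """
    d = Fraction(divisor)
    x = Fraction(seed)
    bits_history = []
    for _ in range(iterations):
        # Each step squares the relative error 1 - d*x.
        x = x * (2 - d * x)
        error = abs(1 - d * x)
        bits_history.append(math.inf if error == 0 else -math.log2(error))
    return x, bits_history


# A seed of 5/16 approximates 1/3 with a relative error of 2**-4.
estimate, bits = reciprocal_by_convergence(3, Fraction(5, 16), 4)
print(bits)
print(float(7 * estimate))  # 7 / 3 computed as 7 * (1/3)
```

Because each iteration squares the error, the count of correct bits doubles each time. That is why, in this sketch, going from 32 correct bits to 64 takes a single extra iteration.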
In future processor architectures such as Bulldozer,
due out by 2010, AMD does not expect the number of clock
cycles required to finish a floating-point division to
change much. But the company is going for “a
substantial improvement” in the number of those
divisions the processor can work on at once, says Moore.
Despite some dedicated effort by the two Silicon
Valley rivals, “nobody is satisfied with these
division times,” says David Matula, a professor of
computer science at Southern Methodist University, in
Dallas, who has consulted for and competed against AMD
in the past. He believes that makers of scientific
software would be satisfied only if division took no
more than twice as long as multiplication. Still, “I’m
glad Intel is in the game again,” he says.