November 14th, 2006
SUPERCOMPUTING: A NEW WHO'S WHO
Today, Associate Editor Erico Guizzo looks at the latest contestants in the race for the title of World's Fastest Computer. The winner may not be a new name, he notes, but its challengers are coming up fast on its heels.
Erico Guizzo
The folks over at the TOP500 supercomputers ranking project officially released the latest edition of their findings today at the SC06 high-performance computing conference in sunny Tampa, Fla. The release of the new TOP500 list, which comes out twice a year, has become a highly anticipated event—if you happen to be part of the high-performance computing community, that is. The biggest news this time is a major shakeup in the top 10 machines, which shows "how the field remains both constant and constantly changing," the TOP500 organizers said. Here are some highlights.
The No. 1 system, the mighty IBM Blue Gene/L at the U.S. Department of Energy's Lawrence Livermore National Laboratory, in Livermore, Calif., retained the top spot. With a performance of 280.6 teraflops (trillions of floating-point operations per second), it should remain there for a while. (Just consider that the No. 2 machine is not even half as powerful.) The IBM monster has been the No. 1 system since November 2004 (see "IBM Reclaims Supercomputer Lead"), when it took the top spot from Japan's famed Earth Simulator, which had reigned supreme for two and a half years and has now slid down to the No. 14 position. The Blue Gene system relies on custom building blocks, with two PowerPC processors, memory, communications functions, and extra circuitry to speed up floating-point operations. IBM's competitors, meanwhile, are making good progress using conventional AMD and Intel processors and industry-standard networking systems like gigabit Ethernet, Myrinet, and InfiniBand. So will Blue Gene stay at the top? Time will tell. Or as Jack Dongarra, one of the TOP500 organizers, told Spectrum last year, the number of teraflops of a machine is just "a trophy," and "we all know that that trophy won't last forever."
The No. 2 machine, Sandia National Laboratories' Cray Red Storm, seems to prove that upgrades can sometimes pay off handsomely. The upgrade was from single- to dual-core processors. The machine ranked at the No. 9 spot six months ago, at 36.19 teraflops, using 10 880 2.0-GHz AMD Opteron processors. Now, with 26 544 2.4-GHz dual-core Opteron processors, its performance has skyrocketed to 101.4 teraflops, making it the second machine ever to break the 100-teraflop barrier. The feat also shows that veteran supercomputer maker Cray is in the game with some serious computing offerings. Red Storm was an important machine for Cray, because the company based its next-generation systems, the Cray XT3 and XT4, on that machine's architecture. That means using mostly off-the-shelf parts like commodity processors, in combination with some custom systems like its SeaStar networking chips and high-speed 3-D interconnect systems. With that architecture, Cray has recently embarked on a new strategy, which it calls "adaptive supercomputing": the combination of different processing architectures (scalar, vector, multithreaded, hardware accelerators) into a single system. Cray's mantra is to adapt the system to the application, not the application to the system. "We're following the vision we laid out a year ago and it's becoming real, so we're pretty excited about it," Steve Scott, Cray's CTO, told me this week during a briefing.
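For readers who like to check the arithmetic, here is a minimal Python sketch of what the Red Storm upgrade amounts to. It uses only the teraflops figures quoted in this post; the function name `ratio` is made up for illustration and is not part of any TOP500 tool.

```python
# Back-of-the-envelope check using only figures quoted in this post.
# The function name is hypothetical, chosen just for this sketch.

def ratio(numerator_tflops: float, denominator_tflops: float) -> float:
    """Return one teraflops figure divided by another."""
    return numerator_tflops / denominator_tflops

RED_STORM_BEFORE = 36.19   # teraflops, six months ago
RED_STORM_AFTER = 101.4    # teraflops, after the dual-core upgrade
BLUE_GENE_L = 280.6        # teraflops, the No. 1 system

# How much faster Red Storm became after the upgrade.
upgrade_speedup = ratio(RED_STORM_AFTER, RED_STORM_BEFORE)

# Red Storm's performance as a fraction of the No. 1 machine's.
share_of_leader = ratio(RED_STORM_AFTER, BLUE_GENE_L)

print(f"Upgrade speedup: {upgrade_speedup:.2f}x")
print(f"Share of Blue Gene/L performance: {share_of_leader:.0%}")
```

By these numbers the upgrade made Red Storm roughly 2.8 times as fast, which still leaves it at a bit over a third of Blue Gene/L's performance.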
The interesting thing about the No. 5 machine, MareNostrum at the Barcelona Supercomputing Center in Spain, is not just that it resides inside what was once a church in Barcelona. (Okay, that is interesting too.) But perhaps more important is that this machine is a cluster of blades, a server technology that has been responsible for what some experts call a "quiet revolution in the server room" (see "Blades Have The Edge"). This massive computer cluster uses 2560 IBM blade servers in 44 racks, which take up about 120 square meters inside the old church (call it a "mass" of blades). At 62.63 teraflops, MareNostrum is now the largest system in Europe.
In the No. 9 position comes the fast and furious Tsubame supercomputer of the Tokyo Institute of Technology. Another upgrade here. But this machine got some fancy additions: ClearSpeed number-crunching acceleration boards. Developed by English chip-design firm ClearSpeed Technology, in Bristol, these computer boards have two massively parallel floating-point coprocessors designed to accelerate math-intensive applications. The board fits the PCI-X slot available in many PCs, servers, and workstations, and it sustains 50 billion floating-point operations per second (50 gigaflops) while dissipating only 25 watts. The idea is that you would put one or more boards in a server and then hook up many servers to get a cheaper supercomputer. A while ago, when I asked a prominent supercomputing expert about this idea, he was skeptical: "I have heard this story before. The proof is in the performance." Now, ClearSpeed is showing the numbers. The company brags that its boards boosted Tsubame's performance to 47 teraflops from the non-accelerated performance of 38 teraflops, a 24 percent boost with only a 1 percent increase in energy consumption. The supercomputer, assembled by NEC using Sun Microsystems servers equipped with AMD Opteron processors, is now the fastest in Japan. About the machine, ClearSpeed's CEO Tom Beese told me this past June: "Given the challenge I have of putting a PC at home, I was amazed that they could bring up such a powerful system so fast."
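ClearSpeed's numbers are easy to sanity-check. The short Python sketch below recomputes the quoted boost and the board's performance per watt, using only the figures given above. The function and variable names are invented for this illustration, and the per-watt figure simply divides the quoted 50 gigaflops by the quoted 25 watts; it says nothing about the efficiency of the whole system.

```python
# Sanity check of the ClearSpeed figures quoted in this post.
# All names here are hypothetical, chosen for this sketch only.

def percent_gain(base: float, boosted: float) -> float:
    """Return the increase from base to boosted, as a percentage of base."""
    return (boosted - base) / base * 100.0

TSUBAME_BASE_TFLOPS = 38.0     # non-accelerated performance
TSUBAME_BOOSTED_TFLOPS = 47.0  # with ClearSpeed boards

BOARD_GFLOPS = 50.0  # sustained gigaflops per board, as quoted
BOARD_WATTS = 25.0   # power dissipated per board, as quoted

gain = percent_gain(TSUBAME_BASE_TFLOPS, TSUBAME_BOOSTED_TFLOPS)
gflops_per_watt = BOARD_GFLOPS / BOARD_WATTS

print(f"Performance gain: {gain:.1f} percent")
print(f"Board efficiency: {gflops_per_watt:.1f} gigaflops per watt")
```

The gain works out to about 23.7 percent, which rounds to the 24 percent the company cites, and the board's quoted figures come to 2 gigaflops per watt.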