Winner: Cure for the Multicore Blues (continued)
By Harry Goldstein

First Published January 2007

The RapidMind platform started as a language, called “Sh,” that McCool developed for graphics processors. The language grew out of an insight McCool had years ago—that the massively parallel computing provided by a graphics processor’s multiple cores can be used for things other than rendering pixels.

Recent results bear this out. Researchers at Hewlett-Packard, in Palo Alto, Calif., reported in November that a graphics processor programmed with the RapidMind platform executed an options-pricing program called the “Black-Scholes benchmark” 32.2 times as fast as a general-purpose CPU.

McCool, who exudes an endearingly geeky bravado and who still teaches computer science at the University of Waterloo, in Ontario, says putting the RapidMind platform in the hands of the people who need it the most was the best way to realize the full potential of his research. “It wasn’t really about us making lots of money—although that’s nice,” he says. “For me it was about cool technology and using it in the real world with real customers.”

So three years ago, he asked his research assistant Stefanus Du Toit to use the Sh language to create a programming platform for multicore processors. Together, McCool and Du Toit founded the company that would become known as RapidMind.

It took Du Toit about a year, but in the end, he and McCool had something good enough to show Matthew Monteyne, a former senior product manager with Waterloo’s most famous technology company, Research in Motion (RIM), maker of the BlackBerry wireless e-mail device. Monteyne, now vice president of sales and marketing at RapidMind, recruited his former boss, Ray DePaul, director of BlackBerry product management, to come on board as president and CEO. In McCool’s prototype, they both sensed an unusual opportunity.

“There hasn’t been a revolution in processors and how you program them since maybe object-oriented programming in the early ’90s,” DePaul says.

The introduction of a disruptive technology like multicore CPUs provides a great chance for small companies to pounce. “You don’t come into mature markets,” DePaul says. “You come in when there’s this whirlwind of activity, and the big guys are too focused on the current business that they can’t go after the new opportunity.”

McCool’s goal for a commercial product was simple enough: “I wanted to build something that I could teach in about 10 minutes, that you could use without mental overhead so you can focus on the algorithms, not the details of the particular processor,” he explains.

Programmers need to focus on devising parallel algorithms because RapidMind can’t write parallel algorithms for them. No software can. While there has been a lot of research into automatically parallelizing applications for programmers, no such system has been commercially viable. “People have been working on this for 20 or 30 years, and it doesn’t look like it’s a solvable problem,” McCool says.

That means programmers accustomed to writing serial algorithms must learn how to think about parallel algorithms. One of the benefits of working with the RapidMind platform is that users become familiar with a conceptual model of a parallel machine. “It’s similar enough to a real parallel machine that you can reason about what is an efficient way to implement an algorithm,” McCool says.

To write an application using RapidMind, the programmer first identifies the components to accelerate. These tend to be the numerically intensive operations. For instance, a chip running a game might spend a lot of time computing physical interactions between hundreds of thousands of objects, computations that would speed up tremendously if done in parallel. That’s in contrast to trivial operations such as tabulating the player’s score or processing input from a game-controller button or joystick.

The RapidMind platform is designed to be incorporated into any program written in C++, one of the most widely used programming languages in the world. Programmers write their programs in C++, using their favorite C++ editing and debugging programs, of which there are hundreds. Next they select the portion of the program to be accelerated and formulate the necessary parallel algorithms. Then they write code that expresses those algorithms.

Several features make the task easier. Like any modern high-level programming language, C++ has a library of commonly needed subroutines and functions, simplifying life for programmers. When they need one of those functions—sorting a set of numbers, say—they merely insert a word in their program that calls it up. However, while working with the RapidMind platform, instead of writing code using ordinary C++ terms that refer to subroutines and functions in a C++ library, the programmer uses words from RapidMind’s vocabulary that refer to subroutines and functions stored in the RapidMind library. These words call up subroutines and functions that execute in parallel. The programmer must specify the data sets that will be operated on in parallel, but the subroutines take it from there.
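The division of labor described above, in which the programmer names the data set and the operation while a library routine handles the parallel execution, can be sketched in standard C++. This is only an illustration under stated assumptions: `parallel_map` is a hypothetical helper invented here, not a function from the RapidMind library or the C++ standard library, and it splits the work into equal chunks across ordinary CPU threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (NOT part of RapidMind or the C++ standard library):
// applies `op` to every element of `input`, splitting the work across threads.
std::vector<float> parallel_map(const std::vector<float>& input,
                                const std::function<float(float)>& op,
                                unsigned num_workers = std::thread::hardware_concurrency()) {
    if (num_workers == 0) num_workers = 1;  // hardware_concurrency() may report 0
    const std::size_t n = input.size();
    std::vector<float> output(n);
    // Ceiling division: every worker gets at most `chunk` elements.
    const std::size_t chunk = (n + num_workers - 1) / num_workers;

    std::vector<std::thread> workers;
    for (unsigned w = 0; w < num_workers; ++w) {
        const std::size_t begin = std::min(n, w * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // no elements left to hand out
        // Each thread writes only its own slice of `output`, so no locking is needed.
        workers.emplace_back([&input, &output, &op, begin, end] {
            for (std::size_t i = begin; i < end; ++i) output[i] = op(input[i]);
        });
    }
    for (auto& t : workers) t.join();
    return output;
}
```

A caller would write something like `parallel_map(values, [](float x) { return x * x; })` and never mention threads or cores, which is the spirit of the approach the article describes.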

Programmers don’t need to know any of the specifics of the chip their software will run on. When the program starts up, the RapidMind platform determines whether it is running on a graphics processor, a Cell, or something else and translates the code that the programmer has written into code that the particular chip understands.

At the same time, the platform breaks up arrays of data into chunks that get doled out to however many cores are available on the target chip. The more cores, the more finely the chunks are chopped. To ensure that each core is working on something all the time, the system assigns data and tasks to cores on the fly, depending on which ones signal that they are free for the next piece of work. So, for example, while one core is churning through an especially complicated operation for a long time, its fellow cores can be kept busy with lots of simpler operations.
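One common way to hand out chunks on the fly is a shared counter that each worker increments whenever it is ready for more work. The sketch below shows that idea with CPU threads. It is an assumption about how such a scheme can be built in general, not a description of RapidMind's internals, and `scale_dynamic` and its parameters are hypothetical names.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical sketch: workers repeatedly claim the next unprocessed chunk
// from a shared atomic counter, so a worker that finishes early claims more.
// Multiplies every element of `data` by `factor` and returns how many chunks
// each worker ended up processing.
std::vector<std::size_t> scale_dynamic(std::vector<double>& data, double factor,
                                       std::size_t chunk_size, unsigned num_workers) {
    assert(chunk_size > 0 && num_workers > 0);
    const std::size_t num_chunks = (data.size() + chunk_size - 1) / chunk_size;
    std::atomic<std::size_t> next_chunk{0};
    std::vector<std::size_t> chunks_done(num_workers, 0);

    auto worker = [&](unsigned id) {
        for (;;) {
            // fetch_add hands each chunk index to exactly one worker.
            const std::size_t c = next_chunk.fetch_add(1);
            if (c >= num_chunks) return;  // nothing left to claim
            const std::size_t begin = c * chunk_size;
            const std::size_t end = std::min(data.size(), begin + chunk_size);
            for (std::size_t i = begin; i < end; ++i) data[i] *= factor;
            ++chunks_done[id];  // each worker touches only its own slot
        }
    };

    std::vector<std::thread> pool;
    for (unsigned id = 0; id < num_workers; ++id) pool.emplace_back(worker, id);
    for (auto& t : pool) t.join();
    return chunks_done;
}
```

How the chunks are divided among workers varies from run to run, since it depends on which thread happens to be free. The total number of chunks processed always equals the number of chunks the array was cut into.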

What the Experts Say
GORDON BELL: Computer scientists haven’t been interested in programming clusters. If putting the cluster on a chip is what excites them, fine. It will still have to run Fortran!

Without such dynamic load balancing, computationally intensive applications, including real-time ray tracing, are extraordinarily difficult to pull off. Real-time ray tracing is a technique that models the paths and effects of light as it interacts with various surfaces. Typically, millions of rays hit dozens or hundreds of objects, where the rays can be absorbed, reflected, or refracted. Of course, most of the rays miss the objects and keep going—events that McCool calls cheap operations because the path they trace remains the same. The expensive calculations, the ones that must be performed to determine the trajectory of a ray when it hits a drop of water, say, can require 100 times as much work on the processor’s part.

Because the RapidMind platform can dynamically allocate both cheap and expensive tasks, the ray-tracing application can take full advantage of the power of parallel processing to execute in real time. That’s because there are so many pixels whose color and shading need to be determined at any one time that all the processors can be occupied with computational tasks. Compare that to, say, an Intel Xeon dual-core chip running an operating system, a Web browser, and some desktop applications. Its two processors might sit idle half the time waiting for something to do. The RapidMind platform strives to ensure that no core—or clock cycle, for that matter—goes to waste.
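The cost of skipping dynamic allocation can be estimated with a small simulation. The sketch below compares two hypothetical schedules for a list of task costs: a static one that gives each core an equal, contiguous block of tasks, and a dynamic one that gives each task to whichever core frees up first. The function names and the cost model (work units of 1 for a cheap ray and 100 for an expensive one, following the 100-to-1 ratio mentioned above) are illustrative assumptions, not measurements of any real system.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Finish time when tasks are split into equal contiguous blocks, one per core.
long static_makespan(const std::vector<long>& costs, std::size_t cores) {
    assert(cores > 0);
    const std::size_t block = (costs.size() + cores - 1) / cores;
    long worst = 0;
    for (std::size_t c = 0; c < cores; ++c) {
        const std::size_t end = std::min(costs.size(), (c + 1) * block);
        long load = 0;
        for (std::size_t i = c * block; i < end; ++i) load += costs[i];
        worst = std::max(worst, load);  // the slowest core sets the finish time
    }
    return worst;
}

// Finish time when each task, in order, goes to whichever core frees up first.
long dynamic_makespan(const std::vector<long>& costs, std::size_t cores) {
    assert(cores > 0);
    std::vector<long> busy_until(cores, 0);
    for (long cost : costs) {
        auto first_free = std::min_element(busy_until.begin(), busy_until.end());
        *first_free += cost;
    }
    return *std::max_element(busy_until.begin(), busy_until.end());
}
```

With 400 tasks on four cores, where the first 8 tasks cost 100 units and the rest cost 1, the static split finishes at 892 units because one core receives all the expensive work. The dynamic schedule finishes at 298 units, which is the total work of 1192 divided evenly across the four cores.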

