Grids seem to be a natural evolution in the history of
distributed computing. If you look at some of the early
supercomputers, these machines were refrigerator-size
cabinets that would divide the computing workload among
multiple processors. Then came clusters, groups of
relatively cheap, off-the-shelf computers, typically PCs
running Linux, that would form large parallel processing
systems sprawling across entire rooms, buildings, or
even campuses. But with computer networks becoming
faster and cheaper, some researchers figured that an
even more diverse and dispersed union of machines would
be possible.
These researchers envisioned an infrastructure that,
unlike supercomputers or clusters, would be owned,
managed, and used by multiple organizations. And instead
of a monolithic hardware and software design, this
infrastructure would run on a mix of operating systems,
file systems, and networking technologies. Thus emerged
the idea of grid computing, and a group of researchers
went to work on realizing that vision. Ian Foster of the
Argonne National Laboratory, in Illinois, and Carl
Kesselman of the University of Southern California, in
Los Angeles, were among the pioneers, and in 1998 they
published The Grid: Blueprint for a New Computing
Infrastructure (Morgan Kaufmann), a book that became an
instant bible for the new field.
At least in theory, grids would include all kinds of
systems: supercomputers, giant clusters, and desktop
PCs, as well as storage devices, databases, sensors, and
scientific instruments. But although many grid projects
are evolving in that direction, most still amass a more
homogeneous collection of systems.
The EGEE grid consists mainly of multiple clusters of
PCs—some institutions have a dozen machines, others
thousands—connected to “farms” of disk servers and
specialized magnetic tape silos that are used for backup
and long-term storage of data. The grid relies on the
Internet and on high-speed, dedicated research networks
to distribute computing tasks among the clusters owned
by different parties, just as if the machines were
sitting in the same room. Of EGEE's networks, the
most important is called Géant2, an academic fiber-optic
backbone that links 34 European countries and has
connections to similar research networks elsewhere in
the world.
The sheer computational power of some supercomputers
today still exceeds EGEE's, although that could change
someday, depending on how fast this grid grows and how
quickly it can join forces with other major national and
international grid efforts. For example, the Open
Science Grid, a U.S. initiative that already links a
large number of data centers in more than 50
institutions, mirrors in many ways the European Union's
grid initiative and should be interoperable with EGEE
soon. In Japan, a project called the National Research
Grid Initiative is developing a grid infrastructure for
science very similar to EGEE, and collaboration between
the projects has already started.
But the fact is that such grids, even joined together,
won't replace supercomputers quite yet. Some
problems—certain types of climate simulations, for
instance—involve calculations that are so intertwined
that a supercomputer's multiple processors need to
exchange data at dazzlingly fast speeds, a capability
difficult to achieve with grids. Grids like EGEE aim at
what's called high-throughput computing—dealing with
large amounts of similar but independent calculations.
In other words, the applications that benefit most from
grids are those that can be chopped into many smaller
pieces and processed in parallel.
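
The high-throughput pattern described above can be illustrated with a minimal sketch. It is not EGEE's middleware or any real grid API: the function names (`split_into_chunks`, `process_chunk`, `run_high_throughput`) are hypothetical, the workload (a sum of squares) is a toy stand-in for a real calculation, and a local thread pool merely stands in for clusters spread across institutions. The point is only that each piece needs no data from any other piece, so the pieces can run anywhere, in any order, and be merged at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(start, stop, n_chunks):
    """Split the half-open range [start, stop) into n_chunks contiguous pieces."""
    size, extra = divmod(stop - start, n_chunks)
    chunks = []
    lo = start
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        hi = lo + size + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def process_chunk(chunk):
    """One independent job: it reads nothing produced by any other chunk."""
    lo, hi = chunk
    return sum(x * x for x in range(lo, hi))


def run_high_throughput(start, stop, n_chunks, n_workers=4):
    """Farm the chunks out to workers, then merge the partial results."""
    chunks = split_into_chunks(start, stop, n_chunks)
    # Assumption: a local thread pool plays the role of remote grid nodes.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(run_high_throughput(0, 1_000_000, n_chunks=16))
```

A tightly coupled problem, such as the climate simulations mentioned earlier, would not fit this shape: each `process_chunk` call would have to exchange data with its neighbors at every step, and the speed of the network between machines would dominate the running time.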
Although grids remain mostly an academic research
tool, they are making inroads into the corporate world.
Many large firms, after investing in technology for
e-business, customer relationship management, and supply
chain systems, are now putting computing grids high on
their acquisition lists. Early adopters include
financial institutions, some of which are using grids to
perform sophisticated risk analysis, and pharmaceutical
companies, which are using grids to study the effects of
new drugs.
With an eye on this market, all the major computer
vendors, including Hewlett-Packard, IBM, Microsoft, and
Sun, now offer hardware and software that enable
servers, PCs, and mainframes to tap into the power of a
grid. Useful though such commercial offerings are, they
can't yet manage the number of computers, networks, and
systems in a massive grid like EGEE. What's more,
different vendors ended up developing different grid
technologies. So unlike the Web—developed at CERN,
incidentally—which is based on a common set of
standards, existing grids are based on a wide variety of
technologies. The field, it seems, is too immature for
any one company to risk launching products and services
on a large scale. Industry in this case has adopted a
wait-and-see policy, letting the academic community take
the lead, as has often happened with emerging technologies.
Indeed, this is why major grid projects still need
public funding. International grid initiatives such as
EGEE aim to take the grid model one step further by
developing and testing computing, networking, and
security technologies and pushing for common standards.
This complements the efforts of international standards
bodies, such as the Global Grid Forum, which champions
technological convergence and interoperability.
But despite such efforts toward standardization, the
term grid computing, as is often the case with
technological buzzwords, has come to mean different
things to different people. A source of confusion is the
concept of on-demand, or utility, computing created by
computer vendors. The idea is that a customer needing
extra processing power can tap into a vendor's data
center, paying for what it uses. This is an interesting
concept, but it doesn't strictly require grid technology.
And then there are systems for scavenging computing
power such as SETI@home, the popular screen saver that
relies on ordinary people's PCs to search
radio-astronomy data for signs of extraterrestrial
intelligence. Considering that SETI@home has been
downloaded onto more than 5 million PCs around the
globe, the ambitions of a project such as EGEE to
federate 100,000 computers seem modest by comparison.
But the similarities belie major differences. The sort
of scientific software running on the EGEE grid involves
complex configurations and requires reliable and secure
data transfers that casually connected PCs cannot provide.