Illustration: Patrick Kalyanapu
Sometime next year, if all goes as planned, the largest
scientific instrument ever built will come to life in a
labyrinthine underground complex in Switzerland, near
Geneva. Buried more than 100 meters down, the Large
Hadron Collider (LHC) will send two beams of protons in
opposite directions around a 27-kilometer-long circular
tunnel. The beams, whizzing at nearly the speed of
light, will collide head-on, producing a shower of
subatomic fragments that scientists expect will include
exotic, never-before-seen particles that could change
our fundamental knowledge of the universe.
That's the hope, anyway. Researchers at the European
Organization for Nuclear Research (CERN), which will
operate the LHC, know that spotting the elusive bits of
matter they are looking for will be a daunting task. To
find them, the researchers will have to sift through a
colossal haystack of collision data: the LHC is expected
to spew out some 15 million gigabytes a year—on
average, that's more than enough to fill six standard
DVDs every minute.
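The DVD comparison is easy to check with back-of-the-envelope arithmetic. The sketch below assumes a "standard DVD" means a 4.7-GB single-layer disc and that a year has 365 days; neither figure is stated in the article, and the function name is purely illustrative.

```python
# Rough check of the "six DVDs every minute" comparison.
GB_PER_YEAR = 15_000_000          # expected LHC output, from the article
DVD_CAPACITY_GB = 4.7             # assumption: single-layer DVD
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: 365-day year


def dvds_per_minute(gb_per_year, dvd_gb=DVD_CAPACITY_GB):
    """Average number of DVDs filled per minute at a given yearly data rate."""
    gb_per_minute = gb_per_year / MINUTES_PER_YEAR
    return gb_per_minute / dvd_gb


rate = dvds_per_minute(GB_PER_YEAR)
print(f"{rate:.2f} DVDs per minute")
```

Under those assumptions the average comes out to a little over six discs a minute, or roughly 28.5 GB every minute, in line with the figure quoted above.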
Storing and analyzing the mountain of data, it turns
out, is a task that no supercomputer in the world can
handle. So while the LHC team rushes to finish the
mammoth subterranean machine, above ground another group
of physicists and computer scientists has been solving a
problem of its own: assembling a computing
infrastructure able to handle the LHC's data deluge. Their
solution? A vast collection of high-powered computer
systems scattered in nearly 200 research centers around
the world, networked and configured to function as a
single parallel processing system. This type of
infrastructure is known as a computing grid.
Computing grids emerged in the late 1990s as an
alternative to traditional supercomputers to solve
certain problems demanding powerful number crunching and
access to large amounts of distributed data. The idea
was that with sufficiently fast networks and the right
software, multiple and geographically dispersed research
groups could pool their computing and data management
resources into a unified system capable of tackling
problems that would be out of reach for any of them
alone. Such grids, those early researchers hoped, would
do for serious computing power what electricity grids
did for electricity: make it available everywhere. Just
plug your PC into a computing grid and you'd have
instant access to supercomputing power at an affordable cost.
We are not quite there yet. Today, although grids have
sprung up all over the place, most of them are
specialized systems available to only a small cadre of
researchers in fields such as high-energy physics,
genome research, and earthquake monitoring. How, then,
can we turn grids into an everyday research tool that
can energize a wider range of scientific and technical pursuits?
That is the question CERN and its partner
universities, research agencies, and companies—most of
them in Europe but some in the United States, Asia, and
Latin America—hope to answer by building on the
experience of the LHC grid to create a massive global
grid infrastructure. Led by CERN, the group wants to
transform this new global grid into a tool capable of
solving a great variety of problems in science,
engineering, and industry.
The initiative, funded by the European Union, is
called Enabling Grids for E-sciencE (EGEE). Behind the
awkward acronym lies an ambitious effort [see illustration,
“Going Global”]. The EGEE grid now combines the
processing power of more than 20 000 CPUs, a storage
capacity of about 5 million GB—growing rapidly in
anticipation of the LHC data—and a global network
connecting some 200 sites in such places as Paris,
Moscow, Taipei, and Chicago. The grid is already
crunching test data for the LHC experiments [see
sidebar, “”] and also for dozens of applications
in such areas as astrophysics, medical imaging,
bioinformatics, climate studies, oil and gas
exploration, pharmaceutical research, and financial
forecasting. It's now the world's largest
general-purpose scientific computing grid, and it's
getting bigger every month.