Cracking GO Continued
By Feng - Hsiung Hsu
First Published October 2007
IMAGE: Bryan Christie Design
|
|
|
The reason has to do with the nature of the accounting
system, which consists of a map of on/off bits that
tracks the “to” and “from” squares of each chess piece.
The program uses this bitmap to decide whether anything
has happened to invalidate the graft—for instance, by
making the winning move in a grafted branch illegal or
providing the losing side with a way out of a sequence
of forced moves. It turns out in chess that if grafted
branches contain more than one ply of nonforcing moves,
the bitmaps will quickly cover most of the board, the
accounting system will become unmanageable, and the
grafting operation will fail.
In Go, however, the method of analogy should be much
more useful. Because the board is so large (19 by 19
versus 8 by 8 in chess), a battle typically occurs in a
relatively small part of it, so the bitmaps will mostly
have “off” bits, making it more likely for them to be
useful. Also, the program can generally reuse the maps
many more times than in chess, because each of the many
local battles tends to be unaffected by battles
elsewhere. Therefore, the program should be able to
graft deep branches—the kind needed to decide
life-and-death questions—from one part of the game tree
to another.
This ability to answer life-and-death questions
cheaply is vital if the brute-force approach is to work.
To determine whether a group of pieces will live or die
the program may have to search from 1000 to 1 000 000
positions. That wouldn't be so bad, really, if it were
the extent of the problem. It isn't. In a typical game,
we may easily have more than 10 such problems on the
board at the same time, and the status of one group can
affect that of its neighbors—like a cowboy who points
a revolver at another cowboy only to find himself
covered by a rifleman on a roof. Such interactions can
complicate the problem by something on the order of
taking 1 million to the 10th power—enough to stretch a
calculation lasting a microsecond into one vastly
dwarfing the age of the universe.
This is where the bitmaps we mentioned earlier come to
the rescue. They make it easy to tell when maps do and
do not intersect and also allow caching to work, thereby
drastically reducing the cost of dynamic search required
for proper evaluation of positions. It is conceivable
that with caching techniques, including but not limited
to the method of analogy, it may take no more than 1000
to 1 000 000 nodes (or one individual life-and-death
decision tree) of dynamic search to properly evaluate an
end position. Although that's more expensive than in the
case of chess, it's manageable.
What, then, can we
expect from the hardware? Deep Blue used
0.6-micrometer CMOS technology, kind of creaky even in
1997. Each of its 480 custom-designed processors
searched up to 2.5 million positions per second. The
theoretical peak speed was more than 1 billion positions
per second, but the sustained speed was only 200 million
positions per second because of communication overhead,
load-balancing issues, and implementation inefficiency.
Today 45-nanometer process technology is just getting
into production. With it, a machine searching as fast as
Deep Blue could easily fit on a single chip. In fact,
with gains expected from technology and from
optimization of chip architecture, a single-chip machine
could actually be more than 100 times as fast as Deep
Blue. If we then made 480 copies of that monster chip
and integrated them all in a parallel architecture, we
could get at least another 100-fold increase in
computational power. On top of that, in 10 years Moore's
law is expected to present us with still another
100-fold speedup.
Put it all together and you should be able to build a
machine that searches more than 100 trillion positions
per second—easily a
million times as fast as Deep Blue.
That would be enough to build a tree of analysis for
Go as big as Deep Blue's was for chess and to evaluate
all its end positions properly. If we assume the top Go
players calculate about as deeply as the top chess
players do, the result should be a machine that plays Go
as well as Deep Blue played chess.
Well enough, that is, to beat any human player.
My gut
feeling is that with some optimization a
machine that can search a trillion positions per second
would be enough to play Go at the very highest level. It
would then be cheaper to build the machine out of FPGAs
(field-programmable gate arrays) instead of the much
more expensive and highly unwieldy full-custom chips.
That way, university students could easily take on the challenge.
At Microsoft Research Asia we are seeding university
efforts in China with the goal of solving some of the
basic problems. Whether these efforts lead to a
world-champion Go machine in the next decade remains to
be seen. I certainly wouldn't bet against it.
About the Author
FENG-HSIUNG HSU earned a Ph.D. in computer science
at Carnegie Mellon University, Pittsburgh, where he
and fellow students designed the first
grandmaster-level chess machine. Then he moved to
IBM to develop its successor, Deep Blue, which beat
World Champion Garry Kasparov in 1997. Hsu now
manages the platforms and devices center of
Microsoft Research Asia, in Beijing.
To Probe Further
For a full account of the IBM project to build a
chess machine, see Behind Deep Blue: Building
the Computer That Defeated the World Chess
Champion, by Feng-hsiung Hsu, Princeton
University Press, 2004.
To experiment with a Go program, readers can
download GNU Go at http://www.gnu.org/software/gnugo.
Offered by the Free Software Foundation, in Boston, this
free program has performed well in recent computer Go events.