Thanks to an authoritative U.S.-Canada report, we now know that negligence by a utility in Ohio and lax oversight by a rookie regulator precipitated the blackout that darkened much of the North American upper Midwest and Northeast a year ago. Paradoxically, however, when the same remarkable event is seen in a wider historical and statistical perspective, it is no less natural than a sizable earthquake in California. Major outages occurred in the western U.S. grid just eight years ago. And last fall, electric power systems collapsed in Denmark, Italy, and the United Kingdom within weeks or months of the U.S. blackout.
The 14 August 2003 blackout may have been the largest in history, zapping more total wattage and affecting more customers than any before, but if history is any guide, it won't be the last. "These kinds of outages are consistent with historical statistics, and they'll keep happening," says John Doyle, professor of control and dynamical systems, electrical engineering, and bioengineering at the California Institute of Technology in Pasadena. "I would have said this one was overdue."
"We will have major failures," agrees IEEE Fellow Vijay Vittal, an electrical engineering professor at Iowa State University in Ames, who is an expert on power system dynamics and control. "There is no doubt about that."
The numbers on blackouts bear out this fatalism. Extrapolating from the small outages that occur frequently, one might expect a large power grid to collapse only once in, say, 5000 years. But between 1984 (when North American utilities began to systematically report blackouts) and 2000, utilities logged 11 outages affecting more than 4000 megawatts--making the probability of any one outage 325 times greater than mathematicians would have expected. Thus, statistically speaking, the blackout on 14 August, which, according to the U.S. Department of Energy, cost between US $4 billion and $6 billion, was no anomaly [see graph, "Only Too Likely"].
Only Too Likely:
Work at Carnegie Mellon University shows that the likelihood of large failures is greater than one would expect on the basis of extrapolations from small failures. The brown curve is fit to actual outages that affected more than 500 megawatts of power; the blue curve is an exponential distribution fit to failures smaller than 800 MW. The silhouette in the background is of the New York City skyline.
In the mid-1990s--well before FirstEnergy in Akron, Ohio, got sloppy with its tree-trimming and monitoring systems last summer--mathematicians, engineers, and physicists set out to explain the statistical overabundance of big blackouts. Two distinct models emerged, based on two general theories of systems failure.
One, an optimization model, championed by Caltech's Doyle, presumes that power engineers make conscious and rational choices to focus resources on preventing smaller and more common disturbances on the lines; large blackouts occur because the grid isn't forcefully engineered to prevent them. The competing explanation, hatched by a team connected with the Oak Ridge National Laboratory in Tennessee, views blackouts as a surprisingly constructive force in an unconscious feedback loop that operates over years or decades. Blackouts spur investments to strengthen overloaded power systems, periodically counterbalancing pressures to maximize return on investment and deliver electricity at the lowest possible cost.
Which of these models better explains the mechanism behind large blackouts is a matter of intense--sometimes even bitter--debate. But their proponents agree on one thing: the brave, can-do recommendations of the U.S.-Canada task force report won't eliminate large blackouts. If either conscious optimization or unconscious feedback sets up power systems to fail, then large cascading blackouts are natural facets of the power grid. Stopping them will require that engineers fundamentally change the way they operate the power system. "I don't think there are simple policy fixes," says Doyle.
Of course, the very idea of accepting the inevitability of blackouts is utterly rejected by utility officials and politicians. Certainly the mainstream view among power system engineers continues to be that the answer to reliability problems is to make the grids more robust physically, improve simulation techniques and computerized real-time controls, and improve regulation. What the systems theorists suggest is that even if all that is done and done well--as, of course, it should be--the really big outages still will happen more often than they should.
The suspicion that nasty surprises lurk in the inner workings of power grids began to take shape in the early 1980s with the growth of research into nonlinear systems, a field that became known as chaos theory. The term was a misnomer, for chaos experts were describing layers of order hidden in the apparent disorder of everything from turbulent fluids to celestial mechanics.
In November 1982, a pair of mathematicians made one of the first attempts to apply chaos theory to power grids. Nancy Kopell, at that time a nonlinear dynamics expert at Northeastern University in Boston, and Robert Washburn, a mathematician and chief scientist with Alphatech Inc., a Boston-based systems-engineering consulting firm, were novices to electrical power systems. But what they found revolutionized thinking about power system behavior.
Kopell and Washburn's insight was to recognize that the differential equations used to describe the dynamic interactions of power generators on a grid--known as swing equations, which remain a critical tool for power system modelers--resemble the equations developed by the 19th-century mathematician Henri Poincaré to describe the gravitational interplay among celestial bodies. Adapting Poincaré's techniques, Kopell and Washburn managed to model more accurately the behavior of a simple grid with three generators--two large and one small.
The results were analogous to what Poincaré found when he considered the behavior of two large bodies and a third that is relatively small. In that case, tiny shifts in the relative position and motion of the large bodies dramatically altered the trajectory of the third. In modern parlance, we'd say that Poincaré's system is chaotic. Kopell and Washburn observed the same behavior in their three-machine power grid in response to simulated faults on the lines: tweak the operating parameters of the large generators just slightly, and a previously stable grid would suddenly run away.
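For readers who want to see a stability boundary for themselves, the sketch below integrates the textbook swing equation for a single generator tied to a very large grid. It is not Kopell and Washburn's three-machine model, and it does not exhibit chaos; it only shows how a small change (here, how long a simulated fault lasts) can separate recovery from runaway. Every number in it, along with the function names, is an illustrative assumption.

```python
import math


def simulate_swing(clear_time_s, p_mech=0.8, p_max=1.8, inertia=0.1,
                   damping=0.0, dt=0.001, t_end=5.0):
    """Integrate M * d2(delta)/dt2 = P_mech - P_elec - D * d(delta)/dt.

    All parameter values are assumed, per-unit style numbers chosen for
    illustration. Returns the largest rotor angle (radians) reached.
    """
    delta = math.asin(p_mech / p_max)  # pre-fault equilibrium angle
    omega = 0.0                        # speed deviation, rad/s
    max_delta = delta
    for step in range(int(t_end / dt)):
        t = step * dt
        # While the fault is on, the generator cannot export power.
        p_elec = 0.0 if t < clear_time_s else p_max * math.sin(delta)
        accel = (p_mech - p_elec - damping * omega) / inertia
        omega += accel * dt   # semi-implicit Euler: update speed first,
        delta += omega * dt   # then use the new speed to update the angle
        max_delta = max(max_delta, delta)
    return max_delta


def stays_synchronized(clear_time_s):
    """True if the rotor angle never swings past pi radians (180 degrees)."""
    return simulate_swing(clear_time_s) < math.pi


for clear_time in (0.3, 0.7):
    print(clear_time, stays_synchronized(clear_time))
```

With these assumed values, clearing the fault after 0.3 second lets the generator swing back toward equilibrium, while waiting 0.7 second lets its angle run away.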
By the early 1990s, power systems experts were exploiting the techniques and discovering chaotic behavior in more complex models. Power systems expert James Thorp, an engineering professor at Cornell University, plotted the results from models with dozens of generators and lines, producing fractal patterns that are the hallmark of chaos mathematics [see fractal image, "Random Patterns"].
Random Patterns:
Graphical models of power grids show sharp boundaries between areas in which a system arrives at a stable equilibrium and those in which it becomes unstable or runs away. In the computer representation shown here, which was done by Cornell University's James Thorp, the interactions of two generators are mapped, each point representing the angle by which each generator is out of phase compared with a reference generator. Light blue areas represent grid stability, while the dark red, light red, and purple areas show the grid to be unstable or vulnerable to collapse. Small changes in the state of either generator can produce large and unpredictable changes in the grid's stability.
Yet these models still seemed too simplistic to be applicable to real-life power grid situations.
"The fact that you see transient chaos was enough to convince people that the power system is much more complicated than we might have imagined, but there was not an obvious connection to blackouts," says Thorp.
The connection between chaos and blackouts began to tighten when researchers started to work with actual blackout data. In the mid-1990s, Doyle, at Caltech, began to mine data on blackouts that had been collected since 1984 by the North American Electric Reliability Council, the organization in Princeton, N.J., that promotes voluntary standards for the electric power industry. A team consisting of Benjamin A. Carreras, an expert in chaos theory at Oak Ridge National Laboratory; David Newman, now a professor of physics at the University of Alaska; and Ian Dobson, a University of Wisconsin professor of electrical and computer engineering and an expert on chaos and power grids, stumbled on the same data in 1997.
What Doyle and the Carreras-Newman-Dobson group found amazed them. Plotting the logs of the frequency of blackouts versus their magnitude, they observed that the frequency of large blackouts was much higher than they expected. Rather than falling off sharply to fit the bell curve produced by a Gaussian, or normal, distribution, the frequency of blackouts fell off much more slowly. The curve fit what is called a power law--which refers not to the power in a circuit but to the fact that the probability of a blackout is related to its magnitude by some constant exponent.
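The difference between the two kinds of tail is easy to underestimate, so here is a minimal sketch that compares them. The 500-MW scale and the exponent of 1.2 are assumptions picked for illustration; they are not values fitted to the reliability council's data.

```python
import math


def exponential_tail(size_mw, scale_mw=500.0):
    """P(blackout >= size_mw) if sizes followed an exponential law (assumed scale)."""
    return math.exp(-size_mw / scale_mw)


def power_law_tail(size_mw, min_mw=500.0, exponent=1.2):
    """P(blackout >= size_mw) under a power law above min_mw (assumed exponent)."""
    return (size_mw / min_mw) ** (-exponent)


def loglog_slope(tail, x1, x2):
    """Slope of the tail curve between two sizes on log-log axes."""
    return (math.log(tail(x2)) - math.log(tail(x1))) / (math.log(x2) - math.log(x1))


for size in (1000, 4000, 16000):
    print(size, exponential_tail(size), power_law_tail(size))

# On log-log axes the power law is a straight line: the slope is the same
# wherever it is measured, and it equals minus the exponent.
print(loglog_slope(power_law_tail, 1000, 4000))
print(loglog_slope(power_law_tail, 4000, 16000))
```

The exponential curve collapses toward zero within a few multiples of its scale, while the power-law curve keeps a constant slope, which is why the biggest events stay on the map.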
The result excited the system-dynamics and chaos experts because such power-law frequency distributions are a signature of complex, chaotic systems in which the interplay of the components leads to surprising outcomes. Other examples of complicated events that seem to occur with similar regularity are earthquakes, forest fires, and dam failures.
Systems analysts think they know something about the dynamics that lead to such events; so the discovery of a similar probability distribution gave them hope that they could learn a thing or two about blackouts. "We said there must be something about the way the grid is managed that makes all these points want to be on a line," says Carreras. "They are not jumping around. It's as if there is a physical law there."
One thing they knew for sure was that phenomena that fit such distributions tend to occur with remarkable consistency. Devastating earthquakes may be hard to predict, but we know when one is overdue. So when the 14 August blackout struck, the systems theorists raced to their plots to see if this additional piece of data fit the pattern.
Thorp went straight back to his office when the lights came back on at Cornell in upstate New York, took one of Doyle's plots, and extended the curve farther out to the right, from blackouts affecting millions of customers to blackouts affecting tens of millions. The curve predicted that an outage of the scope seen a year ago should occur, on average, every 35 years. The result was chilling, for it had been 38 years since the last cascading outage on the Eastern Interconnection (the transmission system connecting the eastern U.S. seaboard, the Plains states, and the eastern Canadian provinces). That outage, on 9 November 1965, blacked out 30 million people in the northeastern United States and Canada.
For systems theorists like Doyle and Carreras, the first message of their eerily smooth distribution curves is clear: big blackouts are a natural product of the power grid. The culprits that get blamed for each blackout--lax tree trimming, operators who make bad decisions--are actors in a bigger drama, their failings mere triggers for disasters that in some strange ways are predestined. In this systems-level view, massive blackouts are just as inevitable as the megaquake that will one day level much of Tokyo. Just the same, accounting for that inevitability is a contentious exercise.
To date, Carreras, Dobson, and Newman's explanation for the curves--the feedback model--is the most vivid and, arguably, the most sophisticated. Computer simulations to test this model track as many as 400 power lines and 30 or so generators and run for the equivalent of 250 years. The results are uncannily similar to the historical record.
Carreras and his colleagues were inspired by a simple physical system: the growth of sand piles. In the 1990s, physicists studying sand piles mathematically modeled a phenomenon long noticed by children playing on beaches. As you keep piling on sand, a part suddenly begins to collapse, and when you try to fix the castle by piling on more sand, one side suddenly gives way. Seen mathematically, the pile has reached a critical point where its behavior has become chaotic; avalanches become frequent, and their magnitude fits a power-law curve.
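A sand pile of this kind can be sketched in a few lines. The version below uses the widely known Bak-Tang-Wiesenfeld toppling rule on a square grid; that this is the specific model the physicists or Carreras's group had in mind is an assumption, as are the grid size, grain count, and function names.

```python
import random


def relax(grid):
    """Topple every cell holding four or more grains until the pile is stable.

    A toppling cell sends one grain to each of its four neighbors; grains
    pushed past the edge are lost. Returns the avalanche size, counted as
    the number of topplings.
    """
    n = len(grid)
    topplings = 0
    unstable = [(r, c) for r in range(n) for c in range(n) if grid[r][c] >= 4]
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < 4:
            continue  # already relaxed by an earlier visit
        grid[r][c] -= 4
        topplings += 1
        if grid[r][c] >= 4:
            unstable.append((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < n and 0 <= nc < n:
                grid[nr][nc] += 1
                if grid[nr][nc] >= 4:
                    unstable.append((nr, nc))
    return topplings


def run_sandpile(n=15, grains=5000, seed=1):
    """Drop grains one at a time at random sites; record each avalanche size."""
    rng = random.Random(seed)
    grid = [[0] * n for _ in range(n)]
    sizes = []
    for _ in range(grains):
        grid[rng.randrange(n)][rng.randrange(n)] += 1
        sizes.append(relax(grid))
    return grid, sizes


grid, sizes = run_sandpile()
print("largest avalanche:", max(sizes))
print("grains that caused no toppling:", sizes.count(0))
```

Most grains land without incident, a few set off avalanches that sweep much of the grid, and nothing about the triggering grain distinguishes the two cases.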
Carreras, Dobson, and Newman wondered if power grids might approach the same kind of critical points as elements are added and power flows increase. They imagined that economic forces and engineering practices seeking to minimize costs and maximize returns on investment in transmission equipment could push system operators to accept higher and higher power levels on their systems, setting the system up for a fall. Feedback from angry politicians and customers would then prompt improvements in the grid, such as construction of additional lines, replacement of faulty relays, or distributed deployment of generators. The short-term result, of course, is to take the system out of its precarious state. But by increasing the system's stability, the improvements would also initiate another cycle of loading.
"You go up near criticality and then you back off a bit because you experience blackouts," explains Dobson. "It's the right thing to do, but the effect is to increase the capability of the system relative to the loading." Since the forces that squeeze more power onto the lines are still present--the pressure to minimize costs and maximize returns--the system is destined to run back to criticality.
To test this theory, Dobson and his colleagues took a standard electric power flow model--the sort employed by system planners--and set it in motion, using workstations for the simulation. First, they programmed the model to boost the total load on the lines by 2 percent per year (the North American average) and recalculate the resulting power flows daily. Next, they told the system to knock out a line occasionally, simulating the lightning strikes and other random events that afflict real power lines. In some cases, the recalculated flows would overload neighboring lines, simulating a cascading failure. Finally, they stipulated in the design that every time a blackout occurs, the model "upgrades" the lines involved by boosting their rated capacity.
The resulting distribution of blackouts is statistically equivalent to the post-1984 blackout data collected by the North American Electric Reliability Council. "The system itself finds its own equilibrium near criticality," says Dobson.
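The loop of load growth, random outages, cascades, and upgrades can be caricatured in a short program. The sketch below is a toy, not the researchers' model: it replaces the power flow calculation with the crude assumption that a failed line's load is shared equally among the surviving lines, and the line count, margins, outage probability, and upgrade factor are all invented for illustration.

```python
import random


def simulate_feedback(n_lines=50, days=3650, growth_per_year=0.02,
                      trip_prob=0.01, upgrade_factor=1.1, seed=0):
    """Toy growth / outage / cascade / upgrade loop.

    Returns (blackout_sizes, capacity), where each blackout size is the
    number of lines lost in one event.
    """
    rng = random.Random(seed)
    load = [1.0] * n_lines       # flow on each line, arbitrary units (assumed)
    capacity = [1.5] * n_lines   # rated limit of each line (assumed)
    daily_growth = (1.0 + growth_per_year) ** (1.0 / 365.0)
    blackout_sizes = []
    for _ in range(days):
        # Step 1: demand creeps upward every day.
        load = [x * daily_growth for x in load]
        # Step 2: lines already over their rating trip, and now and then a
        # random event (lightning, say) knocks out one more.
        tripped = {i for i in range(n_lines) if load[i] > capacity[i]}
        if rng.random() < trip_prob:
            tripped.add(rng.randrange(n_lines))
        if not tripped:
            continue
        # Step 3: survivors pick up the lost load; any that overload trip too.
        while len(tripped) < n_lines:
            alive = [i for i in range(n_lines) if i not in tripped]
            extra = sum(load[i] for i in tripped) / len(alive)
            overloaded = [i for i in alive if load[i] + extra > capacity[i]]
            if not overloaded:
                break
            tripped.update(overloaded)
        blackout_sizes.append(len(tripped))
        # Step 4: the blackout prompts an upgrade of every line involved.
        for i in tripped:
            capacity[i] *= upgrade_factor
    return blackout_sizes, capacity


sizes, capacity = simulate_feedback(days=365 * 60)
print("events:", len(sizes), "largest:", max(sizes) if sizes else 0)
```

Even in this caricature, upgrades buy only temporary relief: growth keeps closing the gap between load and capacity, so the system drifts back toward the point where a single outage can cascade.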
Doyle couldn't disagree more. He says the notion of opposing forces pushing power grids into a critical state is so much hocus-pocus, the engineering equivalent of creationism. (Doyle also questions Carreras, Dobson, and Newman's statistical methodology--a disagreement he is pursuing as a peer reviewer on their papers.) Plus, Doyle's less-detailed optimization model for engineering failures can reproduce the historical distribution of large blackouts just as well as the feedback model (better if his arguments on statistical methodology win the day).
And yet even Doyle acknowledges that these two approaches send the same bottom-line message to system planners: major blackouts are a byproduct of a complex system and only fundamental change in the system can extinguish them.
If people like Doyle and Dobson sound cautionary about the prospects for blackout prevention, there is a third school of thought that is downright resigned. Its views have been articulated by a group at Carnegie Mellon University in Pittsburgh and its Electricity Industry Center. Its members include Sarosh N. Talukdar, a power engineer and electrical and computer engineering professor; Jay Apt, an engineering and public policy professor; and Lester B. Lave, a risk assessment expert and economics professor.
In a startling thought piece, "Cascading Failures: Survival Versus Prevention," published in The Electricity Journal in November 2003, the Carnegie Mellon team argues that if blackouts are as hard to predict and prevent as tsunamis and earthquakes, we should make it our business to be prepared. They argue that the question is not how to prevent blackouts, but how to survive them.
This pragmatic survival thesis begins with the assertion that complex systems--be they power grids or space shuttles--are prone to failure and well-intentioned efforts at prevention can backfire. In the feedback model, for instance, increasing the rating of individual power lines often increases the frequency of large cascading failures, much as the suppression of individual forest fires eventually leads to major conflagrations.
The Carnegie Mellon group argues that the problem with preventing grid failures runs even deeper. The real problem, they say, is the impossibility of testing a potential fix to confirm that it actually decreases the risk of failure. Crash-testing a grid the way one crash-tests a new car is obviously not an option. And the only alternative, simulation, is beyond the reach of current technology for a system as complex as the Eastern Interconnection--a system with thousands of generators and tens of thousands of power lines and transformers. Fully assessing just one contingency on the Eastern Interconnection means accounting for more than a billion constraints. Add nonlinear behavior of the sort Thorp models, and the differential equations become unsolvable. "You couldn't get a computer big enough on this planet to go do that," says Apt.
Some of the world's experts in power system dynamics and modeling acknowledge the problem. Experts in western North America, stung by the summer blackouts of 1996 that shut down grids from British Columbia to Mexico's Baja Peninsula, have done more to measure and simulate grid behavior than most. And yet their models regularly come up short, dangerously overestimating the Western Interconnection's ability to damp oscillations during a major outage. "Our simulations are not always realistic," concedes modeling expert Carson Taylor, principal engineer for transmission with the Bonneville Power Administration in Portland, Ore.
Instead of waiting for better dynamic models, the Carnegie Mellon group says that now is the time to begin accommodating blackouts, to do more to empower critical consumers and infrastructure to ride through them. "When you build stuff, it's going to break," says Apt. "The question is: what are the cost-effective things you can do to minimize the consequences?" His answer is: "A lot more than we're doing."
One cost-effective example identified by Apt and his colleagues is to equip traffic signals with energy-efficient light-emitting diodes backed up by batteries [see sidebar, "Better Backups for John Q. Public"]. Such gridlock-defying lights could eliminate a leading cause of death during blackouts while keeping emergency routes clear. And how about elevators that automatically ease down to the nearest floor upon losing power? "Our guess is that if you designed that [capability] into the elevator system originally, it would be all but free," says Lave.
The systems modelers see one more big benefit from greater preparedness: in the strange world of complex systems and unintended consequences, preparing for blackouts might just reduce the frequency of big ones. Carreras posits that utilities might be more willing to disconnect some customers deliberately, or "shed load," when the system is stressed if their customers were prepared for outages. According to the U.S.-Canada report, such load shedding would have confined the 14 August blackout to small patches of Ohio.
Carreras says that simply allowing more small blackouts could have the same effect. He points to the forest fire analogy, where hyperactive firefighting has enabled forests to age and accumulate fuel, laying the foundation for the major conflagrations that have become a summer staple in the western United States. In forest fire models, he says, the simulated firefighters can be programmed to be lazy, and the result is paradoxical: "You lose trees, but you never lose the whole forest," says Carreras.
Accepting the inevitability of blackouts is akin to accepting defeat for many power industry leaders. But considering the deliberate weakening of the power grid is downright treasonous. For the record, Carreras, who is employed by the U.S. Department of Energy, says he does not give advice to policymakers, certainly not about purposely weakening the grid. "Nobody wants to hear that," confides Carreras. "If I say that publicly, people will kill me." So it is not at all surprising that the authors of the U.S.-Canada task force report pay no heed to the possibility that their recommendations to strengthen the grid could have underwhelming impact or unintended consequences.
James W. Glotfelty, director of the U.S. Department of Energy's Office of Electric Transmission and Distribution and a key liaison between the technical and political players on the task force, is unapologetic. He dismisses all the studies that conclude large blackouts are not preventable. His view: "Trim your trees, train your operators, and ensure that your systems work, and the risk of a blackout is greatly reduced. Period." He similarly rejects the Carnegie Mellon team's argument that the limitations of modeling preclude our knowing how to prevent blackouts and that consumers and governments should therefore focus more resources on surviving them.
"If we have the intellectual and computing capability to model nuclear weapons, then we have the ability to do this, too," says Glotfelty. Clark W. Gellings, vice president for power delivery and marketing at the Electric Power Research Institute in Palo Alto, Calif., is equally dismissive of the systems theories. For example, he calls the comparison to firefighting "nonsense." At the same time, neither claims to have spent much time pondering these ideas. "They haven't hit the mainstream yet," says Gellings.
And yet Gellings agrees strongly with one of the ideas: that the grid needs fundamental change. "I agree with the conclusion that you have to change the basic operation of the grid to prevent blackouts." Many senior power engineers are frustrated by the current operation of the grid and are hatching ambitious plans for a major overhaul, he adds. The Electric Power Research Institute has championed the use of electronic power control devices that can massage and control ac power flows--a radical change from today's grid, where only the geography of supply and demand determines how electricity flows through the grid. Some advocate a wholesale shift toward the use of electronically controlled dc power lines to boost capacity for long-range power transfers and simultaneously act as "firebreaks" to contain disturbances cascading along ac power lines.
The problem with these visions for technological redesign is that large-scale investment in transmission is a fantasy in today's turbulent power industry. "If you were silly enough to think about investing in transmission, we would tell you that we don't have any idea how you're going to get reimbursed or how much you're going to get reimbursed," says Lave.
The more immediate problem may be the industry's underinvestment in R&D. It spends just 0.3 percent of revenues on R&D, one of the lowest rates for any industrial sector. "We're beat out easily by the pet food manufacturers," laments Dobson. The comparison between U.S. Department of Energy spending on nuclear weapons research and power system design is less flattering by a long shot.
The first step toward recovery is accepting that one has a problem. The U.S.-Canada report, for all its technical merit, pandered to a desire for quick fixes, perpetuating a sense of denial about blackouts. "I keep hearing claims that we are going to develop technologies to suppress all the blackouts and I find the whole position a bit laughable," says Carreras. "There may be no solution to all of our problems. We don't want to look at that."
Kopell, one of the mathematicians who first applied chaos theory to grid behavior, now directs a biodynamics center at Boston University, having previously won a MacArthur fellowship to study brain neurology. But she still thinks that the power industry and its political supporters need to take a longer view of blackout research and to think more deeply about the grid's propensity for nonintuitive behavior. Call it what you will--systems dynamics, chaos theory, or criticality analysis--Kopell says we're going to need more of it. As she put it, "This work won't immediately give an answer to the problem, but it certainly shows that simple thinking about it isn't adequate."
To Probe Further
"Final Report on the August 14, 2003, Blackout in the United States and Canada: Causes and Recommendations," U.S.-Canada Power System Outage Task Force, April 2004, U.S. Department of Energy, Washington, D.C.
For IEEE Spectrum's take on the August 2003 blackout, and a compendium of background materials from the magazine, go to: /aug03/3536.
For the views of the Carnegie Mellon team, see "Cascading Failures: Survival Versus Prevention," Sarosh N. Talukdar et al., The Electricity Journal, November 2003, pp. 25-31.