The massive outage that forced Delta to cancel thousands of flights this week showed how airlines face a relatively new vulnerability that could only get worse — an aging, complex web of computers that control every aspect of their operation.
When the systems fail, they can bring entire airlines such as Delta to a grinding halt — whether it's the ability to sell tickets or schedule aircraft crews.
Where once software was used primarily to book flights and issue tickets, today it's a matrix of overlapping, often disjointed systems that interact with mobile apps, track loyalty awards and help the airline industry bring in billions of dollars through the sale of perks like extra leg room. That growing complexity makes for hiccups, and they are difficult to avoid.
Some of the systems, such as Delta's, are built on foundations that are decades old. For instance, Delta’s reservation and passenger service system is multilayered, built on a 52-year-old program called Deltamatic.
“It’s a ‘Mad Men’-era computer system,” says Henry Harteveldt, an analyst with the travel industry company Atmosphere Research Group.
By Thursday, Delta had largely recovered from the outage that caused it to cancel or delay more than 4,000 flights, the biggest disruption to the airline's operations since the terror attacks of Sept. 11, 2001. The fact that Delta, known for its on-time reliability, took days to get back on track illustrates the intricacy at play when so many systems have to work together.
“They genuinely are a very reliable carrier, and this is a tremendous aberration which came as a surprise to everybody," says Daniel Baker, CEO of flight tracking site FlightAware. “I think it’s a combination of an extremely interconnected and complex system. And bad luck."
Delta’s technical meltdown was just the most recent to leave passengers in the lurch:
• Southwest. In July, Southwest's website was knocked offline, and the carrier canceled roughly 1,850 flights over three days after a router powered by old technology failed and alternate systems didn't properly kick in.
• JetBlue. Passengers encountered flight delays in January when there was a loss of power at a data center used by the airline.
• American. The airline cited connectivity issues when it briefly suspended flights last September at Miami, Chicago O'Hare and Dallas/Fort Worth airports.
• United. After incurring a series of glitches since merging with Continental in 2010, United temporarily grounded a large number of its flights in both June and July of last year because of technical difficulties.
Don't expect reliability to improve anytime soon.
“This is not the end to these sorts of problems," Baker says. “It’s not like the airline can say, ‘We’ll invest in this, and by Christmas we’ll guarantee reliability.’ These are multiyear endeavors ... (And) in general, the airlines are like a wristwatch. Every little piece has to work perfectly or it all falls apart."
In a videotaped message Aug. 9, Delta CEO Ed Bastian apologized for this week's disruption and said that “over the last three years, we’ve invested hundreds of millions of dollars in technology, infrastructure upgrades and systems, including backup systems to prevent what happened yesterday from occurring."
However, Bastian has since said that roughly 300 of Delta’s 7,000 servers were not linked to an alternate power source. When a faulty piece of power control equipment caught fire Monday, sparking a surge that knocked out power, servers that did have backup were unable to communicate with those that did not, taking down Delta's whole system.
"Our infrastructure is dated, no question," Bastian told The Atlanta Journal-Constitution, but "I don't think that was the problem."
Still, before this week’s incident, Delta had already brought on board a new executive to oversee its technology and help outline next steps.
The airline declined a request for an interview with Bastian.
While the basic foundation of many airline systems has been in use for decades, complexity, not age, is the real problem, says Lance Sherry, director of the Center for Air Transportation Systems Research at George Mason University.
“So many systems are layered on top of each other that we don’t always know who’s talking to whom,” Sherry said.
Sherry says airlines run multiple, intersecting systems which must flawlessly interact with each other. In the industry, there are at least six — ticketing reservation, aircraft assignment, flight crew scheduling, airport gate assignment, air traffic flow management and irregular operations systems.
Often they come from different vendors and use different software languages. And yet they must be synchronized, and timing is split-second and critical.
“It’s like a ballet, where the ballerina is thrown up by one dancer but another one has to catch her coming down,” Sherry says.
The three main players in the business — Sabre, Amadeus and Hewlett Packard — all are working on integrated systems, he says, but that process takes time and can be delayed by factors as routine as airline purchase cycles or as fundamental as a carrier’s financial health.
A systems upgrade would likely cost an airline at least $75 million, according to Harteveldt. But airlines have been flush with profits in recent years, thanks to plunging fuel costs and a disciplined matching of seats to passenger demand. Last year was the industry's biggest moneymaker at least since deregulation in 1978, with airlines reporting $25 billion in profits — monies that they poured, in part, into stock buybacks and dividends for investors.
Airline stocks have lagged so far this year. The S&P airline index is down 18.3% through Aug. 5 vs. a 6.8% increase for the S&P 500. But the airline industry index outperformed the S&P 500 on a five-year compound annual growth basis, increasing 26% as compared to 12.4%, according to Jim Corridore, an analyst with S&P Global Market Intelligence.
Even when airlines are willing to fork over the money to upgrade computers, there's another obstacle: the need to make improvements to systems that must remain running around the clock every day of the year. A company can’t just stop operations for four days while it installs a new system. Additionally, global regulatory requirements make a full-scale overhaul difficult.
Still, for stranded, frustrated passengers, it is hard to understand why airlines can’t install more efficient technology.
“It’s amazingly vexing, especially if you’re one of the passengers caught in this latest Delta fiasco,” says Charlie Leocha, president of the advocacy group Travelers United. It is “mind boggling that one switch in one room at Delta headquarters can shut down the entire ... system."
The industry should start plowing some of its profits into improvements that would fend off such disruptions, he says.
Airlines say they have been doing exactly that. United, which chose to adopt Continental's reservations system after the two carriers merged in 2010, will spend $500 million on technology this year, according to Luke Punzenberger, a United spokesman.
American Airlines spokeswoman Martha Thomas said the connectivity issue that led to a ground stop at three of its hubs last September and the delay or cancellation of 297 flights was "a one-time event" that was quickly identified and repaired. She added the carrier "is constantly upgrading and looking for ways to improve our systems and technology for customers and employees."
And Southwest, which had to cancel hundreds of flights in July after a router malfunctioned, "will continue updating, enhancing, replacing and/or modernizing our software systems and technologies," spokesman Dan Landson says.
The low-cost carrier, which currently relies on a separate reservations system for international bookings, is in the midst of transitioning to a single platform that should be operational by the end of next year. The cost is estimated to be roughly $500 million.
Other factors, beyond actual software, also need to be addressed by airlines to allow them to avoid, or more quickly recover from, technological glitches. Delta created a graphical interface to make the system more user-friendly and roughly a year ago stopped training customer service and reservation representatives on the older version. When Delta’s computer networks came back online this week, only the older Deltamatic interface was working at some sites. “So all the agents who couldn’t use Deltamatic were useless,” Harteveldt said.
Harteveldt adds that in 2009 Delta closed a second data center that could have provided a backup to the Atlanta center, which had problems this week. And the airline did not appear to have a business continuity plan.
“It’s inexcusable,” he says.