Not the First Time
By John McCormick | Posted 2004-03-04
Additional reporting by Berta Ramona Thayer in Panama
As software spreads from computers to the engines of automobiles to robots in factories to X-ray machines in hospitals, defects are no longer a problem to be managed. They have to be pred
That company could just as well be your company, whether you write software in small or large teams, and whether you operate domestically or in multiple nations in a rapidly globalizing economy. You are at risk if you place your product in conditions where human lives are at stake. Indeed, it's not the first time that software has been a suspect in a series of unexpected fatalities.
- In the mid-1980s, poor software design in another radiation machine, known as the Therac-25, contributed to the deaths of three cancer patients. The Therac-25 was built by Atomic Energy of Canada Ltd., which is a Crown corporation of the government of Canada. In 1988, the company incorporated and sold its radiation-systems assets under the Theratronics brand. There does not appear to have been any formal investigation of the Therac-25 accidents, but according to an in-depth examination by Nancy Leveson, now a professor at the Massachusetts Institute of Technology, and the accounts of other software experts, the design flaws included the inability of the software to handle some of the data it was given and the delivery of hard-to-decipher user messages. In a twist of fate, Theratronics, which was ultimately acquired by the Canadian life-sciences company MDS, manufactured the radiation-therapy machine used at the cancer institute in Panama.
- In February 1991, during Operation Desert Storm, an Iraqi SCUD missile hit a U.S. Army barracks in Saudi Arabia, killing 28 Americans. The approach of the SCUD should have been noticed by a Patriot missile battery. A subsequent government investigation, however, found a flaw in the Patriot's weapons-control software that prevented the system from properly tracking the missile. More recently, during Operation Iraqi Freedom, the Patriot missile system mistakenly downed a British Tornado fighter and, according to the Los Angeles Times and other reports, an American F/A-18C Hornet. The pilot in the single-seat Hornet and the two crew members aboard the British jet were killed. The incidents are still under investigation, but Pentagon sources familiar with the Hornet incident told the L.A. Times that investigators were looking at a glitch in the missile's radar system that made it incapable of properly distinguishing between a friendly plane and an enemy missile. Raytheon, the maker of the Patriot missile system, did not want to comment on the 1991 incident. It also said the government was still investigating the more recent incidents and that reports the software may be at fault were "off base."
- A software glitch was cited in a Dec. 11, 2000, crash of a U.S. Marine Corps Osprey tilt-rotor aircraft, in which all four Marines on board were killed. According to Marine Corps Maj. Gen. Martin Berndt, who presented the finding from a Judge Advocate General investigation, "the mishap resulted from a hydraulic-system failure compounded by a computer-software anomaly." A hydraulic line broke in one of the craft's two engine casings as the pods were being moved from airplane mode to helicopter mode in preparation for landing. When the flight-control computer realized the problem, it stopped the rotation of the engine pods. The pilots, trained to respond, tried to reset the pods by pressing the primary reset button, but the finding stated that a glitch caused "significant pitch and thrust changes in both prop rotors," which led to a stall. The plane crashed in a marsh. The craft is made by a partnership of Boeing and Bell Helicopter. A Boeing spokesman said changes were made in the software but referred requests for details about the software anomaly to the government.
A spokesman for the Navy's Air Systems Command, which investigated the incident, confirmed the software problem, but was not able to provide additional details.
Nor are these incidents likely to be the last. In 2002, the Food and Drug Administration (FDA), which oversees medical-device software, said that of 3,140 medical-device recalls conducted between 1992 and 1998, 242, or 7.7%, were attributed to software failures. The FDA also says the number of software-related recalls may be underreported because it's often hard to determine the exact cause of a problem in the immediate aftermath of an accident.
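For readers who want to check the arithmetic behind the FDA figure, here is a minimal sketch. The two counts come from the paragraph above; the function and variable names are illustrative only and are not part of any FDA tool or data set.

```python
# Figures quoted from the FDA in the paragraph above.
TOTAL_RECALLS = 3140      # medical-device recalls, 1992-1998
SOFTWARE_RECALLS = 242    # recalls attributed to software failures


def software_share(software: int, total: int) -> float:
    """Return the software-related share of recalls as a percentage.

    Hypothetical helper, written only to illustrate the calculation.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * software / total


share = software_share(SOFTWARE_RECALLS, TOTAL_RECALLS)
print(f"{share:.1f}% of recalls were attributed to software")
```

Rounded to one decimal place, the result matches the 7.7% the FDA reported, or roughly one recall in 13.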
There's also a financial cost to all organizations that use badly designed and deployed software. Poor-quality software costs U.S. businesses $59.9 billion annually, according to a 2002 report from the U.S. Commerce Department's National Institute of Standards and Technology (NIST). The NIST study looked not just at the cost of finding and fixing software problems, but also at costs incurred from lost retail transactions and manufacturing product delays.
Those losses are likely to mount as complex software programs are tied together across networks. Think of all the various pieces of corporate data that come together in systems for customer-relationship management, supply-chain management, or enterprise resource planning. There could be a hundred places where ERP software touches another corporate system, according to Irina Carrel, a senior manager at Mercury Interactive, a company that provides software-testing and -monitoring tools for corporations. And, because of previous bugs, computer-program anomalies or other factors, it's impossible to predict what exactly will happen when two pieces of code come into contact.
"Software is the most complicated thing that the human mind can come up with and build," says Gary McGraw, the chief technology officer at Cigital, a consultancy specializing in improving software quality. "Perfection is unobtainable." (See Cigital: Bug Zappers, A Dossier, p. 54, Baseline, March 2004)
The medical-device software market is becoming a particular area of concern. The FDA says about half of the 10,000 medical devices on the U.S. market are software-driven, including everything from pacemakers to infusion pumps to radiation-therapy machines. FDA watchers say many of the companies developing medical-device software are small. Because of the amount of research-and-development money that goes into medical devices, companies are under pressure to get products out the door.
"We will see more problems" in the medical-device field, says Alan Kusinitz, managing partner of SoftwareCPR, a consulting company that specializes in medical-device software. One of his biggest worries is the ever-increasing number of networked medical devices. Independently, software might function normally, but when connected to code in other machines, it may act unpredictably.
"It's the abnormal stuff that always shows up later in weird circumstances," Kusinitz says. "That's most often where safety problems occur."
There are defense and industrial efforts underway by organizations such as the Sustainable Computing Consortium (SCC) and the Software Engineering Institute, both located at Carnegie Mellon University, to foster programs and standards to reduce software defects. There are also organizations in the healthcare industry, such as the Association for the Advancement of Medical Instrumentation, that are trying to establish standards for software used in medical devices. In addition, new testing tools and services, such as Software Development Technologies' ReviewPro, which examine not just the code but the methodology behind the code, are starting to offer software professionals assistance in vetting their output as they create it. Also, code-writing practices such as "agile programming" emphasize breaking big projects into small pieces and getting early and repeated input from users before proceeding.