
Volume 42, No. 3 May-June 2006
Clif's Notes

A Short History of System Safety

From the beginning of mankind, safety seems to have been an inherent human genetic element or force. The Babylonian Code of Hammurabi states that if a house falls on its occupants and kills them, then the builder shall be put to death. The Bible established a set of rules for eating certain foods, primarily because these foods were not always safe to eat, given the sanitary conditions of the day. In 1943, the famous psychologist Abraham Maslow proposed a five-level hierarchy of basic human needs, and safety was number two on his list.

System safety is a specialized extension of our drive for safety. The system safety concept was not the invention of any one person; rather it was a call from the engineering community, contractors and the military to design and build safer systems and equipment by applying a formal proactive approach. This new safety philosophy involved using safety engineering technology, combined with lessons learned. It was an outgrowth of the general dissatisfaction with the fly-fix-fly approach to design (i.e., fix safety problems after a mishap has occurred) prevalent at that time. System safety as we know it today began as a grass-roots movement that was introduced in the 1940s, gained momentum during the 1950s, became established in the 1960s and formalized its place in the acquisition process during the 1970s.

The first formal presentation of system safety that I have been able to identify was by Amos L. Wood at the 14th Annual Meeting of the Institute of Aeronautical Sciences (IAS) in New York in January of 1946. In a paper titled, "The Organization of an Aircraft Manufacturer's Air Safety Program," Wood emphasized such new and revolutionary concepts as:

  • Continuous focus on safety in design
  • Advance and post-accident analysis
  • Safety education
  • Accident preventive design to minimize personnel errors
  • Statistical control of post-accident analysis

Wood's paper was referenced in another landmark safety paper by William I. Stieglitz titled, "Engineering for Safety," presented in September of 1946 at a special meeting of the IAS and finally printed in the IAS Aeronautical Engineering Review in February of 1948. Stieglitz's far-sighted views on system safety are evidenced by the following quotations from his paper:

"Safety must be designed and built into airplanes, just as are performance, stability, and structural integrity. A safety group must be just as important a part of a manufacturer's organization as a stress, aerodynamics, or a weights group...."

"Safety is a specialized subject just as are aerodynamics and structures. Every engineer cannot be expected to be thoroughly familiar with all developments in the field of safety any more than he can be expected to be an expert aerodynamicist."

"The evaluation of safety work in positive terms is extremely difficult. When an accident does not occur, it is impossible to prove that some particular design feature prevented it."

The need for system safety was often motivated by the analysis and recommendations resulting from accident investigations. For example, on May 22, 1958, the Army experienced a major accident at a NIKE-AJAX air defense site near Middletown, New Jersey, that resulted in extensive property damage and loss of lives of Army personnel. The review committee recommended that safety control through independent reviews and a balanced technical check be established, and that an authoritative safety organization be established to review missile weapon systems design. Based on these recommendations, a formal system safety organization was established at Redstone Arsenal in July of 1960, and AR 385-15, "System Safety" was published in 1963. The USS Oriskany explosives mishap on October 26, 1966, and the USS Forrestal explosives mishap on July 29, 1967, motivated new safety programs and concepts for Navy weapon systems. The Apollo 1 fire on January 27, 1967, initiated new system safety practices within NASA.

The Air Force was an early leader in the development of system safety. In 1950, the USAF Directorate of Flight Safety Research (DFSR) was formed at Norton Air Force Base in California. It was followed by the establishment of safety centers for the Navy in 1955 and the Army in 1957. In 1954, the DFSR began sponsoring Air Force-industry conferences to address safety issues of various aircraft subsystems by technical and safety specialists. In 1958, the first quantitative system safety analysis effort was undertaken on the Dyna-Soar X-20 manned space glider.

The early 1960s saw many new developments in system safety. In July of 1960, a system safety office was established at the USAF Ballistic Missile Division (BMD) in Inglewood, California. BMD facilitated both the pace and direction of system safety efforts when it published, in April of 1962, the first system-wide safety specification titled, BSD Exhibit 62-41, "System Safety Engineering: Military Specification for the Development of Air Force Ballistic Missiles." The Naval Aviation Safety Center was among the first to become active in promoting an inter-service system safety specification for aircraft, BSD Exhibit 62-82, modeled after BSD Exhibit 62-41. In the fall of 1962, the Minuteman program director, in another system safety first, identified system safety as a contract deliverable item in accordance with BSD Exhibit 62-82.

The first formal system safety program plan (SSPP) for an active program was developed by the Boeing Company in December of 1960 for the Minuteman program. The first military specification for safety design requirements, MIL-S-23069, "Safety Requirements, Minimum, Air Launched Guided Missiles," was issued by the Bureau of Naval Weapons on October 31, 1961.

In 1963, the Aerospace System Safety Society (now the international System Safety Society) was founded in the Los Angeles area. In 1964, the University of Southern California's Aerospace Safety Division began a Master's degree program in Aerospace Operations Management, from which specific system safety graduate courses were developed. In 1965, the University of Washington and the Boeing Company jointly held the first official System Safety Conference in Seattle, Washington. By this time, system safety had become fully recognized and institutionalized.

Presently, the primary reference for system safety is MIL-STD-882, which was developed for DoD systems. It evolved from BSD Exhibit 62-41 and MIL-S-38130. BSD Exhibit 62-41 was initially published in April of 1962 and again in October of 1962; it first introduced the basic principles of safety, but was narrow in scope. The document applied only to ballistic missile systems, and its procedures were limited to the conceptual and development phases "from initial design to and including installation or assembly and checkout." However, for the most part, BSD Exhibit 62-41 was very thorough. It defined requirements for systematic analysis and classification of hazards, as well as the design safety precedence used today. In addition to engineering requirements, BSD Exhibit 62-41 also identified the importance of management techniques to control the system safety effort. The use of a system safety engineering plan and the concept that managerial and technical procedures used by the contractor were subject to approval by the procuring authority were two key elements in defining these management techniques.

In September of 1963, the USAF released MIL-S-38130. This specification broadened the scope of the system safety effort to include "aeronautical, missile, space, and electronic systems." This broader range of applicable systems, and the concept's growth into a formal military specification (Mil-Spec), were important elements in the development of system safety during this phase of its evolution. Additionally, MIL-S-38130 refined the definitions of hazard analysis. These refinements included system safety analyses: system integration safety analyses, system failure mode analyses and operational safety analyses. These analyses still resulted in the same classification of hazards, but the procuring activity was given specific direction to address catastrophic and critical hazards.

In June of 1966, MIL-S-38130 was revised. Revision A to the specification once again expanded the scope of the system safety program by adding a system modernization and retrofit phase to the conceptual phase definition. This revision further refined the objectives of a system safety program by introducing the concept of "maximum safety consistent with operational requirements." On the engineering side, MIL-S-38130A also added another safety analysis: the Gross Hazard Study (now known as the Preliminary Hazard Analysis). This comprehensive qualitative hazard analysis was an attempt to focus attention on safety requirements early in the concept phase, and it marked a break from the earlier, more mathematically oriented analyses. But the changes were not limited to introducing new analyses; the scope of existing analyses was expanded as well. One example was the operational safety analyses, which would now also include system transportation and logistics support requirements. The engineering changes were not the only significant changes in this revision. Management considerations were highlighted by emphasizing management's responsibility to define the functional relationships and lines of authority required to "assure optimum safety and to preclude the degradation of inherent safety." This was the beginning of a clear focus on management control of the system safety program.

MIL-S-38130A served the USAF well, allowing the Minuteman program to continue to prove the worth of the system safety concept. By August of 1967, a tri-service review of MIL-S-38130A was underway to propose a new standard that would clarify and formalize the existing specification, as well as provide additional guidance to industry. By changing the specification to a standard, there would be increased program emphasis and accountability, resulting in improved industry response to system safety program requirements. Specific objectives of this rewrite included obtaining a system safety engineering plan early in the contract definition phase and maintaining a comprehensive hazard analysis throughout the system's life-cycle.

In July of 1969, MIL-STD-882 was published; it was titled, "System Safety Program for Systems and Associated Subsystems and Equipment: Requirements For." This landmark document continued the emphasis on management and expanded the scope of system safety to apply to all military services in the DoD. The full life-cycle approach to system safety was also introduced at this time. The expansion in scope required a re-working of the system safety requirements. The result was a phase-oriented program that tied safety program requirements to the various phases of program development. This approach to program requirements was a marked contrast to earlier guidance, and the detail provided to the contractor was greatly expanded. Since MIL-STD-882 applied even to small programs, the concept of tailoring was introduced, allowing the procuring authority some latitude in relieving the burden of the increased number and scope of hazard analyses. Since its advent, MIL-STD-882 has been the primary reference document for system safety.

The basic version of MIL-STD-882 lasted until June of 1977, when MIL-STD-882A was released. The major contribution of MIL-STD-882A centered on the concept of risk acceptance as a criterion for system safety programs. This evolution required the introduction of hazard probability and established categories for frequency of occurrence to accompany the long-standing hazard severity categories. In addition to these engineering developments, the management side was also affected: the responsibilities of the managing activity became more specific as more emphasis was placed on contract definition.

In March of 1984, MIL-STD-882B was published; it contained a major reorganization of the A version. Again, the evolution of detailed guidance in both engineering and management requirements was evident. The task of sorting through these requirements was becoming complex, so the discussion of tailoring and risk acceptance was expanded. More emphasis was placed on facilities and off-the-shelf acquisition, and software was addressed in some detail for the first time. The addition of Notice 1 to MIL-STD-882B in July of 1987 expanded the software tasks and the scope of system safety's treatment of software.

In January of 1993, MIL-STD-882C was published. Its major change was to integrate the hardware and software system safety efforts. The individual software tasks were removed so that the safety analyses would address a system's hardware and software together. In January of 1996, Notice 1 was published to correct some errors and to revise the Data Item Descriptions for more universal usage.

In the mid-1990s, the DoD acquisition reform movement began, along with the Military Specifications and Standards Reform (MSSR) initiative. These two movements led to the creation of a standard practice for system safety in MIL-STD-882D, released in February of 2000. Under acquisition reform, program managers are to specify system performance requirements and leave the specific design details to the contractor. In addition, the use of military specifications and standards is to be kept to a minimum, and only performance-oriented military documents are permitted. Other documents, such as commercial item descriptions and industry standards, are to be used for program details. MIL-STD-882 was considered important enough to be allowed to continue, as long as it was converted to a performance-oriented military standard. Until MIL-STD-882D was published, the DoD standardization community continued to allow the use of MIL-STD-882C, but Department of the Navy (DON) program managers needed a DON waiver to use it; a contractor, by contrast, could use the standard freely without any waiver. Once MIL-STD-882D was published as a DoD Standard Practice in February of 2000, its use no longer required a waiver.


Fault Tree Analysis History

In 1961, H. A. Watson and A. B. Mearns of Bell Laboratories conceived the Fault Tree Analysis (FTA) concept while performing a safety study of the Minuteman Launch Control System for the U.S. Air Force. The purpose of their study was to demonstrate a safe launch control system design; the fault tree evolved as the methodology for accomplishing that objective. Dave Haasl, then at the Boeing Company on the Minuteman program, recognized the value of FTA as an overall system safety tool; he led a team that applied FTA to the entire Minuteman Missile System. The Minuteman program used FTA to evaluate such undesired events as inadvertent programmed launch and inadvertent motor ignition, to demonstrate quantitatively that the design provided acceptable risk levels for these potential mishaps.
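
To give a feel for what such a quantitative evaluation involves, here is a minimal sketch in Python that computes a top-event probability for a small hypothetical fault tree. The gate structure, event names and probability values are illustrative assumptions only, not figures from the Minuteman analysis, and the basic events are assumed to be independent.

    # Minimal fault tree quantification sketch.
    # Hypothetical tree and probabilities -- illustrative only.
    from functools import reduce

    def and_gate(probs):
        # All inputs must fail (independent events): multiply the probabilities.
        return reduce(lambda a, b: a * b, probs, 1.0)

    def or_gate(probs):
        # At least one input fails: 1 minus the probability that none fail.
        return 1.0 - reduce(lambda a, b: a * (1.0 - b), probs, 1.0)

    # Hypothetical basic event probabilities
    p_relay_contacts_weld = 1.0e-3   # E1
    p_switch_shorts       = 5.0e-4   # E2
    p_spurious_command    = 2.0e-6   # E3

    # TOP = (E1 AND E2) OR E3, an undesired event such as an inadvertent launch signal
    g1  = and_gate([p_relay_contacts_weld, p_switch_shorts])   # 5.0e-7
    top = or_gate([g1, p_spurious_command])                    # ~2.5e-6

    print(f"Top event probability: {top:.2e}")

The computed top-event probability would then be compared against the acceptable risk level established for that undesired event.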

The commercial aircraft division of Boeing saw the results from the Minuteman program and quickly began using FTA during the design of commercial aircraft. In 1965, Boeing and the University of Washington sponsored the first System Safety Conference. At this conference, the first-ever papers were presented on FTA, marking the beginning of worldwide interest in the subject.

In 1966, Boeing developed a computer fault tree simulation program called BACSIM (Boeing Aerospace Corporation Simulation) for the evaluation of multi-phase fault trees. BACSIM could handle up to 12 operational phases and included the capability for repair and K-factor adjustment of failure rates. The BACSIM code was developed by Bob Schroeder and Phyllis Nagel of Boeing. Bob Schroeder also developed a computer code that plotted fault trees on a CalComp 26-inch-wide roll plotter. Both programs ran on an IBM 370 mainframe. These were specialized Boeing in-house programs that few people outside the company were aware of.

Following the lead of the aerospace industry, the nuclear power industry discovered the virtues and benefits of FTA and began using the tool in the design and development of nuclear power plants. Many key individuals in the nuclear power industry contributed to advancing fault tree theory, algorithms and computer codes. In fact, the nuclear power industry may have contributed more to the development of FTA than any other user group. Many new evaluation algorithms were developed, along with software implementing them, such as MOCUS, PREP/KITT, SETS, FTAP, IMPORTANCE and COMCAN.

FTA has also been applied in the chemical process industry, the auto industry, rail transportation, launch vehicles and spacecraft, the aerospace industry and the robotics industry, to name just a few. There are probably many other industries and disciplines using FTA that are not mentioned here. One of the more recent important developments in FTA has been the arrival of commercial fault tree construction and evaluation software that runs on personal desktop computers. This has provided great flexibility and utility for the safety analyst.


Software Safety History

The software safety aspect of system safety began in the early 1970s. In 1974, I performed my first software safety analysis at Boeing on the B-1A Offensive Avionics (Weapon) System. The analysis was a Contract Data Requirements List (CDRL) item contracted for by the Air Force.

The earliest citations I have located for anything resembling software safety include the following:

  1. "The Role of System Safety in Software," R. T. LeBon & T. L. Fagan, AIAA Aerospace Computer Systems Conference, September 1969, pages 1-5.
  2. "Hazard Analysis for Software Systems," O. C. Lindsey, Third International System Safety Conference, October 1977, pages 907-918.

The first formal technical papers on software safety were presented at the Fifth International System Safety Conference in 1981. It was at this point that software safety became an officially recognized term relating to the discipline and the methodology. The software safety papers at this conference included:

  1. "Software System Safety," E. S. Dean Jr., Fifth International System Safety Conference, July 1981, pages A1-A8.
  2. "Software and System Safety," C. A. Ericson II, Fifth International System Safety Conference, July 1981, pages B1-B11.
  3. "A Method of Software Safety Analysis," J. G. Grigg II, Fifth International System Safety Conference, July 1981, pages D1-D18.
  4. "Software Safety from a Software Viewpoint," N. G. Leveson, Fifth International System Safety Conference, July 1981, pages E1-E20.

Since 1981, software safety has become a burgeoning field, replete with a multitude of technical papers, articles, guidance documents and books on the subject. Dr. Nancy Leveson later became a subject-matter expert on software safety, and her research while at the University of Washington helped promulgate the software safety concept through the many well-articulated technical articles she produced.

Much of this historical background was compiled from the following sources:

  1. Air Force System Safety Handbook, 2000.
  2. MIL-HDBK-764, System Safety Engineering Design Guide for Army Materiel, 1990.
  3. NAVSEA SW020-AH-SAF-010, Weapon System Safety Guidelines Handbook, 2005.
  4. NAVORD OD 44942, Weapon System Safety Guidelines Handbook, 1973.
  5. "Fault Tree Analysis - A History," by Clifton A. Ericson II, Proceedings of the 17th International System Safety Conference, 1999, pages 87-96.

Please send me your comments and let me know if you agree or disagree with this historical background. There is probably much important history that can be added to the knowledge presented here.

Regards,
Clif


Copyright © 2006 by Clifton A. Ericson II. All rights reserved.