Examples of Soft CSPFs
These are not just academic issues. Here are some
practical examples of soft CSPFs:
(1) If an airplane's anti-icing system fails,
ice can build up, which causes the airplane to lose
lift, but only during the 5% or so of flights
that encounter atmospheric icing conditions.
(2) A short circuit can cause a spark in a fuel tank,
which can cause the tank to explode, but only if
the fuel-air mixture in the tank is in a flammable
(meaning primarily a rich enough) condition, which
may happen only 1% of the time.
(3) If an uncontained engine failure occurs on one
engine of a two-engine airplane, a fragment can damage
the other engine, but only if the fragment's direction
falls within a particular small (e.g., 3-degree) angle.
(4) All propulsive thrust can be lost if an airplane's
engines are starved of fuel, but that will
cause a catastrophic accident only if the airplane
is too far from a suitable airport for the crew
to glide to a successful landing. (Airplanes
have been able to do that successfully in a surprisingly
high percentage of cases. Nevertheless, to my knowledge
at least, that benign possibility has always been
neglected in probabilistic analyses.)
Many more examples of soft CSPFs could be cited. In
each case, only one primary material failure is involved,
but the probability of a resulting catastrophic accident
is significantly reduced by the necessity of either
an unfavorable condition or an additional failure
with a probability substantially below 100%.
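The arithmetic behind these examples can be sketched in a few lines of Python. This is only an illustration: the function name and the per-flight failure probability of one in a million are assumptions made for the example, not figures from any actual analysis. Only the 5% and 1% conditional probabilities come from examples (1) and (2) above.

```python
def catastrophe_probability(p_failure, p_condition_given_failure):
    """Return P(failure and condition) = P(failure) * P(condition | failure)."""
    for p in (p_failure, p_condition_given_failure):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_failure * p_condition_given_failure


# Hypothetical per-flight probability of the primary failure (illustration only).
P_FAILURE = 1e-6

# Conditional probabilities taken from examples (1) and (2) in the text.
conditions = {
    "anti-icing failure, icing encountered": 0.05,
    "fuel-tank spark, flammable mixture": 0.01,
}

for name, p_condition in conditions.items():
    p = catastrophe_probability(P_FAILURE, p_condition)
    print(f"{name}: {p:.1e} per flight")
```

With the assumed failure probability, the soft CSPF in example (1) comes out 20 times less likely to cause a catastrophe than a hard CSPF with the same failure probability, and the one in example (2) comes out 100 times less likely.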
Whether soft CSPFs should fall under the same prohibitions
as hard CSPFs depends primarily on the value of the
conditional probability (that is, the probability
given the failure of interest) of the additional event
or events required to cause a catastrophic accident.
There is no generally agreed-upon threshold at which
this probability should be assumed to be the worst-case
100%. The determination must depend on a judgment
about whether the conditional probability is sufficiently
lower than 100% to bother taking credit for the difference.
If the probability is 90%, it may not be, and the
failure of interest can be allowed to fall under the
CSPF prohibition. But if the probability is 1%, it
probably is worth trying to take credit for that,
arguing that the CSPF prohibition shouldn't apply,
and proceeding with a probabilistic risk analysis.
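One way to make that judgment concrete is to ask by what factor the estimated catastrophe probability drops when the conditional probability is credited instead of being taken as 100%. The Python sketch below does that. The function names and the cutoff of 10 are purely hypothetical, since no threshold is generally agreed upon.

```python
def credit_factor(p_condition_given_failure):
    """Factor by which the estimate drops relative to assuming 100%."""
    if not 0.0 < p_condition_given_failure <= 1.0:
        raise ValueError("conditional probability must lie in (0, 1]")
    return 1.0 / p_condition_given_failure


def worth_taking_credit(p_condition_given_failure, min_factor=10.0):
    """Apply a hypothetical cutoff; min_factor is an assumption, not a standard."""
    return credit_factor(p_condition_given_failure) >= min_factor


for p in (0.90, 0.01):
    print(f"P = {p:.0%}: factor {credit_factor(p):.1f}, "
          f"take credit: {worth_taking_credit(p)}")
```

At 90%, the credit is a factor of only about 1.1, hardly worth the argument. At 1%, it is a factor of 100, which can easily change the outcome of a probabilistic risk analysis.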
Incidentally, similar prohibitions have sometimes
been encouraged or imposed on so-called "latent"
failures, meaning failures that can remain undetected
for a long time. Nobody is saying that such failures
are a good thing, but to simply assume that all possible
undetectable failures already exist can also lead to extremely
pessimistic analysis results. As is the case for CSPFs,
sound general risk-analysis principles and criteria
should preclude the need for "special case"
assumptions for undetectable failures.
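To illustrate how pessimistic the "already exists" assumption can be, the following Python sketch compares it with a simple exposure-time estimate. It assumes a constant failure rate and a known time since the item was last verified to be working. Both the model and the numbers are assumptions chosen for illustration only.

```python
import math


def latent_failure_probability(failure_rate_per_hour, exposure_hours):
    """P(failed by end of exposure) = 1 - exp(-rate * time), constant-rate model."""
    if failure_rate_per_hour < 0 or exposure_hours < 0:
        raise ValueError("rate and exposure time must be non-negative")
    # -expm1(-x) equals 1 - exp(-x) and stays accurate when x is very small.
    return -math.expm1(-failure_rate_per_hour * exposure_hours)


# Hypothetical values: one failure per 100,000 hours, 1,000 hours since last check.
p_estimated = latent_failure_probability(1e-5, 1_000)
p_assumed = 1.0  # the "failure already exists" assumption

print(f"exposure-time estimate: {p_estimated:.4f}")
print(f"pessimism factor of assuming 1.0: {p_assumed / p_estimated:.0f}")
```

With these assumed values, the latent failure is present with a probability of about 1%, so simply assuming that it exists overstates its contribution by a factor of roughly 100.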
Conclusion
Prohibitions against catastrophic single-point failures
are not based upon either the probabilities of or
the expected losses caused by those failures being
essentially different from those of multiple failures.
Rather, those prohibitions are based upon greater
levels of uncertainty (usually in the minds of government
regulators and contracting agencies) about whether claimed
catastrophic single-point failure probabilities, and
the expected-loss estimates resulting from those probabilities,
are correct.
Ideally, the same general principles, including uncertainty
considerations, should be applied to single-point
failures (catastrophic or not) as to multiple-point
failures. However, that can come about only if risk
analysts and managers become more aware and respectful
of the uncertainty facet of risk. As a practical matter,
in the case of catastrophic single-point failures,
that will mean recognizing uncertainties as part of
the process of determining probabilities and
appropriately adjusting those probabilities upward
as uncertainties increase.
I have to say, however, that I suspect that the prohibition
against catastrophic single-point failures will be around for
a long time to come. Perhaps the best we will be able
to do in the near term will be to try to prevent it
from causing too many unrealistic and wasteful decisions.
About the Author
Ted Yellman has 43 years of experience in system
design, reliability, components, safety, regulatory
engineering, and engineering-assurance management
in several aerospace and electronics companies.
His primary interests are in the areas of risk
criteria, risk analysis, and risk management.
Yellman is the originator of Event-Sequence Analysis.
Offering an alternative to fault-tree and Markov-chain
analyses, this method makes it easier to understand
and account for events that may have a common cause
or that may cause one another.
Article References
1. Yellman, Ted W. "Failures and Related Topics,"
IEEE Transactions on Reliability, Vol. 48,
No. 1, March 1999, pp. 6-8.
2. System Design and Analysis. Federal Aviation
Administration Advisory Circular (AC) 25.1309-1A,
June 21, 1988.
3. System Design and Analysis. Draft for Advisory
Joint Material AC/AMJ No. 25.1309, November 23, 1997.
4. Code of Federal Regulations, Title 14 (Aeronautics
and Space), Part 25 (Airworthiness Standards: Transport
Category Airplanes), January 1, 2000 Revision.
5. Clemens, Pat L. "From Our Readers," Journal
of System Safety, Q3 2002, p. 5.
6. Yellman, Ted W. "The Three
Facets of Risk," SAE/AIAA 2000 World Aviation
Conference, San Diego, October 10-12, 2000.
7. Yellman, Ted W. "Redundancy Killers,"
Proceedings of the SAE 1998 Advances in Aviation Safety
Conference, April 6-8, 1998.