Subjectivity in Hazard Analysis
by Felix Redmill, London, U.K.
Likelihood
Likelihood may be determined quantitatively (as
a probability) or qualitatively. Qualitative analysis
is by definition approximate, but quantitative
analysis is often assumed to be wholly objective.
Yet there is considerable subjectivity in the
analysis process. In spite of the appearance of
accuracy, quantitative analysis is subject to
assumptions that are not always made explicit.
One is that of randomness. Yet, in many modern
systems, and particularly those in which software
and humans are involved, the assumption is invalid.
Unlike mechanical and electromechanical components,
software does not wear out. Its failures are not
random but systematic and would be repeated if
the triggering circumstances recurred, so even
a history of use and failure is not a firm basis
for prediction. Forecasts of the rate of software
failure should only be made with great care.
Similarly, the behavior of humans cannot be assumed
to be random. Yet many models for human reliability
assessment (HRA) are probabilistic. However, although
probabilities are derived, the approach taken
in most cases is based principally on human judgment.
The results are at best reasonable approximations,
and at worst wild guesses, but they always include
considerable subjectivity.
Even for hardware, whose failure may indeed be
random, it is questionable to what extent the
derived probabilities are truly representative.
Some are of incredible magnitude and, though stated
to several decimal places and based on elaborate
calculations, may contain huge errors because
of the omission of failure modes from the model
on which the calculations are based. Results then
are representative of the model but not of reality,
as was highlighted by the failure of the British
nuclear submarine Tireless in 2000.
The problem was a crack in a cooling-system pipe.
Such a crack had not been considered in the probabilistic
analysis of submarine failure, and (perhaps consequently)
the pipe was never checked during maintenance.
By implication, the occurrence of such a crack
was considered to be incredible. Yet, when the
crack occurred on Tireless, and checks of the
other submarines in the fleet were made, seven
of the twelve were found to have indications of
similar cracking, and the other five were not
wholly exonerated [Ref. 4]. The calculated probability
of failure was off by several orders of magnitude,
with huge detrimental implications for national
defense. Subjective assumptions and omissions
through human judgment or negligence can have
enormous implications for the accuracy and relevance
of probabilistic calculations. Failure or accident
may occur for reasons not considered in the analysis.
Precision is not the same as accuracy and should
not be assumed to imply it.
Another issue is statistical inference, the reliance
on historic data for the prediction of future
events. Some analyses rely on inadequate or inappropriate
data and are therefore wrong. But even when the
data is statistically valid, it is important for
the conditions in which the data is to be applied
to be the same as (or adequately similar to) those
in which it was collected. Further, historical
results often rest on crucial conditions that
are unnoticed, unrecorded or unrepeatable, so
using them as the basis of prediction means applying
subjective judgment. Suppose, for example, that
a component of risk in system operation is operator
error. Suppose that a company's plan to reduce
its operating budget involves the recruitment
of less qualified staff and a cutback on training.
The error frequency prior to the cutback would
be expected to be lower than after it, so the earlier
figure would not be a valid predictor of future performance
because of the change in conditions. Past history
is an unreliable guide to the future if consistency
of conditions is not guaranteed.
Even when the historic data is extensive and the
conditions remain constant, the past frequency
only tends to become an accurate predictor (in
theory) as the time of observation tends towards
infinity. This is the law of large numbers. Now,
what may be taken as a reasonable approximation
of infinity depends on the application, and in
some cases may be quite small, but care is required
in making assumptions about the predictive value
of historic data. Many risk-takers have suffered
huge losses in casinos because they have implicitly
assumed that the law of large numbers applies
to small numbers.
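The danger of small samples can be illustrated with a minimal sketch. It assumes independent trials with a constant failure probability (the very assumption questioned above for software and humans), and the function name and the value of p_true are hypothetical, chosen only for illustration. It computes the chance that a period of observation shows no failure at all, which would tempt an analyst to record a frequency of zero.

```python
def prob_no_failures(p, n):
    """Chance of observing zero failures in n independent trials,
    each with the same failure probability p (an assumption)."""
    return (1.0 - p) ** n


p_true = 1e-3  # hypothetical "true" failure probability per trial

for n in (100, 1_000, 10_000):
    # A short history will usually show no failures at all,
    # even though the underlying risk is far from zero.
    print(n, prob_no_failures(p_true, n))
```

Under these assumptions, a history of 100 trials is more likely than not to be failure-free, so its observed frequency says little about the underlying rate.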
When historic data is not available, as in the
case of new technologies and products, mathematical
models are often devised for assessing probabilities,
and these carry assumptions. For example, if a
new drug is found to (or not to) induce cancer
in mice but cannot be tested in large doses on
humans for ethical reasons, how valid is a projection
of its effect on mice to its effect on humans?
Assumptions must be made. And it is usual for
the judgments of experts to differ, for they depend
on the assumptions made about the relationship
between the observed effect and the administered
dose.
At a recent conference on risk, one paper presented
a mathematical model, based on probabilistic equations,
for determining the likelihood of certain hazardous
events. At the end of the talk, a delegate referred
to the presenter's statement that in some cases
there was very sparse data for input into the
equations, and he asked what was done in such
cases. The presenter dismissed the question. "That
is not a problem," he replied. "When
there is sparse data, we just employ an expert
to provide an opinion." The mathematician
neglected to wonder what the expert based his
opinion on if historic accident data did not exist.
A computing acronym, GIGO (garbage in, garbage
out), is also appropriate to mathematical risk
models, but the results of risk analyses are often
taken to be accurate and the assumptions and inaccuracies
in their derivation unrecorded and forgotten.
The preceding discussion has concerned quantitative
risk analysis. In qualitative analysis techniques,
subjectivity is an integral and obvious part of
the process.
The Use of Fault Trees
The previous two sections have shown that there
is considerable subjectivity in the estimation
of the two components of risk, consequence and
likelihood. But what about the techniques used
in combining them? In the previous article [Ref.
1], the subjectivity in bottom-up techniques,
such as HAZOP, was highlighted. In this section,
fault tree analysis, a top-down technique, is
examined.
A fault tree is a cause-and-effect network. It
starts with a final hazardous event and works
backwards logically to ultimate causes. The ways
in which events combine to cause higher-level
events are represented in the tree by AND and
OR logical units, or gates (see Figure 1). If
the occurrence of any of a number of events would
cause a higher-level event, the causal events
are combined by an OR gate; if a given result
would require the occurrence of two or more events,
these are combined by an AND gate. When probabilities
are attached to the tree, their combinations can
be calculated all the way up to the top level.
Those combined by an OR gate are added, and those
combined by an AND gate are multiplied.
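The gate arithmetic just described can be sketched as follows. This is an illustration only: the tree, the event names and the probabilities are hypothetical, and all events are assumed to be independent. Simple addition at an OR gate is an approximation that holds well when probabilities are small; the exact value for independent events is shown alongside for comparison.

```python
from math import prod


def and_gate(probs):
    """All input events must occur: multiply (assumes independence)."""
    return prod(probs)


def or_gate_sum(probs):
    """Any input event suffices: simple addition, as described in the text.
    This is an approximation that is close when probabilities are small."""
    return sum(probs)


def or_gate_exact(probs):
    """Exact OR for independent events: 1 - P(no input event occurs)."""
    return 1.0 - prod(1.0 - p for p in probs)


# Hypothetical tree: TOP = (pump_a AND pump_b) OR valve
pump_a, pump_b, valve = 1e-2, 1e-2, 1e-5

both_pumps = and_gate([pump_a, pump_b])
top_sum = or_gate_sum([both_pumps, valve])
top_exact = or_gate_exact([both_pumps, valve])

print(both_pumps, top_sum, top_exact)
```

The two OR results differ only slightly here, but the addition rule overstates the result as probabilities grow, and both rules depend on the independence assumption.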

Fault tree analysis was developed in the field of reliability
theory, where, typically, systems comprise known
and identifiable subsystems and components. In
such cases, a fault tree may represent a system
directly. Traditionally, too, components were
predominantly mechanical or electromechanical,
and their fault histories allowed probabilistic
determination of the top event. Fault tree analysis (FTA) is therefore
often thought of as an objective technique.
Attaching probabilities to fault trees assumes
not only randomness (already discussed above)
of individual events (e.g., component failures)
but also independence of events from common causes.
But independence may not apply, often for unrecognized
reasons such as the derivation of power from a
single supply or the control of several components
by a single operator.
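The effect of an unrecognized common cause on an AND gate can be shown with a small sketch. The model is a deliberate simplification and an assumption of this example, not a method from the article: a shared cause (say, loss of the single power supply) fails both components at once with probability p_common; otherwise they fail independently. All names and numbers are hypothetical.

```python
def and_gate_independent(p_a, p_b):
    """P(both fail) if the two failures are truly independent."""
    return p_a * p_b


def and_gate_with_common_cause(p_a, p_b, p_common):
    """P(both fail) when a shared cause fails both at once with
    probability p_common; otherwise they fail independently."""
    return p_common + (1.0 - p_common) * p_a * p_b


p_a = p_b = 1e-3   # hypothetical component failure probabilities
p_common = 1e-4    # hypothetical probability of losing the shared supply

naive = and_gate_independent(p_a, p_b)
realistic = and_gate_with_common_cause(p_a, p_b, p_common)

print(naive, realistic, realistic / naive)
```

With these figures, the common cause dominates: the tree that assumes independence understates the probability of joint failure by roughly two orders of magnitude.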
The extension of fault tree analysis to situations
of uncertainty, such as policy decisions and safety
scenarios, means that the trees are subjectively
constructed rather than modeled to reflect directly
the combinations of components in a system. This
introduces judgment as to what should be included
in the tree and the possibility of omissions because
of ignorance or error.
Fischhoff, Slovic and Lichtenstein [Ref. 5] have
shown that the construction and use of fault trees
are subject to variability. They point out that,
during construction, such questions arise as:
Which faults should be identified separately and
which should be lumped under "others"?
Which items should be grouped together? What sort
of graphic display should be used? What level
of detail is most appropriate? Answers depend
on a number of factors, including the purpose
of the analysis and how much of an effect each
branch of the tree or each component is thought
to have. Thus, the construction of a fault tree
is subjective, and a tree prepared for a given
purpose by one person is unlikely to be reproduced
by another under the same circumstances.
Omissions of relevant pathways in a fault tree
are possible because of ignorance, poor memory,
and lack of imagination, among other causes, and
such omissions can lead to understatement of the
relevant probabilities. Fischhoff, Slovic and
Lichtenstein concluded that in the creation of
a fault tree, humans are likely to be biased in
favor of information readily available to them
(the availability bias), and that when omissions
do occur, people are, in the main, insensitive
to them. In their study, this was found to be
the case not only in tests of groups of college
students but also of groups of experts (car mechanics).
Such insensitivity to omission occurred in the
case of the submarine Tireless, discussed above.
Fischhoff, Slovic and Lichtenstein also found
that the perceived importance of a particular
branch of a fault tree was increased if it was
represented in pieces (i.e., as two separate component
branches). Thus, the probabilities estimated by
experts were dependent on the construction of
the tree, which in turn was subject to human frailties
and biases.
Then, in the absence of reliable historic data,
at least some probabilities are likely to be derived
from sources of "low pedigree." Funtowicz
and Ravetz [Ref. 6] show that a system's probability
of failure may be dependent on the combination
of items of information provided by various persons.
One source, they say, may be historic failure
data, from which a reliability estimate may be
derived (see the second layer of Figure 1). They
refer to this source as being of high pedigree.
But for the failure probability of a new piece
of equipment, the information may be an estimate
by an acknowledged expert, and they refer to this
as being of medium pedigree. Another part of the
tree might require an assessment of the expected
reliability of staff during maintenance, and this
might be made by (say) a recent graduate who,
Funtowicz and Ravetz suggest, might be a source
of low pedigree.
Even high-pedigree data sources may lead to false
probabilities. As shown above, crucial conditions
during data collection and the assumptions made
in deriving results are often not recorded, and
the conditions under which the derived probabilities
are used predictively may be very different from
those under which the data was collected. But
confidence levels are not commonly assigned to
probabilities, so there is no recognition by fault-tree
users of when they are low. In any case, how can
confidence be derived in very low probabilities?
In most instances, adequate statistical data is
not available for the assessment of rare events,
and it could take years to discover whether the
assumptions on which estimates are based are valid,
or even reasonable.
But if some probability estimates are inaccurate,
what is their significance? In a simple example,
Freudenburg [Ref. 7] shows that in some cases
omissions can make a huge difference to the results.
Suppose, Freudenburg suggests, that in drawing
up a fault tree, analysts arrived at a failure
probability of 10^-6, having identified
all but two risk factors, one of which made the
system more safe and the other less safe. And
suppose that the system would operate at the 10^-6
risk level for 80% of the time, but the "real"
risk would in fact include operation at 10^-3
for 10% of the time and 10^-9 for the
other 10%. It might at first be assumed that the
two errors would cancel each other out in the
risk calculation, but this is not the case. The
actual calculation produces a result of:
(0.1 x 10^-9) + (0.8 x 10^-6) + (0.1 x 10^-3) = 0.0001008001,
which is slightly greater than 10^-4. Thus,
the omitted higher risk is not cancelled by the
omitted lower risk but dominates the result, even
though it exists for only 10% of the time. Its
omission leads to a distorted belief in the safety
of the system.
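Freudenburg's arithmetic can be checked with a few lines of code. The sketch below uses only the figures quoted above; the variable names are illustrative.

```python
# (fraction of operating time, failure probability in that regime)
regimes = [
    (0.10, 1e-9),  # omitted factor that makes the system safer
    (0.80, 1e-6),  # the level the analysts arrived at
    (0.10, 1e-3),  # omitted factor that makes the system less safe
]

# Time-weighted failure probability across all three regimes.
weighted = sum(fraction * p for fraction, p in regimes)

modelled = 1e-6  # the analysts' figure, with both factors omitted

print(f"{weighted:.10f}")   # 0.0001008001
print(weighted / modelled)  # roughly 100 times the modelled figure
```

Almost the whole result comes from the single high-risk term (0.1 x 10^-3); the safer regime contributes next to nothing, which is why the two omissions do not cancel.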
Thus, not only are subjective omissions and inaccuracies
almost inevitable, but they can also be of great
significance to the result of a risk analysis.
In the case of the submarine Tireless, not only
were the probabilistic calculations inaccurate
by many orders of magnitude but also the consequences
of this (on the defense of the nation) were potentially
catastrophic.