Architects or Tinkers?

by John Covan
 


Long ago, a tinker was an itinerant fixer of pots and pans. This person would use a small amount of solder, called a dam, to repair pewter ware - hence, the phrase "not worth a tinker's dam" reflected his low social status. Even now, "to tinker with" implies hasty and shoddy work.

Besides ranking low on the totem pole, the tinker's job could be characterized by what it was not: designing and fabricating. The tinker would rarely find himself designing a pot whose handles would not fall off, which might have put him out of a job! Rather, his work came after the fact, patching designs that may have been defective for generations.

By contrast, the architect was a person of high social status who could aspire to design and build lofty, awe-inspiring cathedrals such as St. Paul's in London (by Christopher Wren). Because the consequences of failure (i.e., collapse) were so much greater, architects were obliged to gradually improve their designs, and there is nothing like success to maintain one's social status. Over the long term, it seems the architect will be remembered longer and more fondly than the tinker.

This opinion piece focuses on the role we see for ourselves as system safety engineers. Are we more like the tinker or the architect? That is, do we evaluate and patch, or design and build? My observations suggest that most of us retroactively analyze and repair existing systems, making us the Ghostbusters of the system safety world. In drawing these analogies, I am not suggesting that, as system safety engineers, we do shoddy work, but rather that there is a limit to the value of patching, and that we can become more than tinkers.
 

"The best way to make a silk purse out of a sow's ear is to start with a silk sow."


Surely, there are systems that do just fine with constant or occasional patching. These systems are characterized by a lack of failure modes that lead to high or very high consequences. This type of system can tolerate repeated failures and still accomplish its mission, and our role is to ensure that such failures do not occur too often. At the other end of the spectrum is the very high-consequence system that is subject to a "one strike and you're out" rule. The malaise of our domestic nuclear power construction industry following the Three Mile Island accident illustrates that rule. This type of system must aspire to the goal of predictable safety so that no catastrophic failure can reasonably be expected to occur over its lifetime.
 
