Continuing in definitions…

The definition for common cause failure was greatly cleaned up…

The original definition was: failure, which is the result of one or more events, causing failures of two or more separate channels in a multiple channel system.

The new definition is: concurrent failures of different devices, resulting from a single event, where these failures are not consequences of each other.

The new definition more clearly indicates that a single situation can cause multiple simultaneous failures of redundant equipment items. Also, the new definition includes five additional notes providing more detail, examples, and a bit of insight into why we are concerned about common cause. I really like Note 3 – The potential for common cause failures reduces the effect of system redundancy or fault tolerance (e.g., increases the probability of failure of two or more channels in a multiple channel system). This is a point that I always emphasize in training classes, and one that is not always well understood. There is a limit to the additional risk reduction provided by adding more redundant components. If you consider even a minuscule amount of common cause, such as a common cause failure “beta” factor of 1%, you will see that adding a second device provides a great additional risk reduction, but the third device (i.e., 1oo3 voting) provides very little benefit over the second, and the addition of a fourth is essentially useless.
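To put rough numbers on that diminishing return, here is a minimal sketch using the common simplified approximation for 1ooN voting, where the independent part of the failure rate contributes ((1 - beta) x lambda x TI)^N / (N + 1) and the common cause part contributes beta x lambda x TI / 2. The failure rate, test interval, and function name are assumptions chosen purely for illustration and do not come from the standard; only the 1% beta factor comes from the discussion above.

```python
# Illustrative only: simplified PFDavg approximations for 1ooN voting.
# LAMBDA_DU and TEST_INTERVAL_H are assumed values, not data from any device.

LAMBDA_DU = 1.0e-6        # assumed dangerous undetected failure rate, per hour
TEST_INTERVAL_H = 8760.0  # assumed proof test interval, hours (one year)
BETA = 0.01               # common cause "beta" factor of 1%


def pfd_avg_1oon(n, lam_du, ti_hours, beta):
    """Approximate average probability of failure on demand for 1ooN voting.

    Uses the simplified equations: the independent failures of all N devices
    must coincide, while a common cause failure takes out every device at once.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        # A single device has no redundancy for common cause to defeat.
        return lam_du * ti_hours / 2
    independent = ((1 - beta) * lam_du * ti_hours) ** n / (n + 1)
    common_cause = beta * lam_du * ti_hours / 2
    return independent + common_cause


if __name__ == "__main__":
    previous = None
    for n in range(1, 5):
        pfd = pfd_avg_1oon(n, LAMBDA_DU, TEST_INTERVAL_H, BETA)
        line = f"1oo{n}: PFDavg = {pfd:.3e}, RRF = {1 / pfd:,.0f}"
        if previous is not None:
            # Improvement factor relative to the architecture with one fewer device.
            line += f", improvement over 1oo{n - 1} = {previous / pfd:.2f}x"
        print(line)
        previous = pfd
```

Under these assumptions, the common cause term sets a floor on the achievable PFDavg that no amount of added redundancy can lower, which is why the improvement factor collapses toward 1 after the second device.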

The definition for common mode was also slightly changed and had notes added.  Not significant enough to talk about…

Another term was added – 3.2.8  compensating measure.  The name that we at Kenexis had always used for this, as can be seen in the bypass risk analysis worksheets in the resources section of the web site, is “alternate protection plan”.  In essence, a “compensating measure” is the alternate action that is taken to replace a SIF that has been removed from service in order to repair a failed component.  The official definition is: planned and documented methods for managing risks that are implemented temporarily during any period of maintenance or process operation with known dangerous faults or dangerous failures in the SIS.