Even though I have been doing SIS design for about 20 years now, I never seem to run out of new and interesting problems and questions to ponder.  Of course, this is probably a function of the fact that I am a consultant who is continually exposed to different processes and usually only get involved when the problems are complex…

On a recent project I was faced with the dilemma of when to allow credit for testing and repair of failed SIS components.  The question, as posed, is deceptively simple.  There are slots in the standard PFD equations for test interval and repair time, so it would seem obvious that you always take credit for them.  In reality, I have found that this is not always the case.  What you really need to consider is when and why the failure evidenced itself, and what action is being taken in response to the failure to return the process to a safe state.  While I can’t share the details of the specific project I am working on, I will provide an analogy where the right and wrong things to do are much more obvious.

There are numerous human interactions with the SIS that are considered when performing SIL verification calculations.  But there is a limit to the beneficial effect that can be credited to human involvement.  The requirements for performing SIL verification calculations are presented in IEC 61511 in clause 11.9 – SIF Probability of Failure.  This clause begins with sub-clause 11.9.1, which states, “The probability of failure on demand of each safety instrumented function shall be equal to, or less than, the target failure measure as specified in the safety requirements specifications.  This shall be verified by calculations”.  This is the clause that essentially says that a calculation must be done, and that the calculation must achieve the specified target.  The next sub-clause lists the things that must be considered when performing the calculations, stating, “The calculated probability of failure of each safety instrumented function due to hardware failures shall take into account”, and then proceeds to give a list of attributes of the hardware design that should be considered.  What is important to note here is that the clause stresses that the probability of failure on demand is a function of random hardware failures and does not take into consideration human actions, other than human actions that cause the SIF to fail.

When performing SIL verification calculations it is customary and proper to consider testing and maintenance activity, and the beneficial effect it has on the availability of the shutdown function.  When a test is performed before a demand and that test evidences a failure which is then repaired, the probability of the SIF being operational when the actual demand comes is higher.  What is not customary is to consider the manual response to a failed SIF to be part of the SIF with respect to the calculations.  What we calculate with SIL is the probability that the hardware system will operate.  What is not included is the probability that the operator will manually get the process to a safe state even if the SIF fails.
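To make the testing credit concrete, here is a minimal sketch of the widely used simplified equation for a single (1oo1) channel, PFD_avg ≈ λ_DU · (TI/2) + λ_DU · MTTR, where the proof-test interval and repair time appear directly in the result.  The function name and all numerical values below are illustrative assumptions, not data from any real project.

```python
def pfd_avg_1oo1(lambda_du: float, test_interval_h: float, mttr_h: float) -> float:
    """Approximate average probability of failure on demand for a single
    channel: lambda_du is the dangerous undetected failure rate (per hour),
    test_interval_h is the proof-test interval, mttr_h the mean time to
    repair (both in hours)."""
    return lambda_du * (test_interval_h / 2.0 + mttr_h)

# Illustrative numbers only: an annual proof test with an 8-hour repair time.
lam = 2.0e-6            # dangerous undetected failure rate, per hour
annual_test = 8760.0    # hours in one year
print(pfd_avg_1oo1(lam, annual_test, 8.0))   # about 8.8e-3
```

Shortening the test interval or the repair time lowers the PFD, which is exactly why the calculation takes credit for them: both actions increase the probability that the SIF is operational when the real demand arrives.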

Let me give an example that I hope will help to illustrate when human intervention can and cannot be considered when calculating the SIL.  Consider an oil separator on an oil production platform that separates oil from gas.  Let’s say that there is a high level switch in the separator that will close an inlet shutoff valve to the separator to prevent overfilling it.  Let’s also assume that the inlet shutoff valve has a limit switch to determine whether the valve closed upon command.  If this high level shutoff is tested on an annual basis, and any failures are repaired, then this testing and repair activity is considered when performing SIL verification calculations.  The reason is that these actions increase the probability that the SIF will work when commanded.  If, on the other hand, there is a high level situation in the vessel which commands the valve to go closed, but it doesn’t, then the SIF has failed.  At this point in time, it is reasonable to expect that the operator will see the limit switch alarm on the shutoff valve and take a manual action to close a separator control valve, or even call out to the field to close a manual shutoff valve.  While these actions will in fact reduce risk, and can be reasonably expected to occur – they are NOT to be considered as part of the SIF.  In this example the SIF (specifically, the hardware that comprises the SIF) has failed!  Just because there is an operator action that can have the same effect as the SIF does not mean that the SIF did not fail.

While the above statements may seem obvious for the particular situation that I have presented, not all SIFs are so simple and obvious.  The general rule that I would offer at this time is that credit for testing and repair should only be taken if the testing which evidenced the failure resulted in a repair of the SIF, and the SIF hardware will then return the process to the safe state when required.  Manual actions taken in response to a failed SIF, even when they have the same effect as the SIF, should not be considered when calculating the achieved SIL of a SIF.