Warning… This is going to be a long blog post with a lot of large bits of reference material to go through. Please bear with me as the message is quite important.
For years, industry pundits have been warning about the massive physical damage and loss of life that can occur as a result of cyber attacks. We have had government agencies prepare case studies demonstrating that cyber-attacks can cause physical damage to process plants. The most famous of these cases was the “Aurora” test that was staged (and yes, I mean that in the most pejorative sense of the word “staged”) by the Department of Homeland Security. The results of this staged event were widely reported on international news outlets like CNN.
In addition to the Chicken Littles crying out that the sky is falling, we’ve also had at least one case of a cyber attack that was successful – STUXNET. Of course, the dirty little secret is that STUXNET should have failed too, and only didn’t because the design of the equipment that was being attacked was flawed. The government has cranked its marketing machine for preventing cyber attacks up to a monumental level, culminating in an entire cybersecurity awareness month. Now that cybersecurity awareness month has ended, I would like to confidently report that industry’s response has been a resounding – yawn. While most process industry plants have not performed much cybersecurity work at all on their Industrial Control System (ICS) equipment, leaving their IT departments to carry out some basic perimeter guarding, physical damage to process plants caused by cyber-attacks is virtually non-existent. Amazing, considering that the number of attempts at hacking into industrial control systems has been reported to exceed several kajillion per hour.
How is this possible? Simple. Process engineers aren’t idiots.
The great preponderance of the cyber researcher community has absolutely no idea how process plants operate and how they are designed. One famous cyber researcher once made this asinine statement, “A lot of times the worst thing you can do, for example, is open a valve — have bad things spew out of a valve”. How dumb do you have to be to think that a process engineer created a plant where if a single valve is opened a catastrophe will occur… Seriously? In the balance of this blog post I am going to explain a bit how process plants are designed, assessed for risk, and safeguarded against risks.
Let’s start with a sample plant to look at. We here at Kenexis have created a sample plant that we use for training class exercises. It is a small, stripped-down version of a natural gas production facility where gas well fluids are processed in separators to remove natural gas liquids (NGL), the resultant gas stream is compressed and sent to consumers, and the liquid stream is pumped to another industrial user. The following drawings define the facility.
- This link will take you to a plot plan drawing which shows you a general arrangement of the facility – DWG-Plot Plan
- This link will take you to a set of drawings that include the process flow diagram (with heat and weight balance information) and the piping and instrumentation diagrams – PFD & P&ID’s
Once a process plant design has been created, the information that defines the plant (as shown above) is called the Process Safety Information (PSI). Once the PSI has been generated, the process plant is obligated (if the plant is covered by the OSHA Process Safety Management rule, which most plants where large-scale bad things can happen are) to perform a Process Hazard Analysis (PHA). Most operating companies utilize the most comprehensive type of PHA, the Hazard and Operability Study, or HAZOP. In a HAZOP study, a facility is broken down into “Nodes” of similar operating conditions and walked through a set of deviations, such as High Pressure, Low Temperature, Reverse Flow. For each of these deviations, a multidisciplinary team determines if there is a cause that could push the process beyond safe operating limits. If so, the team determines the consequence if the deviation were to occur, and then lists all of the safeguards that are available to prevent that deviation from occurring – or at least from escalating to the point where damage can occur. Kenexis has performed a HAZOP on its sample facility. A link to that HAZOP is given below. If you are not familiar with HAZOP studies, I encourage you to look at this document and understand the process.
In essence, when a HAZOP is performed, a team of engineers looks at virtually every failure that can possibly occur, and makes sure that there are appropriate safeguards to protect against it. I assure you, if there were a single valve that could be opened to cause a catastrophe – it would not make it through this process.
With all of that being said, a little bit more work can be done to make absolutely certain that your plant is inherently safe against cyber-attack. That additional step is what I am calling a HAZOP Cyber-Check. As you can see in the HAZOP study, for every deviation that was encountered, a CAUSE for that deviation was identified, and all of the SAFEGUARDS against that cause were also listed. From a cybersecurity perspective, you only have a problem if it is possible, from the ICS equipment, to initiate the CAUSE while simultaneously disabling all of the SAFEGUARDS. This cyber-exposed condition is rarely true, and can virtually always be made untrue. If there is no deviation for which it is true, then your plant is Inherently Safe against cyber attack.
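The test described above boils down to a single boolean condition. Here is a minimal sketch in Python; the function name and the true/false flags are my own illustrative choices, not part of any HAZOP standard or software tool.

```python
from typing import Iterable


def is_cyber_exposed(cause_hackable: bool, safeguards_hackable: Iterable[bool]) -> bool:
    """Return True only if the cause AND every listed safeguard can be driven from the ICS.

    Hypothetical helper for illustration. Note that a deviation with no
    safeguards at all counts as exposed whenever its cause is hackable,
    because all() of an empty sequence is True.
    """
    return cause_hackable and all(safeguards_hackable)


# Hackable cause, but one safeguard cannot be reached from the ICS.
print(is_cyber_exposed(True, [True, False]))  # False
# Hackable cause and every safeguard lives in programmable equipment.
print(is_cyber_exposed(True, [True, True]))   # True
# The cause itself cannot be initiated from the ICS.
print(is_cyber_exposed(False, [True, True]))  # False
```

A single non-hackable safeguard is enough to make the whole condition false, which is why the check so rarely turns up a finding.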
A little more explanation…
With respect to initiating events, or causes, in order to be cyber-exposed or “hackable”, the physical cause discussed in the HAZOP would have to be made possible through a virtual command from the ICS. So, whereas a “flow control valve going closed” is “hackable”, the cause of “operator inadvertently opens a bypass valve” is not – assuming that the bypass valve is hand-cranked, and not actuated from the ICS. So, the first step in the HAZOP Cyber-Check is going through each deviation’s cause to determine if it is “hackable”.
The next step is to review the SAFEGUARDS to determine if they are “hackable”. Any operator- or computer-controlled safety instrumented function is “hackable”, but many safeguards are not. The following safeguards are mechanical or hard-wired analog devices that cannot be “hacked”.
- Pressure relief valves
- Non-return Check Valves
- Mechanical Overspeed Trips
- Hard-wired overcurrent/undercurrent relays in pump/compressor motor starters
- Analog Safety Instrumented Functions (i.e., analog transmitter, analog current monitor relay, solenoid/interposing relay)
If a deviation in the HAZOP is protected by any of the above safeguards, then a cyber-attack cannot drive that deviation to its consequence, and the deviation is thus considered not “hackable”.
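The safeguard review can be sketched as a simple lookup against the list above. This is only an illustration: the kind labels, the tag names, and the example HAZOP row are all hypothetical and do not come from the Kenexis worksheet.

```python
# Safeguard kinds taken from the list above; the string labels are hypothetical.
NON_HACKABLE_KINDS = {
    "pressure_relief_valve",
    "check_valve",
    "mechanical_overspeed_trip",
    "hardwired_motor_relay",
    "analog_sif",
}


def non_hackable_safeguards(safeguards):
    """Return the names of safeguards whose kind is on the non-hackable list.

    `safeguards` is a list of (name, kind) tuples copied from one HAZOP row.
    """
    return [name for name, kind in safeguards if kind in NON_HACKABLE_KINDS]


# Hypothetical HAZOP row: high pressure in a separator.
row = [
    ("PSV-101", "pressure_relief_valve"),
    ("PAH-101 operator alarm", "operator_alarm"),
    ("High-pressure shutdown in safety PLC", "programmable_sif"),
]

hard = non_hackable_safeguards(row)
print(hard)        # ['PSV-101']
print(bool(hard))  # True: at least one safeguard is out of reach of the ICS
```

If the returned list is not empty, the deviation is screened out and the review moves on to the next row.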
At this point, I’d like to circle back to the famous events where a cyber attack caused physical damage. In the case of STUXNET, a simple low-cost mechanical overspeed trip of the centrifuges could have prevented the machines from being damaged. Why was a mechanical overspeed trip not included? Overzealous cost-cutting engineers over-relied on programmable systems, and paid the ultimate price. I think that this is an especially important lesson for engineers designing overspeed systems for turbines in the electrical power generation industry. Don’t let your hubris get the best of you. A mechanical overspeed trip might save your butt one day.
The other famous case was the “Aurora” demonstration. The damage was caused to a turbine by generating an overspeed condition. In order for this to have occurred, the machine would have to not have been equipped with a mechanical overspeed trip, and also not have been equipped with an overcurrent/undercurrent relay in the driver. Is this credible? Let me put it this way. Buying an industrial turbine without undercurrent/overcurrent on the electric drive and without a mechanical overspeed trip is about as likely as going into a car dealership and walking out with a car that doesn’t have seat belts or air bags.
If, when going through the HAZOP Cyber-Check, you get through the CAUSE and SAFEGUARDS and find that everything is “hackable”, you then need to look at the consequences to determine if that deviation results in a significant consequence. If not, that attack vector is essentially a nuisance that we’ll leave to traditional cybersecurity. If the consequence is significant, then it is incumbent upon the analyst to make a recommendation to add a non-“hackable” safeguard. This is not required as frequently as one would think, and is not that difficult to do. The worst-case scenario is that you will have to “mimic” one of the programmable electronic system safety instrumented functions with an analog pathway. For instance, if you have a high reactor temperature opening a depressuring valve as a safety instrumented function, and it is located in a safety PLC, then in theory it can be hacked. But if you use an analog signal splitter on the 4-20 mA signal from the temperature transmitter, include a current monitor relay on the split 4-20 mA loop, and then connect another solenoid valve to the pneumatic circuit of the depressuring valve – poof, you’ve made an analog secondary pathway for that safety instrumented function that is hack-proof.
Going through this process you can make any process plant Inherently Safe against cyber attack. This doesn’t mean that you’re engaging in some silly red/blue battle in the cyber-domain; it means that nothing can be done from the cyber-domain that can damage your plant. Forget about a terrorist sneaking in… In this type of plant, you can sit that terrorist down at an operator station with full access to all operator terminals and programming terminals, and even give him ample training on how the plant and control systems work, and he still won’t be able to cause any damage.
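Put together, the whole Cyber-Check for one deviation is a three-way decision. The sketch below is one possible way to write it down, not the format of the actual worksheet; the `Deviation` class, its field names, the outcome strings, and the example rows are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Deviation:
    """One HAZOP row, reduced to the flags the Cyber-Check needs (hypothetical layout)."""
    name: str
    cause_hackable: bool
    safeguards_hackable: List[bool]  # one flag per listed safeguard
    significant_consequence: bool


def cyber_check(dev: Deviation) -> str:
    """Classify one deviation following the steps described above."""
    exposed = dev.cause_hackable and all(dev.safeguards_hackable)
    if not exposed:
        return "not hackable"
    if not dev.significant_consequence:
        return "nuisance: leave to traditional cybersecurity"
    return "recommend a non-hackable safeguard"


# Hypothetical rows, not taken from the sample plant worksheet.
rows = [
    Deviation("High pressure in separator", True, [True, False], True),
    Deviation("Low level in NGL tank", True, [True], False),
    Deviation("High reactor temperature", True, [True, True], True),
]

for dev in rows:
    print(f"{dev.name}: {cyber_check(dev)}")
```

Only the last row produces a recommendation, which is the case where an analog secondary pathway like the one described above would be added.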
All of this is possible with a HAZOP Cyber-Check. For your reference, I have included a HAZOP Cyber-Check worksheet that I created as a review of our sample plant. It can be accessed at the following link.
I have to admit that when I went through this exercise, I thought there would be more findings, but it turns out that right out of the box, the plant is Inherently Safe against cyber-attack. The only recommendations that were made were to ensure that the overcurrent/undercurrent interlocking built into the pump/compressor motor starters was in place and functional. I think that you’ll find that this is common in the process industries, and it is why, even though kajillions of attacks are taking place, nothing is really happening. I highly encourage everyone in the process industries to take on this additional task after each PHA. There’s no reason that you can’t do it yourself, but of course Kenexis is available to help. I did the above HAZOP Cyber-Check in about 2 hours – so it is really a cost-effective insurance policy. I’m sure that you can get cyber-checks done on all of your HAZOPs for less than the price of the maintenance agreement of the software running on your network intrusion detection devices…
I’m very interested in hearing your thoughts on this. Please feel free to contact me personally, or comment directly on this blog.