Is Your System Recoverable?

Blog post by Joe Palace, Senior Electrical Engineer

A robust system is one that recovers well after a fault. It’s the difference between minimal downtime and significant downtime that costs production and revenue.

When designing a controls system, it’s good to think about recovery while writing your code. The Controls Engineer knows the system and application very well, but it’s usually the end user, the operator, who runs it and is present when the system goes down. All the operator cares about at that moment is clearing the immediate fault as fast as possible and returning the system to normal production. Here are a few things to consider to make the recovery process simpler and faster for the operator:

Fault Messages

A good system has fault messages that describe the actual problem. Whether it’s an e-stop push-button pressed, a gate open, an incorrect part in the system, or a communication fault on a particular node, alarm messages must distinctly point out the nature of the problem.
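As a rough illustration of the idea, here is a minimal Python sketch that pairs each fault type with a distinct description and the place it occurred. The fault codes, the `describe_fault` helper, and the location strings are hypothetical names made up for this example; a real system would define its alarms in the PLC or HMI package.

```python
# Hypothetical fault codes mapped to distinct, plain-language descriptions.
FAULT_TEXT = {
    "ESTOP": "E-stop push-button pressed",
    "GATE": "Gate open",
    "PART": "Incorrect part in the system",
    "COMM": "Communication fault",
}


def describe_fault(kind: str, location: str) -> str:
    """Build an alarm message that names the problem and where it is."""
    what = FAULT_TEXT.get(kind)
    if what is None:
        # Never show a blank alarm; say plainly that the code is unknown.
        return f"Unknown fault '{kind}' at {location}"
    return f"{what} at {location}"


print(describe_fault("GATE", "Cell 2 east door"))
print(describe_fault("COMM", "Node 14"))
```

The point of the sketch is that the message carries both the nature of the fault and its location, so the operator does not have to interpret a bare code.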


The HMI must also be intuitive enough to point the operator toward recovery. As with alarm messages, clear visual indications of safety devices, job information, part information, and direction of flow are crucial for the operator to gain confidence in the system being operated. The color of each graphic should clearly show the status of its device (e.g., green for good and OK; red for bad and faulted). When a fault occurs, ‘flash’ the indicator of the faulting device to get the operator’s attention, whether it’s on the HMI screen or on the operator console. In addition, job information on incoming parts and processed parts should be legible. Make sure the font is large enough for the average person to read.
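The color and flash rule can be expressed in a few lines. This is only a sketch in Python, assuming a two-state device model; the `Status` enum and `indicator_style` function are hypothetical names, and an actual HMI package would configure this through its own animation settings.

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    FAULTED = "faulted"


def indicator_style(status: Status) -> dict:
    """Return the display style for a device indicator.

    Green and steady when the device is OK; red and flashing when it
    is faulted, so the faulting device draws the operator's eye.
    """
    if status is Status.OK:
        return {"color": "green", "flash": False}
    return {"color": "red", "flash": True}


print(indicator_style(Status.OK))
print(indicator_style(Status.FAULTED))
```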

Steps to Recovery

Finally, a controls system should clearly define the steps to recovery. What does the operator need to do to bring the system back up to normal production? Is a Fault Reset all that is required, or is there more to it than that? The operator eliminates the fault first (e.g., resets the e-stop button), and then:

  • Presses the fault reset push-button
  • Presses the continue button OR
  • Puts the system back into auto mode to continue production

Operator prompts are helpful when the system is in the process of being recovered from a fault condition. Does the fault reset push-button clear the fault? If the answer is yes, then prompt the operator to perform the next step.
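One way to picture prompted recovery is as a small step sequence that only advances when the previous step has actually succeeded. The Python sketch below assumes the three-step flow described above; the `Step` names, prompt wording, and `next_step` inputs are hypothetical and would map to real PLC signals in practice.

```python
from enum import Enum, auto


class Step(Enum):
    CLEAR_CAUSE = auto()   # fault cause still present (e.g. e-stop pressed)
    PRESS_RESET = auto()   # cause gone, waiting for fault reset
    RESUME = auto()        # fault cleared, waiting for continue or auto mode
    RUNNING = auto()       # normal production


# Hypothetical prompt text shown to the operator at each step.
PROMPTS = {
    Step.CLEAR_CAUSE: "Clear the fault cause (e.g. reset the e-stop button)",
    Step.PRESS_RESET: "Press the Fault Reset push-button",
    Step.RESUME: "Press Continue or put the system back into Auto",
    Step.RUNNING: "System running",
}


def next_step(step: Step, cause_present: bool,
              reset_pressed: bool, resume_requested: bool) -> Step:
    """Advance the recovery sequence by at most one step per call."""
    if cause_present:
        # An active fault cause always sends the operator back to step one.
        return Step.CLEAR_CAUSE
    if step is Step.CLEAR_CAUSE:
        return Step.PRESS_RESET
    if step is Step.PRESS_RESET and reset_pressed:
        return Step.RESUME
    if step is Step.RESUME and resume_requested:
        return Step.RUNNING
    return step


step = Step.CLEAR_CAUSE
# Each tuple is (cause_present, reset_pressed, resume_requested).
for inputs in [(True, False, False), (False, False, False),
               (False, True, False), (False, False, True)]:
    step = next_step(step, *inputs)
    print(PROMPTS[step])
```

Because the prompt is looked up from the current step, the operator only ever sees the single next action, which keeps the procedure short and hard to get wrong.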

Usually, the fewer steps the operator has to perform to recover, the better. It’s best to keep the recovery process as simple as possible. Remember, seconds count in a manufacturing production environment. There is no need to stress the operator any further with a complex recovery procedure.