After many years working in offshore and high-risk environments, I have witnessed many incidents that could have had a serious impact on people, asset integrity, or the environment. I have learned that fail safe philosophy is not just an engineering term or something that belongs only in design documents. It is a vital tool in our hands. For me, it is one of the clearest ways to judge whether a workplace is truly protected or simply getting by on good fortune.
In the oil and gas and offshore industry, we deal with hydrocarbons, pressure, ignition sources, lifting operations, confined spaces, and simultaneous activities every day. In that kind of environment, safety cannot depend on people always reacting perfectly or on conditions staying favorable. It has to be planned and built into the working system from the start.
I have seen many situations over the years where nothing bad happened, and people were tempted to say everything was fine: FAIL LUCKY. But when I looked closer, it was not fine at all. The controls were weak, the barriers were compromised, and the only reason nobody got hurt was that luck happened to be on our side that day. That is exactly why I believe fail safe thinking is so important. It pushes us away from luck and toward real protection, where barriers are in place and risks are assessed for every single step.
I don’t know how many HSE professionals apply fail safe philosophy in their workplaces, or are even familiar with the term. That is why I decided to share it and explain it as best I can. Stay with me….
What Fail Safe Philosophy Means
When I talk about fail safe philosophy, I mean designing and operating equipment, systems, and tasks so that if something fails at any point, the outcome moves toward a safer condition rather than a more dangerous one. Planning from the earliest stage of design is the key.
That does not always mean everything shuts down completely. The safe state depends on the hazard. Sometimes the safest action is to stop. Sometimes it is to isolate hydrocarbons, remove power, cut fuel, stop motion, or vent pressure through a controlled route. At task level, it can mean installing barriers and exclusion zones, or posting spotters for lifting operations and working at height. The principle is simple: when failure occurs, the system should help protect us, not expose us further.
This matters especially offshore, because things can escalate quickly, assistance is REMOTE, and there is nowhere else to escape to. A small loss of control can become a fire, explosion, dropped object, serious injury, or major process event much faster than many people expect. See, for example, the Piper Alpha disaster or, more recently, Deepwater Horizon. That is why I never like to hear anyone say, “It’s probably fine.” In high-risk operations, “probably fine” is not a barrier.
Fail Safe and Fail Lucky Are Not the Same Thing
One of the most important distinctions I try to explain in training is the difference between fail safe and fail lucky.
A fail safe condition is when something goes wrong, but the system has been designed and planned so that it automatically moves to a safer state. Barrier management was considered.
A fail lucky condition is when something also goes wrong, but nobody gets hurt, asset integrity is not affected, and no environmental event occurs ONLY because of timing, coincidence, or good fortune.
THAT DIFFERENCE IS HUGE!!!!
A fail safe outcome means the system protected us.
A fail lucky outcome means the system did not really protect us – we simply escaped the consequences by chance at that time.
In my experience, a lot of serious incidents are preceded by fail lucky events. People see that nothing happened and assume the control was good enough. In reality, the workplace was vulnerable, and the warning was missed.
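For readers who like to see the logic written out, the distinction can be sketched in a few lines of Python. This is only an illustration, not an investigation tool: the `Event` record and its field names are my own assumptions, and a real investigation needs far more than three yes/no answers.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical, simplified record of an abnormal event."""
    description: str
    something_failed: bool  # did equipment, a control, or a task step fail?
    barrier_worked: bool    # did a designed barrier actually stop or limit it?
    harm_occurred: bool     # injury, asset damage, or environmental release


def classify(event: Event) -> str:
    """Judge the event by what protected us, not only by the outcome."""
    if not event.something_failed:
        return "normal operation"
    if event.harm_occurred:
        return "incident"
    if event.barrier_worked:
        return "fail safe"
    # Nothing bad happened, but no barrier deserves the credit.
    return "fail lucky"


near_miss = Event(
    description="dropped object, nobody standing below",
    something_failed=True,
    barrier_worked=False,
    harm_occurred=False,
)
print(classify(near_miss))
```

The point of the sketch is the last branch: a harmless outcome with no effective barrier is reported as "fail lucky", not as a success.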
Where I See Fail Safe Philosophy in Offshore and Oil and Gas Work
In the oil and gas and offshore industry, fail safe philosophy is everywhere when the operation is designed and managed properly. I see it in:
- emergency shutdown systems
- fire and gas detection systems
- shutdown valves
- safety instrumented functions
- blowout prevention arrangements
- machine guards and interlocks
- lifting safeguards
- isolation and LOTO systems
- pressure protection devices
- permit to work controls
- daily planning meetings
- barrier management and exclusion zones
- well-prepared risk assessments
These are not just technical features. They are part of how we prevent an abnormal situation from turning into a serious event.
On offshore assets especially, we do not have the luxury of treating these things casually. The environment is already unforgiving. Limited space, complex systems, multiple contractors, weather, logistics, and hydrocarbon risk all make it even more important that barriers work the way they are supposed to work.
Fail Safe Philosophy Examples That Explain the Concept Better
I always find that people understand this topic better when it is explained through practical examples rather than formal definitions alone.
1. Emergency Shutdown Valve That Fails Closed
A good example of fail safe philosophy is an emergency shutdown valve on a hydrocarbon line that is designed to fail closed.
In normal operation, that valve may stay open because it has power or instrument air holding it in position. But if power is lost, air pressure fails, or an ESD signal is activated, the valve automatically closes.
Why is that fail safe? Because the system is designed so that losing control energy does not leave hydrocarbons flowing freely. Instead, it pushes the process toward isolation, which is the safer condition in that scenario. That means the design is helping us reduce the chance of escalation, fire, or loss of containment.
I like this example because it shows that fail safe is not about convenience. In fact, fail safe actions often interrupt production. But that is exactly the point: when the choice is between continued operation and protection, the system must choose protection.
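The fail-closed logic described above can be written as a minimal Python sketch. It is a simplified model, not real valve or ESD system code: the class name `EsdValve` and its three inputs are assumptions chosen to match the description, and an actual valve achieves this mechanically (for example with a spring-return actuator) rather than in software.

```python
from dataclasses import dataclass


@dataclass
class EsdValve:
    """Simplified model of a fail-closed shutdown valve (illustrative only)."""
    power_ok: bool = True
    instrument_air_ok: bool = True
    esd_signal_active: bool = False

    def position(self) -> str:
        # The valve is held open only while EVERY holding condition is healthy.
        # Losing any one of them lets the valve go to its safe state: closed.
        held_open = (
            self.power_ok
            and self.instrument_air_ok
            and not self.esd_signal_active
        )
        return "OPEN" if held_open else "CLOSED"


print(EsdValve().position())                         # normal operation
print(EsdValve(instrument_air_ok=False).position())  # loss of instrument air
```

Notice that "CLOSED" is the default and "OPEN" has to be earned. The design never needs a healthy signal in order to reach the safe state.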

2. Burner Management System That Cuts Fuel on Flame Failure
Another strong example is a burner or heater system that automatically cuts off fuel if the flame is lost.
If the flame goes out but fuel keeps flowing, the result can be unburned fuel accumulation and then ignition, which can lead to a serious explosion. A properly designed burner management system prevents that by shutting off the fuel as soon as flame failure is detected.
That is fail safe because the system does not wait for someone to notice the problem and react in time. It removes the source of danger automatically. It understands that failure has occurred and responds by moving to the safer condition.
I often use this example because it makes the principle very clear: when the protective logic sees that safe combustion no longer exists, it does not allow the hazardous condition to continue.
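Here is a small Python sketch of that flame failure logic. Two things in it are my own assumptions and not taken from any specific burner management system: the trip latches until someone resets it, and a missing flame signal (`None`) is treated the same as no flame. Start-up, purge, and ignition sequences are deliberately left out.

```python
from typing import Optional


class FlameFailureTrip:
    """Illustrative flame failure trip, steady firing only."""

    def __init__(self) -> None:
        self.tripped = False

    def update(self, flame_detected: Optional[bool]) -> str:
        # Anything other than a positive flame signal trips the fuel.
        # A lost or faulty signal (None) is assumed to mean "no flame".
        if flame_detected is not True:
            self.tripped = True
        return "FUEL_CLOSED" if self.tripped else "FUEL_OPEN"

    def manual_reset(self) -> None:
        # Assumption: fuel is restored only by a deliberate reset,
        # never automatically when the flame signal comes back.
        self.tripped = False


trip = FlameFailureTrip()
print(trip.update(True))   # flame proven
print(trip.update(False))  # flame lost, fuel is cut
print(trip.update(True))   # signal returns, trip stays latched
```

The latch is the design choice worth noticing: once safe combustion has been lost, the sketch keeps the fuel shut until a person decides it is safe to restart.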
3. Machine Interlock That Stops Equipment When the Guard Is Opened
A third example is rotating or moving machinery fitted with an interlock that stops the equipment if a guard is opened or removed.
If the guard is there to protect people from moving parts, then allowing the machine to keep running after that barrier is removed would leave the person exposed. In a fail safe arrangement, opening the guard removes power or prevents the machine from operating.
That is fail safe because the hazardous motion is stopped when the physical barrier is no longer in place. The system is not relying only on someone remembering procedures or being careful enough. It is designed so the unsafe condition cannot continue easily.
This kind of protection is especially important because people become familiar with machinery. Familiarity can create shortcuts, and shortcuts around moving equipment can be deadly.
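The guard interlock can be sketched the same way. The function name `motor_may_run` and the three-state guard signal are assumptions for illustration. Real interlocks are normally hard-wired safety circuits, so please read this as a picture of the logic and not as a design.

```python
from typing import Optional


def motor_may_run(guard_closed: Optional[bool], start_requested: bool) -> bool:
    """Permit motion only with a positive 'guard closed' signal.

    guard_closed is True (closed), False (open or removed), or None
    (signal lost, for example a broken wire). Only True permits running.
    """
    return start_requested and guard_closed is True


print(motor_may_run(guard_closed=True, start_requested=True))   # permitted
print(motor_may_run(guard_closed=False, start_requested=True))  # guard open
print(motor_may_run(guard_closed=None, start_requested=True))   # signal lost
```

The third case matters most. If the guard switch itself fails, the sketch stops the machine instead of assuming the guard is still in place.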
Why Fail Lucky Is So Dangerous
What concerns me most in many workplaces is not only what is clearly unsafe, but what looks safe because nothing bad has happened yet – WE ARE NOT LUCKY EVERY DAY!
That is where fail lucky comes in.
I have seen examples such as:
- a dropped object that missed people because no one was standing below
- a wrong isolation where no release occurred because the line had already depressurised
- a gas detector out of service while no gas leak happened during that period
- a swinging crane load that missed personnel by a narrow margin
- a missing guard that caused no injury only because no one touched the danger point
These are not success stories. They are warning signs.
The problem with fail lucky events is that they can make teams comfortable with weak controls. People begin to believe the system is acceptable because the outcome was harmless. But the outcome was harmless for the wrong reason. It was not because the safeguards were strong. It was because luck stepped in.
I always try to challenge that mindset. In HSE, I do not want a system that survives by luck. I want a system that protects people even when something goes wrong.
How Fail Safe Philosophy Connects to Barrier Management
Barrier management is where fail safe philosophy becomes especially practical.
Every offshore or oil and gas workplace should know what barriers are in place to prevent a top event or reduce its consequences. Those barriers may be physical, functional, human, or organisational.
Physical barriers include things like valves, guards, blast walls, relief devices, and containment systems. Functional barriers include alarms, shutdown logic, detectors, trips, and interlocks. Human and organisational barriers include procedures, supervision, training, toolbox talks, shift handovers, and permit controls.
For me, fail safe philosophy strengthens barrier management because it makes those barriers more dependable under failure conditions. A barrier is always stronger when loss of power, signal, pressure, or control causes it to move toward safety rather than uncertainty.
That is why I pay close attention to:
- barrier impairments
- overrides and bypasses
- proof testing
- degraded mode operations
- maintenance backlog on critical safety elements
- whether people understand what barriers are protecting them
A barrier that works only when everything is perfect is not enough in the offshore industry.
How It Is Integrated Into the Workplace
One mistake I sometimes see is people treating fail safe as something that only designers or control engineers need to worry about. I totally disagree. In my view, it has to be integrated into everyday workplace systems such as:
- Permit to Work systems
- LOTO and isolation planning
- risk assessments
- job safety analysis
- startup and shutdown procedures
- emergency response arrangements
- management of change
- maintenance and inspection programs
- competency and workforce training
For example, if a detector is bypassed, an interlock is overridden, or a shutdown device is unavailable, that should never be treated as a small administrative detail. It means the workplace may no longer be fully protected in the way it was designed to be. That has to be assessed properly and controlled.
This is where mature HSE culture makes a difference. The best workplaces are not the ones that simply keep running. They are the ones that recognize when protection is degraded and respond before an incident occurs.
How Fail Safe Philosophy Helps Us
From my perspective, fail safe philosophy helps in very practical ways.
- It helps protect people from injury and fatality.
- It helps prevent escalation when abnormal conditions appear.
- It strengthens barrier reliability.
- It reduces dependence on fast human reaction under pressure.
- It improves process safety performance.
- It exposes hidden weaknesses before they become incidents.
- It encourages better decision-making at worksite level.
Most importantly, it changes how we interpret events. It reminds us that a good outcome does not automatically prove a good system.
That lesson is critical offshore. Too many organizations look only at whether something bad happened. I prefer to look at whether the barriers were actually effective: which barriers were effective and which barriers failed. That tells me much more about the true health of the operation.
For me, fail safe philosophy is one of the clearest indicators of whether a workplace really understands major hazard risks. In the oil and gas and offshore industry, the stakes are far too high to depend on luck, memory, or last-minute reaction.
A properly managed operation should be designed so that when things go wrong, systems isolate, shut down, de-energise, or otherwise move toward the safest possible state. That is the real protection fail safe philosophy gives us.
At the same time, fail lucky events should never be dismissed. They are often the warning signs that a barrier was weak, missing, bypassed, or not understood. Ignoring them is how workplaces drift toward major incidents.
My view has always been simple:
The goal is not to keep getting away with things, but to build systems and behaviours that protect people even when failure happens.
That is the real value of fail safe philosophy.
