ADVERTISING FEATURE NORLAND
Issue 6, Oct/Nov 2009
How maintenance human factors impact operational risk
By Paul Saville-King, Norland Managed Services
“I have a Tier IV Data Centre so it’s near impossible to have an outage”
“It doesn’t matter who is operating my facility, we spent a fortune on designing it to be resilient”
Unfortunately, these statements are simply not true. Why do otherwise sensible corporations fail to grasp that a few hundred million dollars of design spend does not prevent a new hire, inappropriately inducted and poorly trained, from being in a critical area and pushing the wrong button at the wrong time, thus shutting down a facility? Does this sound familiar?
People, or rather the human factors that influence them, are the untamed, unpredictable, unknown quantity that, for some reason, most people in technical environments shy away from tackling.
Evidence from Norland corroborates empirical research pointing towards people and process failure as the major source of systemic infrastructure outages. These failures can account for as much as 90% of the root causes of business impact incidents. Serious numbers indeed. Addressing the underlying human factors can therefore significantly improve the odds.
With this information I hope to embed this concept into your consciousness and change your way of thinking slightly. Consider the following table:
Task | Probability of failure
Unfamiliar task, at speed, no idea of outcome | 55%
Complex task requiring a high level of comprehension or skill | 16%
Restore system to new state following procedural checks | 0.3%
Totally familiar task, performed often, well motivated, highly trained staff, time available to correct errors | 0.04%
Respond correctly when there is an augmented supervisory system providing interpretation | 0.002%
Source: based on a seminal piece of research by JC Williams in the 1980s: HEART, the Human Error Assessment and Reduction Technique.
In short, you have roughly a 50/50 chance of doing something wrong if you are performing an unfamiliar task at speed and are not 100% confident of the desired outcome. If that sounds like a common business critical situation or near miss to you, contrast it with a totally familiar task from the table above, performed often by motivated staff, where the risk drops dramatically to just 0.04%. That figure reflects trained, even scenario-drilled, and engaged staff working in an environment that takes a long-term view of risk reduction.
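To make that contrast concrete, the figures quoted in the table can be compared directly. The following is a minimal sketch in Python that uses only the percentages given above; the dictionary keys and the helper functions `relative_risk` and `expected_failures` are illustrative names chosen for this example, not part of HEART, and the per-1,000 figure assumes each attempt is independent.

```python
# Nominal failure probabilities from the table above, expressed as fractions.
# The short keys are illustrative labels, not official HEART task names.
NOMINAL_FAILURE = {
    "unfamiliar_at_speed": 0.55,
    "complex_task": 0.16,
    "restore_following_procedure": 0.003,
    "familiar_trained_staff": 0.0004,
    "supervised_response": 0.00002,
}


def relative_risk(task_a: str, task_b: str) -> float:
    """Return how many times more likely task_a is to fail than task_b."""
    return NOMINAL_FAILURE[task_a] / NOMINAL_FAILURE[task_b]


def expected_failures(task: str, attempts: int) -> float:
    """Expected number of failed attempts, assuming attempts are independent."""
    return NOMINAL_FAILURE[task] * attempts


if __name__ == "__main__":
    ratio = relative_risk("unfamiliar_at_speed", "familiar_trained_staff")
    print(f"Unfamiliar vs familiar task: about {ratio:.0f} times more likely to fail")
    for task in NOMINAL_FAILURE:
        per_thousand = expected_failures(task, 1000)
        print(f"{task}: about {per_thousand:.2f} failures per 1,000 attempts")
```

On these numbers the unfamiliar, rushed task is well over a thousand times more likely to go wrong than the familiar one performed by trained staff.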
www.datacenterdynamics.com
Just when you thought it couldn’t get any worse, consider this: condition multipliers can sharply rack up the probability figures for lower risk activities. Take ‘unfamiliar with important and infrequent situation’ as a factor: it can multiply an underlying ‘task’ risk by a factor of seventeen, thus putting it into the high risk category.
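The multiplier arithmetic can be sketched as follows. The example follows the usual HEART calculation, in which each error-producing condition contributes a factor of (multiplier - 1) x assessed proportion + 1, and the factors are multiplied into the nominal task probability. The article quotes only the full x17 multiplier, so the `proportion` parameter, the function name `assessed_probability` and the choice of the 0.3% task as the starting point are assumptions made for illustration.

```python
from math import prod


def assessed_probability(nominal: float, conditions: list[tuple[float, float]]) -> float:
    """Scale a nominal task failure probability by its condition multipliers.

    Each condition is a pair (max_multiplier, proportion), where proportion
    (between 0 and 1) is the assessor's judgement of how strongly the
    condition applies. The result is capped at 1.0 because it is a probability.
    """
    factors = [(mult - 1.0) * proportion + 1.0 for mult, proportion in conditions]
    return min(1.0, nominal * prod(factors))


if __name__ == "__main__":
    # 'Restore system to new state following procedural checks': 0.3% nominal.
    nominal = 0.003
    # 'Unfamiliar with important and infrequent situation': x17 at full effect.
    fully_applies = assessed_probability(nominal, [(17.0, 1.0)])
    half_applies = assessed_probability(nominal, [(17.0, 0.5)])
    print(f"Nominal task risk:        {nominal:.2%}")
    print(f"x17 condition, full:      {fully_applies:.2%}")
    print(f"x17 condition, half:      {half_applies:.2%}")
```

With the condition fully applicable, a procedural task with a nominal 0.3% failure probability rises to roughly 5.1%, which is why a routine activity can end up looking like a high risk one.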
Even in this ‘ambiguous risk’ there are harder and softer elements of concern. On the softer side, and key to overall risk, are the influences of local culture and behaviours. These are the foundation that makes ‘people and processes’ work effectively, and hence a fundamental requirement for a low risk, mission critical operation.
Behaviourally, the most common issues we see are a failure to tackle ‘bad apple’ individual behaviours; a silo mentality between different shifts, departments, companies and even service providers; and a lack of risk awareness.
Culturally, the worst enemy of all is probably ‘fear of openness’. Recrimination and removal from site, often by overzealous management failing to think through the impact of their actions, can cause long-term cultural damage that seeds a future catastrophe.
A classic example was a weak middle manager at a media company who was afraid to approach his boss for capital investment of a couple of million dollars for an electrical infrastructure upgrade. That conversation would have saved the company the multiple millions it lost through litigation, penalties and lost reputation. When his boss found out, the manager was fired and the capex was fast-tracked through the system to prevent another outage.
What does this all translate to in terms of the risk to your facility? Take a step back and think about it. Hand on heart, which one of the following are you?
1. You have robust site processes but ‘sense’ a lack of buy-in and ‘feel’ the wrong culture. Are people hesitant to make decisions under pressure, or do they repeatedly make the wrong ones?
2. Your team is highly motivated, creative, proactive, open and willing but lacks clear guidelines and processes.
If you have a strong ‘positive’ culture and great processes, consider yourself one of the select few: the top 20%. The frequently overlooked aspect of human factors, culture and behaviours, is often present by default rather than by design.
If you have a strong positive culture, protect and nurture it; if not, work on it. Your facility could be up to 90% less at risk as a result. Surely that is an outcome worth having.
Visit: www.norlandmanagedservices.co.uk
Phone: 020 7871 9100