A comparison between PSMR findings in 1999, and post incident 2003: Attitudes, values and behaviours – the cause of the systemic failure

 

Attitudes, values and behaviours – the cause of the systemic failure

 

1999 PSMR findings

September, 2003, information from Shell internal investigations and HSE public statements

 

·         1999: Inappropriate attitudes, skills and behaviour were evident, specifically in the operation of the four Brent installations

·         2003: Inappropriate attitudes, skills and behaviour on Brent Bravo contributed to the fatalities.  Offshore staff had been ‘conditioned not to challenge’ by leaders and managers

·         1999: Violation in operating plant and chronic non-compliance with safety-related maintenance, driven by the need to meet production targets under the gas nomination contract – the Touch FA instruction

·         2003: Violation in operating plant and neglect of maintenance.  Offshore staff were willing to continue to operate with systems in a potentially dangerous condition, with no assessment of the consequences of doing so.  The continuation of production was clearly allowed to take priority over safety

·         1999: Violation of, and deviation from, the PTW was common.  Examples of PTW violation included not visiting the work-site and issuing a number of permits simultaneously to one work-site supervisor.  Asset teams had markedly different interpretations of what could or could not be done within the PTW.  On Brent Bravo a lot of work was being done under operations rules to avoid raising a permit – a permit might, after all, get knocked back in the Touch FA climate.

·         2003: A significant failure in management controls offshore was that the men who went to repair the leak on the temporary patch did not raise a PTW and entered the leg without full leg entry procedures being used.  It had become ‘custom and practice’ to carry out as much work as possible under the operations umbrella

·         1999: Corrupt and flawed independent verification process: a goal-widening rather than goal-setting regime.  If the Safety Critical Equipment (SCE) could not meet its approved* performance criteria, the acceptance criteria were simply relaxed, e.g. for ESD valve LOT from 1 scm/m to 20 scm/m

* ‘Approved’ in this sense refers to the reliability criteria for SCE to perform as required, as used in the QRA assessment of the IRPA, PLL and TR impairment frequencies quoted in the Safety Cases for the Brent facilities

·         2003: Significant weakness in the historic data from the performance testing of safety critical elements.  Test records for principal ESD valves were falsified: tests were signed off as OK but not carried out.  There was evidence of significant shortcomings in duty holder arrangements for securing SCE integrity and in the associated verification arrangements

·         1999: Deficient Corporate internal audit and management review process

·         2003: Many defects highlighted by the post incident Technical Integrity Review pan Expro were not picked up by the internal governance and review processes

·         1999: Inadequate communication and consultation with the workforce, particularly on enhanced risk levels at the workplace

·         2003: Risks on the majority of Expro installations were above ALARP, but there was no discussion or communication of these risks with the offshore crews

·         1999: Throughout the organisation people are aware of the problems and issues but remain passive – they are generally afraid.  Some senior staff indicated in interview to audit team members that they were reluctant to speak out on some of their major concerns because they would not get support from senior management

·         2003: Offshore staff were afraid to flag problems that they had with hardware or procedures on the platforms

·         1999: In enhanced Expro the Asset Manager role is a powerful position and there appears to be no independent, robust check and balance, specifically within operations of the Brent field.  Production was to be maintained at all costs.  The offshore installations were micro-managed on a day-by-day basis from onshore by the Asset Manager and his deputy.  Decision-making and risk-taking were the prerogative of the Asset Manager, who established and maintained this negative safety culture through instructions such as Touch FA and his acceptance of daily violations of accepted operating practice and procedures.  The OIMs essentially did as they were told.  It was not that the Asset Manager was ignorant of the risks he took, but rather that he did not seem to care

·         2003: Offshore staff were willing to continue to operate with systems which, they fully acknowledged, were in a potentially dangerous condition, with no assessment by them of the risk of doing so.  Offshore staff appear to have been ‘conditioned’ not to challenge by leaders and managers

PSMR Findings summarised as in the final presentation to Expro 22nd October 1999

·         In the Brent Field Unit there were significant and entrenched weaknesses in management controls.

·         The fundamental reason for this is not an absence of structures and controls but rather the inappropriate attitudes and behaviour that cause violation of, non-compliance with and deviation from these structures and controls.

·         This inappropriate behaviour exists from the Brent GM downwards.  We believe that key business drivers and messages from the corporate level in Shell Expro are fostering this undesirable behaviour

HSE investigation summary from Public Statements after Stonehaven Court sessions

·         Shell admitted to fundamental failures in health and safety management. 

·         Essential barriers to the unplanned release of hydrocarbon gas that should have been in place were not.  The incident had very serious underlying causes mainly the failure to maintain known defective equipment and the failure to assess the potential consequences of this, along with generally neglecting maintenance and allowing continuation of production to take priority over safety


 

 

Hardware issues, the symptoms of attitudes, values and behaviours

 

1999 PSMR audit findings

September, 2003, information from Shell internal investigations and HSE public statements

 

·         1999: Oil test separator being used for a considerable period with a defective LCV (gross leakage past the LCV).  CRO unable to control level without manual switching of the downstream XCV from the control room panel.  Leakage from the stem of the XCV due to constant chattering

·         2003: LCV on the Closed Drains De-gasser Vessel known to be passing, but operation of the de-gasser continued for a prolonged period prior to the incident.  NRV on the De-gasser run down line defective at the time of the incident, with missing internals – no risk assessment or technical approval to operate

·         1999: Malicious falsification of performance tests on the Brent Bravo gas riser ESD valve

·         2003: WO for the gas riser ESD valve signed off as OK when the test was not carried out.  Pan Expro review raised concerns regarding the performance testing of a significant number of pipeline ESD valves

·         1999: Performance reports from the Brent Asset Manager into the corporate system indicated compliance with safety critical maintenance of around 96%.  PSMR findings verified false reporting, with actual compliance on Brent Bravo less than 15%, mainly due to chronic deferment to reduce the risk of accidentally shutting down production

·         2003: HSE investigation stated that a serious underlying cause was that maintenance had been neglected

·         1999: Maintenance deferred to prevent potential loss of production (Touch FA instruction).  When equipment was eventually tested, a significant amount failed to meet its performance criteria, e.g. seawater deluge systems and Fire and Gas sensors

·         2003: The ESD valve on the outlet from the Brent Bravo HP Flare and Blow-down Vessel was observed after the incident to have failed in the open position.  Brent Bravo follow-up highlighted fire and gas protection and detection systems which had been out of service for lengthy periods.  Test data indicated high levels of failure under test, e.g. on testing as part of the post incident technical review, 14 flammable gas and 2 oil mist sensors failed to danger.

·         1999: PSMR finding raised concern regarding the growth of temporary repairs.  Specifically, the concern related to hydrocarbon pipework (a major gas leak had occurred on North Cormorant at a temporary repair).  The OIMs on BB and NC had no idea of the number and extent of temporary repairs on their installations; there was no register of these.  There was also no clear definition of temporary, no focus on or plans for making temporary repairs permanent, and no evidence of approval of the repairs by a technical authority

·         2003: Expro found that there were, for a number of assets, significant gaps in the overview of pipeline repairs.  Because of the increasing number of repairs, there had been a relaxation of the definition of temporary to increase the useful life of these repairs.  The post 2003 incident data revealed an almost exponential rise in temporary repairs, approaching 500 pan Expro and rising as deviations flooded in.  Circa half the repairs had not been approved by a technical authority.  There was no corporate overview of the extent of these, and therefore of the associated increase in residual risks.

·         1999: There was evidence that the Brent Asset Manager authorised changes to safety critical plant and equipment (variance and change control) with no assessment of the risks of doing so and no prior approval from a technical authority.  The technical authorities in Seafield House raised concerns but were essentially ignored.  As an example, to reduce cost when the Drilling seawater caisson failed, the firewater pumps were permanently tied into the seawater main.  At the time the PCV discharging seawater to sea from the seawater main was known to be jammed open.  This unapproved change, in breach of PFEER regulations, would make the firewater system almost useless in an emergency

·         2003: After the incident it was found that, pan Expro, over 200 temporary repairs had been carried out on pipework with no risk assessment or prior approval of a technical authority.  On Bravo alone there were 33 temporary repairs, of which 9 had not been approved.  Another example was that a number of LOTs on ESD valves had been non-compliant but the results had not been communicated to the relevant technical authority.


 


FIGURE ONE

FIGURE TWO

 

Comparative Analysis: Missing barriers 1999 c.f. 2003

 


 

Summary of Hazards and Effects

 

The diagrams Fig 1 and Fig 2 are part of the Shell Hazards and Effects Management Process (HEMP) and are known internally as ‘bow-tie’ diagrams.  They show on the left-hand side (LHS) the barriers that prevent the top event occurring and on the right-hand side (RHS) the barriers that reduce the likelihood that the top event will escalate.  It can be observed on both diagrams that there are systemic weaknesses impairing the proactive barriers on the LHS and the reactive barriers on the RHS.

 

An offshore installation operating like this is so far removed from conventional risk levels as to fit the description of being operated in a dangerous condition.

 

Residual risk levels As Low As is Reasonably Practicable (ALARP) could only be achieved for Brent Bravo with all its safety systems in full operational condition, such that the technical integrity of the installation can be maintained.  Risks to the individual on Brent Bravo in the degraded condition shown in the diagrams are probably greater than 1 fatality per 10 years.

 

What does operating in a dangerous condition mean, can it be quantified?

 

Society and industry tend to agree that the dividing line between tolerable and intolerable risk for those individuals who obtain commensurate benefits from the activity is around 10⁻³ per year (1 in 1,000).  The concept of Safety Case legislation was for a duty holder to demonstrate that risks were reduced to ALARP rather than to a fixed level.  Thus, with expenditure on risk reduction projects etc. by the Duty Holder, the Individual Risk Per Annum (IRPA) on Brent Bravo would have been in the broadly acceptable range of 10⁻⁵, or 1 in 100,000 per year.

 

When I state that the risks on Brent Bravo were such as to make it a dangerous place to work, you need to refer to Fig 1 and Fig 2.  There you will note that there were significant defects in both the control or proactive barriers that prevent the ‘top event’ and the reactive or mitigation barriers that prevent escalation and thus reduce the consequences of the ‘top event’.

 

In this condition risks are subject to what I call the Sigma effect, in that algebraically the incremental risks from all the individual deficiencies are additive.  These risks may be tangible, e.g. the risks of operating the oil test separator outside its design parameters, or intangible, e.g. the combined effects of violation of procedures such as the PTW.  The bottom line is that any single deficiency will raise risk above ALARP, but the Sigma effect raises the risks to dangerously high levels.

 

There is simply no legitimate methodology for accurately assessing these risks.  Quantitative Risk Analysis (QRA) can only be used to determine the residual risks on the installation by estimating the failure frequencies of competently designed, installed, commissioned and maintained safety systems. 

QRA cannot be used to assess the residual risks where safety systems have already failed or have had their function seriously impaired but despite this the installation continues in full operation. 

 

So in assessing the risk levels attained we are simply left to guess.  My guess would be that the IRPA on Brent Bravo in September 1999 and 2003 was in the region of 10⁻¹ or higher for the most exposed workers.  This is some 100 times higher than what Society accepts, and some 1,000 to 10,000 times higher than the figures published in the Brent Bravo Safety Case.
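
The multiples quoted above follow from simple order-of-magnitude arithmetic.  A minimal sketch in Python, using only figures from this section: the guessed IRPA of 1 in 10, the 1 in 1,000 tolerability boundary, and a Safety Case range of 1 in 10,000 to 1 in 100,000.  That range is an assumption inferred from the quoted multiples, not a figure taken from the Safety Case itself, and the variable names are illustrative.

```python
# Illustrative arithmetic only: the IRPA value is the author's guess, not a measurement.
guessed_irpa = 1e-1        # guessed IRPA for the most exposed workers
tolerable_limit = 1e-3     # dividing line between tolerable and intolerable risk

# ASSUMPTION: range implied by the "1,000 to 10,000 times" multiple in the text,
# not a figure quoted from the Brent Bravo Safety Case.
assumed_safety_case_range = (1e-4, 1e-5)

ratio_to_limit = guessed_irpa / tolerable_limit
multiples = [round(guessed_irpa / level) for level in assumed_safety_case_range]

print(f"Guessed IRPA is about {round(ratio_to_limit)} times the tolerable limit")
print(f"and about {multiples[0]} to {multiples[1]} times the assumed Safety Case range")
```

The sketch only restates the ratios; it says nothing about whether the guessed IRPA itself is correct, which cannot be established by QRA for the reasons given above.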

 

With regard to the combined effect of the deficiencies shown on both Fig 1 and Fig 2, the Temporary Refuge Impairment Frequency (TRIF) would also be high – my guess would be at least 10⁻¹, some 100 times higher than the mandatory limit.

 

Does it matter how long the dangerous levels persisted?

 

Essentially, in operating Brent Bravo in the condition observed in 1999 and 2003, and for the four years in between, the duty holder was gambling with the lives of the employees on the installation.  The dangerously high levels of risk persisted from day to day at a constant level, but as time passed the chance of the major accident event occurring increased with exposure time.

 

A simple example would be playing Russian roulette with a gun with one bullet in a chamber and 5 empty chambers.  Every time the player pulls the trigger (assuming the cylinder is spun before each pull), the probability that he will survive is 5/6.  For each event it remains constant.  However, the chance that the unfortunate player will be alive after 10 events is slim, (5/6)¹⁰ or about 16%, due to the multiplication rule of probabilities.  Getting back to Brent Bravo, the situation expressed mathematically is that as time approaches infinity, the probability of the major accident event occurring approaches 1, or certainty.
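
The multiplication rule can be sketched in a few lines of Python.  The sketch assumes independent trials with a constant probability per trial; the function names are illustrative, and the 0.1 annual probability used for the four-year case is a hypothetical input chosen only to show the shape of the calculation, not an assessed figure for Brent Bravo.

```python
def survival_probability(p_event: float, n_trials: int) -> float:
    """Probability that the event never occurs in n independent trials."""
    return (1.0 - p_event) ** n_trials


def cumulative_event_probability(p_event: float, n_trials: int) -> float:
    """Probability that the event occurs at least once in n independent trials."""
    return 1.0 - survival_probability(p_event, n_trials)


# Russian roulette: one bullet in six chambers, cylinder spun before each of 10 pulls.
print(f"{survival_probability(1 / 6, 10):.2f}")          # prints 0.16

# HYPOTHETICAL input: a constant 0.1 annual probability over four years of exposure.
print(f"{cumulative_event_probability(0.1, 4):.2f}")     # prints 0.34

# As exposure time grows, the cumulative probability approaches 1.
for years in (10, 25, 50, 100):
    print(years, round(cumulative_event_probability(0.1, years), 5))
```

The per-period risk stays the same in every line; only the exposure time changes, which is the point of the roulette analogy.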

 

The implication of all this should be obvious.  If conditions on Brent Bravo as observed in September 1999 did not improve, and improve significantly, then a major accident event would occur; the only question was when.


 

REST OF SHELL EXPRO

 

Detailed Data from post BB Technical Integrity Review

September/October 2003

 

ESD Health (Technical Integrity deficiencies)

 

Platform | Comments
Brent Bravo | Work Orders (WOs) signed off as OK when tests not carried out; WOs signed off as OK when using the wrong test method and with a known fault on the system; WO to carry out corrective maintenance cancelled with faults still present on valves.  Riser ESDV measured leak accepted on the average value, not the maximum value which is the criterion; if maximum values were used the valves fail the test

 

Pan Expro

 

Platform | Comments
BA/BC/BD/CA/DA/TA/AN/FA/GA/SW and Nelson | Various comments, but in summary all in RED with a variety of technical integrity anomalies

 

----------------------------------------------------------------------------------------------------------------------------

Fire and Gas Devices reviewed that Failed to Danger*

* ‘Failed to Danger’ includes cases where data were not available in SAP or where cleaning/preconditioning was carried out prior to test

 

Specific to Brent Bravo

 

Device | Total Number Failed to Danger
Flammable Gas | 14
Oil Mist | 2

 

Pan Expro – BA/BC/BD/CA/DA/TA/EA/NC/AA/AN/FA/GA/SW/Nelson

 

Device | Total Number Failed to Danger
Flammable Gas | 837 (note AA had 237, FA 156, and GA 293)
Toxic | 31 (note BC had 14, BD 12 and SW 5)
Oil Mist | 29
Flame | 243 (note FA had 150)
Smoke | 46 (note FA had 36)
GPA/MAC | 102 (note FA had 92)
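
The pan Expro figures above can be tallied with a short Python sketch.  It uses only the numbers in the table; the totals it derives are not stated in the review data, and the FA subtotal counts only the device types for which an FA figure is noted, so it may understate the FA total.  Variable names are illustrative.

```python
# Pan Expro devices reviewed that failed to danger, as listed in the table.
failed_to_danger = {
    "Flammable Gas": 837,
    "Toxic": 31,
    "Oil Mist": 29,
    "Flame": 243,
    "Smoke": 46,
    "GPA/MAC": 102,
}

# FA figures exist only where the table carries a note for that device type.
fa_noted = {"Flammable Gas": 156, "Flame": 150, "Smoke": 36, "GPA/MAC": 92}

total_failed = sum(failed_to_danger.values())
fa_noted_total = sum(fa_noted.values())

print(f"Total failed to danger, pan Expro: {total_failed}")
print(f"Of which noted against FA: {fa_noted_total} "
      f"({fa_noted_total / total_failed:.0%} of the total)")
```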

 

Overall Summary by Review Team on Safety Critical Elements (SCE)

There is evidence of significant shortcomings in both duty holder arrangements for securing SCE integrity and the verification arrangements

 

----------------------------------------------------------------------------------------------------------------------------

Temporary Repairs on pipework – hydrocarbon and non hydrocarbon

 

Specific to Brent Bravo

Service | Approved by Technical Authority | Not approved by Technical Authority
Unknown | 30 | 9

 

Pan Expro

Service | Approved by Technical Authority | Not approved by Technical Authority
Hydrocarbon (205) | 132 | 73
Non-Hydrocarbon (267) | 126 | 141
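
The pan Expro figures in this table can be cross-checked against the statements made earlier that temporary repairs were ‘approaching 500’ and that ‘circa half’ had not been approved.  A minimal Python sketch using only the table values; the variable names are illustrative and the derived totals are not quoted in the review data.

```python
# (approved, not approved) by Technical Authority, with the stated row totals.
pan_expro_repairs = {
    "Hydrocarbon": {"approved": 132, "not_approved": 73, "stated_total": 205},
    "Non-Hydrocarbon": {"approved": 126, "not_approved": 141, "stated_total": 267},
}

# Check that each row adds up to the total given in brackets in the table.
for service, row in pan_expro_repairs.items():
    assert row["approved"] + row["not_approved"] == row["stated_total"], service

total_repairs = sum(row["stated_total"] for row in pan_expro_repairs.values())
total_not_approved = sum(row["not_approved"] for row in pan_expro_repairs.values())

print(f"Total temporary repairs: {total_repairs}")
print(f"Not approved: {total_not_approved} "
      f"({total_not_approved / total_repairs:.0%} of the total)")
```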

 

When the ‘not approved’ Temporary Repairs were eventually reviewed by the relevant Technical Authority 8 of these were rejected.

 

109 ‘New’ repairs were entered into the database from 1st October to 14th November 2003

 

Overall Summary by Review Team on Temp Repairs

For a number of assets, there were significant gaps in the overview of temporary repairs in place, and consequently in the overall view of risk