The High-Reliability Pediatric Intensive Care Unit




In health care, reliability is the measurable capability of a process, procedure, or health service to perform its intended function in the required time under actual or existing conditions (as opposed to the ideal circumstances under which it is often studied). This article outlines the current state of reliability in a clinical context, discusses general principles of reliability, and explores the characteristics of high-reliability organizations as a desirable future state for pediatric critical care.


Key points








  • Define high reliability.



  • Describe contexts wherein high reliability is crucial, including critical care.



  • Describe characteristics of high-reliability organizations.



  • Define the characteristics of a high-reliability pediatric intensive care unit.






Introduction to reliability


Principles of organizational and process reliability are used extensively in numerous high-risk and high-tech industries to help compensate for the natural limits of human performance and attention, thereby improving operational performance and safety. Reliability can be defined in several ways. Hollnagel and other engineers have described reliability as the absence of unwanted variance in performance, whereas others have considered it a measure of failure-free operation over time. The importance of reducing variation and failures varies greatly, depending on the outcome at stake. In health care, reliability is the measurable capability of a process, procedure, or health service to perform its intended function in the required time under actual or existing conditions (as opposed to the ideal circumstances under which it is often studied). Reliability is commonly expressed as the inverse of the system's failure rate, which may be referred to as unreliability. Thus, a process that has a failure or defect rate of 1 in 10 (10%) performs at a reliability level of 10^-1. Many studies confirm that the majority of health care in the United States operates roughly at this level, although some specific domains in health care have improved to higher orders of reliability. A reliability scale with corresponding real-world and health care examples is outlined in Table 1. This article outlines the current state of reliability in a clinical context, discusses general principles of reliability, and finally explores the characteristics of high-reliability organizations (HROs) as a desirable future state for pediatric critical care.
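The order-of-magnitude arithmetic above can be sketched in a few lines of Python (an illustrative example; the function name and the floating-point guard are my own, not part of the source):

```python
import math

def reliability_level(failures: int, opportunities: int) -> int:
    """Return the order-of-magnitude reliability level for an observed
    failure rate: a rate of 1 in 10 (10%) is level 1, i.e. 10^-1."""
    if failures <= 0:
        raise ValueError("at least one observed failure is required")
    failure_rate = failures / opportunities
    # Small epsilon guards against floating-point error when the rate
    # is an exact power of ten (e.g. 1/10 or 1/1000).
    return math.floor(-math.log10(failure_rate) + 1e-9)

print(reliability_level(1, 10))    # 1  (10^-1: typical health care process)
print(reliability_level(1, 1000))  # 3  (10^-3: deaths in general surgery)
```

Note that the scale is logarithmic: each level represents a tenfold reduction in the failure rate, which is why moving health care from level 1 toward level 5 or 6 (commercial aviation) is so demanding.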



Table 1

Levels of reliability

Level | Reliability | Success | Opportunities per Failure | Real-World Example | Health Care Example
Chaotic | <10^-1 | <90% | <10 | Annual mortality if >90 y old | Achievement of best-practice processes in outpatient care
1 | 10^-1 | 90% | 10 | Mortality of climbing Mt Everest | Achievement of best-practice processes in inpatient care
2 | 10^-2 | 99% | 100 | Mortality of Grand Prix racing | Deaths in risky surgery (American Society of Anesthesiologists grades 3–5)
3 | 10^-3 | 99.9% | 1000 | Helicopter crashes | Deaths in general surgery
4 | 10^-4 | 99.99% | 10,000 | Mortality of canoeing | Deaths in routine anesthesia
5 | 10^-5 | 99.999% | 100,000 | Chartered-flight crashes | Deaths from blood transfusions
6 | 10^-6 | 99.9999% | 1,000,000 | Commercial airline crashes | —

Data from Refs.








Current state of reliability in clinical care and the pediatric intensive care unit


The health care industry has a fairly woeful track record of reliably delivering contemporary best practice while simultaneously avoiding harm, struggling in the domains of both quality and safety. It is estimated that adults typically receive recommended, evidence-based care about 55% of the time, with little variation among acute, chronic, and preventive care. General pediatric data from a similar analysis suggest performance that is comparable with adult care on average, but more variability exists based on the type of care environment. Children receive an estimated 68% of indicated care for acute medical problems, 53% for chronic medical conditions, and 41% for preventive care. Furthermore, data derived from such performance reviews do not include many errors unrelated to widely accepted best practices, nor those invisible to the measurement methodology. In a survey of pediatric physicians and nurses, half filed incident reports on less than 50% of their own errors, and a third did so less than 20% of the time. It is reasonable to conclude that most medical practitioners are only aware of the tip of the iceberg when it comes to unsafe conditions, near misses, preventable harm, and opportunities for improving the quality and safety of health care.


Although pediatric intensive care unit (PICU)-specific data are scarce, several studies suggest that the intensive care unit (ICU) is no exception. Two prospective studies in the 1990s estimated that iatrogenic adverse complications occurred in 5% to 8% of PICU admissions. In 2000, the Institute of Medicine (IOM) cited the ICU as one of the health care environments most prone to errors and preventable harm. More recently, the use of hospital-wide trigger tools to systematically identify harm in the health care setting demonstrated greater sensitivity than self-reporting and retrospective chart review. This finding was replicated in the PICU setting by Larsen and colleagues, whose study team found approximately 1 preventable adverse event for every 5 patient-days, 3% of which were considered serious. In a 20-bed PICU at full occupancy for a year, this would translate to 1416 minor to moderate adverse events and 44 serious ones. The PICU, like other health care settings, thus has a great deal of opportunity to improve reliability.
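The back-of-the-envelope projection above can be reproduced directly (a minimal sketch using the figures stated in the text: 20 beds, full occupancy for a year, 1 preventable adverse event per 5 patient-days, 3% of events serious):

```python
# Annualized projection from the Larsen and colleagues event rate cited above.
beds = 20
patient_days = beds * 365            # 7300 patient-days per year at full occupancy
events = patient_days // 5           # 1460 preventable adverse events (1 per 5 patient-days)
serious = round(events * 0.03)       # 3% of events are serious -> 44
minor_moderate = events - serious    # remainder are minor to moderate -> 1416

print(minor_moderate, serious)       # 1416 44
```

The same arithmetic scales linearly with unit size and occupancy, which is one reason larger units see so many more harm events in absolute terms even at identical event rates.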


Current efforts by many ICUs to enhance reliability represent a good beginning, but an unacceptable end point. Evidence-based consensus guidelines from professional organizations, when transformed into local protocols and pathways, help to guide care toward best or preferred practices, of which the PICU practitioner now has many: for traumatic brain injury, for septic shock, and for the prevention of central-line infections, among others. Beyond basic standardization, Gawande and Pronovost and colleagues have championed the use of checklists to help manage some of the complexity of modern medicine. Identifying a minimum set of high-value practices immediately before an action, or in real time during a process, has helped prevent errors of ineptitude (ie, the failure to apply knowledge) and mitigate errors of ignorance (ie, the absence of knowledge). Protocolized care and checklists reduce unnecessary variation, which promotes predictability and thereby reduces errors of communication, teamwork, and supervision (ie, the lack thereof). Improved performance reliability through the use of checklists has been demonstrated in many ICUs, including PICUs, with examples including reduced central-line infections, reduced ventilator-associated pneumonias, and increased daily harm-prevention practices, to name but a few.


However, in achieving highly reliable health care, there are some problems with standardized care, use of checklists, implementation of protocols, and automation of decision-support tools. Experienced by many providers as “cookbook medicine,” such tools do not always accommodate the wide range of practitioner or team experience nor facilitate innovation and creative problem solving in the face of unusual or unexpected circumstances. As the old NASA saying goes about astronauts, “There are two ways to die in space: (1) not following the procedure exactly as written and (2) following the procedure exactly as written.” For PICUs to achieve high reliability, we will have to transcend the implementation of isolated tools or techniques and instead arrive at a state of continuous, mindful organizing whereby the quality of organizational attention enables a coordinated capacity to perceive, understand, and act on opportunities to prevent, intercept, mitigate, and learn from all undesired phenomena.




Principles of reliability


The Institute for Healthcare Improvement (IHI) has put forward a stepwise model for applying principles of reliability to health care systems: prevention, identification/mitigation, and redesign. Prevention strategies, such as Failure Modes and Effects Analysis, are the furthest upstream, but eliminating latent vulnerabilities in complex and interdependent systems can become more theoretical than empirical, and cannot anticipate all possibilities. Process simplification and process control reduce opportunities for failures to occur or propagate downstream, and are common methods in the current quality and safety movement in American health care, as evidenced by lean engineering as well as bundles and checklists. However, situations do not unfold the same way each time, so each opportunity for failure is slightly different; performance must therefore adapt in real time to achieve reliability. In a loosely coupled system such as health care, prevention strategies thus have significant shortcomings, and increasing attention is being placed on effective identification and mitigation, as evidenced by contemporary work around failure to rescue. Identification of failures and emerging risk is crucial but, by itself, insufficient to create situational awareness. Even when problems are perceived, misperception, misconception, and misunderstanding can confound the ability to adapt to, or invent countermeasures for, harm propagating through the system. Fig. 1 outlines a range of strategies targeted at the different levels where risk and harm emerge in complex systems.




Fig. 1


Flow of defects in systems and parallel methods to address them.


The IHI also describes levels of reliability to help distinguish design characteristics of systems, and a proposed grouping of such traits is outlined in Table 2. It should be noted that occurrence rates of failed processes (eg, compliance with best practice) and catastrophic outcomes (eg, death) can differ by orders of magnitude. Reliability of 10^-1 or less typically represents systems whereby there is no articulated common process and reliability strategies are no more sophisticated than training and reminders. Performance relies on individuals' intent, vigilance, and hard work. Reliability of 10^-2 on key process measures is more consistent with systems using intentionally designed quality and safety tools as well as evidence-based procedures that are implemented using principles of human factors engineering. Reliability of 10^-3 or better on key process measures typically reflects well-designed systems with attention to structure, processes, context, human psychology, and their collective relationship to performance and outcomes.
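The three design tiers described above can be expressed as a simple classifier over a measured process failure rate (a sketch only; the order-of-magnitude cutoffs are illustrative, and real assessments weigh many more signals than a single rate):

```python
def design_tier(failure_rate: float) -> str:
    """Map a measured process failure rate onto the design tiers
    described in the text; boundaries are illustrative cutoffs."""
    if failure_rate >= 0.1:
        return "10^-1 or less: no common process; intent, vigilance, training, reminders"
    if failure_rate > 0.001:
        return "10^-2: intentionally designed tools and human factors engineering"
    return "10^-3 or better: well-designed system (structure, process, context, psychology)"

# A unit hitting a care bundle 99% of the time is still only a 10^-2 system.
print(design_tier(0.01))
```

The point of such a mapping is diagnostic: the measured rate suggests which class of intervention (exhortation, designed tools, or system redesign) is likely to move performance.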



Table 2

System characteristics as they relate to differing levels of reliability

Lower reliability (generally more basic and inconsistent):

  • Individual preference prevails
  • Intent to perform well
  • Individual excellence rewarded
  • Human vigilance for risk, error, harm
  • Hard work, trying harder after failures
  • Codified policies, procedures, guidelines
  • Personal checklists
  • Retrospective performance feedback
  • Didactic training/retraining
  • Awareness-raising
  • Basic standardization (equipment, brands, forms)

Intermediate reliability:

  • Personnel informed by reliability science
  • Implementation of human factors
  • Standardization of processes is the norm
  • Ambiguities in standard work eliminated
  • “Work-around” solutions eliminated
  • Reminders and decision support built in
  • Standard checklists (real-time compliance)
  • Good habits/behaviors leveraged
  • Error-proofing: warnings, sensory alerts
  • Deliberate redundancy in critical steps
  • Key tasks are scheduled/assigned
  • Some simulation training for emergencies
  • Real-time performance feedback

Higher reliability (generally more robust and effective):

  • Sophisticated organizational design
  • Integrated hierarchies, processes, teams
  • Error-proofing: forced function, shutdown
  • Failure modes and effects analysis
  • Routine simulation for training/reinforcing
  • Strong teamwork climate
  • Strong safety culture
  • Staff perception of psychological safety
  • Preoccupation with failure
  • Reluctance to simplify interpretations
  • Sensitivity to operations
  • Deference to expertise
  • Commitment to resilience

Data from Refs.




High-reliability organizations


During the growing movement of quality improvement and patient safety in health care over the last 2 decades, there has been increasing interest in adapting successful safety and quality models from other industries, such as lean engineering, Six Sigma process control, and failure mode analysis. Similarly, clinical enterprises have become increasingly interested in applying principles of HROs; namely, those organizations operating in complex and high-risk industries where errors and accidents are expected, yet where harm and catastrophe are effectively anticipated and mitigated in nearly all cases. It is fair to state that no health care enterprise has fully implemented all aspects of an HRO or realized the spectacularly low incidence of harm from human errors and system failures that HROs achieve. Whether the vexing complexity of health care can be conquered through HRO principles is yet to be determined, but these sound principles surely move us closer to the goal of medical care free from preventable harm. Indeed, high reliability has come to describe not only high-reliability-achieving organizations but also high-reliability-seeking organizations, a status to which all ICUs ought to aspire.


The early characterization of HROs in the 1980s was motivated by observations of a few American industries that seemed to defy normal accident theory and human reliability, such as flight decks of aircraft carriers, commercial aviation, and nuclear power plants. Before the advent of HRO models, normal accident theory (NAT) represented the dominant thinking about how disasters occur in complex sociotechnical systems despite high vigilance and explicit countermeasures, such as with nuclear power plant meltdowns. In NAT, processes that are tightly coupled, time dependent, and have little slack or flexibility can allow impending aberrant events to propagate invisibly and with sufficient complexity so as to make real-time comprehension impossible. NAT pessimistically argues that in such systems, accidents are inevitable or “normal” regardless of management and operational effectiveness. This attitude is akin to the prevalent mindset in health care whereby hospital-acquired conditions and deaths from errors are fatalistically viewed as just “a cost of doing business.” By contrast, HROs take a more optimistic perspective about continuously seeking effective management of inherently complex and risky systems through organizational control of both hazard and probability, primarily through collective mindfulness.


Numerous industries must regularly perform complex and inherently hazardous tasks dependent on highly technical skill sets in time-critical and unforgiving contexts. However, unlike most other industries (including health care), HROs' devotion to a zero rate of failure has been nearly matched by performance. Some HRO industries, such as the Federal Aviation Administration's Air Traffic Control system, have used strategies very familiar to medical professionals: extensive and continuous training, selection of individuals with fitting aptitude, performance auditing, and cumulative team experience. But other HRO industries, such as US naval aircraft operations, seem to fly in the face of conventional wisdom: the launching and landing of multi-million dollar warbirds on the pitching deck of a carrier at sea in the dark is faithfully executed by a young, inexperienced, and transient crew.


But whereas HROs tend to be high risk and high reliability, health care tends to be high risk and low reliability. Cook and Rasmussen have described a dynamic safety model that helps contrast high- and low-reliability organizations ( Fig. 2 ). In this model, complex sociotechnical systems operate within a zone bounded by certain forces, such as economic failure, unacceptable workload, and unacceptable performance. As unreliability increases, so does the probability of operations unintentionally crossing one of these boundaries, leading to financial losses, workforce overload, and/or adverse events. Compared with high-risk low-reliability organizations, HROs can safely exist in the high-risk zone near the boundary of harm: in part because performance is so reliable that operations are unlikely to violate intended thresholds, and in part because when operations slip or drift toward harm, HROs are aware and adaptive enough to intercept or mitigate threatening events and circumstances. It must be noted that high reliability in performance and outcomes does not mean invariance in processes or procedures. On the contrary, lack of flexibility can create brittle systems. What requires the utmost consistency in an HRO is the collective cognitive process that makes sense of events and changing circumstances. Managing unexpected events requires revised inputs and tactics that are predicated on a shared understanding and prioritization of safety. What must be reliable is the detection of threats and errors; what must remain flexible is the adaptive response to the unexpected.


Oct 2, 2017 | Posted in PEDIATRICS
