Objective
The purpose of this study was to measure agreement among 5 expert clinicians and a computerized method with the use of a strict fetal heart rate classification method.
Study Design
Five providers independently scored 769 8-minute segments from the last 3 hours of 30 tracings with the use of a 5-tier color-coded framework that contains pattern descriptions and proposals for management. Computer analysis was performed with PeriCALM Patterns (PeriGen, Princeton, NJ) to detect and classify patterns.
Results
The clinicians agreed exactly with the majority opinion in 57% (95% confidence interval [CI], 49–64%) of the segments and were within 1 color code in 89% (95% CI, 81–96%). The average proportion of agreement was 0.83 (95% CI, 0.73–0.94). Weighted Kappa scores averaged 0.58 (range, 0.48–0.68). The computer-based results were not statistically different: the proportion of agreement was 0.87, and the weighted Kappa score was 0.52.
Conclusion
These 5 clinicians achieved moderate-to-substantial levels of agreement overall using a strictly defined method to classify fetal heart rate tracings. The result of the computerized method was similar to the conclusions of these clinicians.
A major limitation of electronic fetal heart rate monitoring (EFM) has been unacceptably high inter- and intraobserver variation in interpretation. Such variation hampers the important clinical goals of accurate communication and timely management. Recent efforts to lessen the problems of delayed intervention for abnormal tracings have resulted in a variety of methods to categorize tracings and guide management. The rationale underlying this movement is based, in part, on the premise that these explicit definitions will enable clinicians to categorize tracings more consistently.
We chose to evaluate clinical performance using a 5-level classification method. Multiple levels ensure that each level spans a smaller range of severities, compared with a simpler classification in which very disparate subgroups could be grouped together in a level. Multilevel classification methods are useful clinically but are challenging to apply consistently, especially when there are many factors to consider and the task must be done repeatedly under conditions of fatigue and distraction. The classification method was based on 134 different combinations of fetal heart rate (FHR) characteristics. The characteristics of the FHR patterns were defined rigidly, as were the combinations that comprise each level. In addition, each level of the framework was linked to a different proposal for management.
Previous investigators who used a variety of approaches to measure clinician agreement, such as comparing specific characteristics of the tracings (eg, type of deceleration, quantity of variability, or combinations of these features), have shown very poor levels of agreement. Furthermore, none of these studies used such a complex classification schema. Thus, it is very pertinent to determine whether such a classification method actually could help clinicians to achieve consistency in EFM interpretation.
In addition, we sought to determine how well a computerized version of this method would compare with the clinicians. PeriCALM Patterns (PeriGen, Princeton, NJ) is a validated, Food and Drug Administration–cleared software package that identifies and measures FHR baseline, baseline variability, accelerations, and decelerations based on the National Institute of Child Health and Human Development definitions. The computerized method, which combines this software with the 5-level classification schema, was previously subjected to independent testing for discriminating capacity in a series of 2132 tracings from deliveries that covered a wide range of outcomes. In that testing, there was a clear correlation between the severity and duration of aberrant FHR patterns and newborn infant state.
Materials and Methods
This multiple reader/multiple case study design included 5 clinical experts, specialized software for FHR analysis, and EFM records from 30 singleton term labors. The cases all had umbilical artery blood gases evaluated at birth and spanned a range of newborn infant outcomes and complexity of FHR patterns. The tracings covered the last 3 hours before birth. They were reproduced in their original size and assembled in booklets with 8 minutes of tracing per page. A total of 769 pages were presented to each clinician. Unknown to the clinicians, 13 of these tracings came from babies with elevated umbilical artery base deficit values at birth (>12 mmol/L) and encephalopathy in the early neonatal period.
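The agreement measures reported for this design (exact agreement with the majority opinion, agreement within 1 color code, and weighted Kappa) can be illustrated with a minimal Python sketch. It is not the study's analysis code. The function names are hypothetical, and several details are assumptions because the text does not state them: colors are coded 1 to 5, ties in the majority vote are broken toward the lower code, agreement is pooled over all rater-segment pairs, and the Kappa weights are linear.

```python
from collections import Counter


def majority_code(scores):
    """Most frequent code among the raters for one segment.

    Assumption: ties are broken toward the lower (less severe) code.
    """
    counts = Counter(scores)
    top = max(counts.values())
    return min(code for code, n in counts.items() if n == top)


def agreement_with_majority(ratings):
    """ratings: one list per segment, holding one integer code (1..5) per rater.

    Returns (exact, within_one): the fractions of rater-segment pairs that
    match the majority code exactly, or differ from it by at most 1 level.
    """
    exact = within_one = total = 0
    for scores in ratings:
        m = majority_code(scores)
        for s in scores:
            exact += (s == m)
            within_one += (abs(s - m) <= 1)
            total += 1
    return exact / total, within_one / total


def weighted_kappa(a, b, k=5):
    """Linearly weighted Cohen's kappa for two raters using codes 1..k.

    Assumption: linear disagreement weights |i - j| / (k - 1).
    Undefined (division by zero) if both raters use one identical code only.
    """
    n = len(a)
    # Observed joint proportions of (rater a code, rater b code).
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[x - 1][y - 1] += 1 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    d_obs = d_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            d_obs += w * obs[i][j]          # observed weighted disagreement
            d_exp += w * row[i] * col[j]    # disagreement expected by chance
    return 1 - d_obs / d_exp
```

With 5 raters, `weighted_kappa` would be applied to pairs of raters (or to each rater against a reference) and the resulting values averaged. Which pairing the study used is not specified here.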
Five obstetric providers (4 perinatologists and 1 certified nurse midwife), all of whom were clinically active and had published in the FHR monitoring literature, were recruited to score each page in the booklets according to the 5-tier color-coded system. The practitioners were already familiar with this classification method and had used it in clinical practice to varying degrees. Each expert was given a detailed set of instructions and a colored worksheet that outlined the 5-tier framework (Table 1). Combinations of the various FHR pattern features (namely baseline rate, variability, accelerations, and decelerations) defined the 5 colors. For example, “green (1)” required all features to be within normal limits. Progressively abnormal combinations of baseline rate, reduction of variability, and increased depth and/or duration of decelerations defined the “blue (2),” “yellow (3),” “orange (4),” and “red (5)” categories.
Variability and baseline rate | No decelerations | Early decelerations | Recurrent variable, mild | Recurrent variable, moderate | Recurrent variable, severe | Recurrent late, mild | Recurrent late, moderate | Recurrent late, severe | Prolonged, mild | Prolonged, moderate | Prolonged, severe |
---|---|---|---|---|---|---|---|---|---|---|---|
Moderate (normal) variability | | | | | | | | | | | |
Tachycardia | B | B | B | Y | O | Y | Y | O | Y | Y | O |
Normal | G | G | G | B | Y | B | Y | Y | Y | Y | O |
Mild bradycardia | Y | Y | Y | Y | O | Y | Y | Y | Y | O | |
Moderate bradycardia | Y | Y | O | O | O | O | | | | | |
Severe bradycardia | O | O | O | O | O | | | | | | |
Minimal variability | | | | | | | | | | | |
Tachycardia | B | Y | Y | O | O | O | O | R | O | O | R |
Normal | B | B | Y | O | O | O | O | R | O | O | R |
Mild bradycardia | O | O | R | R | R | R | R | R | R | R | R |
Moderate bradycardia | O | O | R | R | R | R | | | | | |
Severe bradycardia | R | R | R | R | R | | | | | | |
Absent variability | | | | | | | | | | | |
Tachycardia | R | R | R | R | R | R | R | R | R | R | R |
Normal | O | R | R | R | R | R | R | R | R | R | R |
Mild bradycardia | R | R | R | R | R | R | R | R | R | R | R |
Moderate bradycardia | R | R | R | R | R | R | | | | | |
Severe bradycardia | R | R | R | R | R | | | | | | |
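
Because each color in Table 1 is defined by a combination of variability, baseline rate, and deceleration type, the framework behaves like a lookup table. The Python sketch below illustrates that idea only; it is not the PeriCALM Patterns implementation. The key names, the `classify` function, and the reading of G, B, Y, O, and R as green, blue, yellow, orange, and red (codes 1 to 5) are assumptions made for illustration. Only rows of Table 1 that list an entry for every deceleration column are encoded; the remaining rows are left out of the sketch.

```python
# Deceleration columns in the order used by Table 1 (names are hypothetical).
DECELERATIONS = [
    "none", "early",
    "variable_mild", "variable_moderate", "variable_severe",
    "late_mild", "late_moderate", "late_severe",
    "prolonged_mild", "prolonged_moderate", "prolonged_severe",
]

# (variability, baseline) -> one color letter per deceleration column.
# Only rows with all 11 entries in Table 1 are included here.
TABLE = {
    ("moderate", "tachycardia"):     "BBBYOYYOYYO",
    ("moderate", "normal"):          "GGGBYBYYYYO",
    ("minimal", "tachycardia"):      "BYYOOOOROOR",
    ("minimal", "normal"):           "BBYOOOOROOR",
    ("minimal", "mild_bradycardia"): "OORRRRRRRRR",
    ("absent", "tachycardia"):       "RRRRRRRRRRR",
    ("absent", "normal"):            "ORRRRRRRRRR",
    ("absent", "mild_bradycardia"):  "RRRRRRRRRRR",
}

# Letter -> (color name, numeric level), following the "green (1)" to
# "red (5)" naming used in the text.
COLORS = {
    "G": ("green", 1),
    "B": ("blue", 2),
    "Y": ("yellow", 3),
    "O": ("orange", 4),
    "R": ("red", 5),
}


def classify(variability, baseline, deceleration):
    """Return (color, level) for one combination of FHR features.

    Raises KeyError for a row that this sketch does not encode and
    ValueError for an unknown deceleration name.
    """
    row = TABLE[(variability, baseline)]
    letter = row[DECELERATIONS.index(deceleration)]
    return COLORS[letter]
```

For example, `classify("moderate", "normal", "none")` returns `("green", 1)`, which matches the statement that green requires all features to be within normal limits.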