Interobserver and intraobserver reliability of the NICHD 3-Tier Fetal Heart Rate Interpretation System




Objective


Our purpose was to test the reliability of the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) 3-Tier Fetal Heart Rate (FHR) classification system.


Study Design


Individual 15- to 20-minute FHR segments (n = 154) were independently reviewed without clinical data by 3 maternal-fetal medicine examiners and classified by NICHD category (I, II, III).


Results


Interobserver reliability was moderate (kappa 0.45) and varied by NICHD category (category I moderate [kappa 0.48], category II moderate [kappa 0.44], and category III poor [kappa 0.0]). The intraobserver agreement ranged from substantial to perfect (kappa 0.74-1.0).


Conclusion


Interobserver agreement of the 3-Tier FHR classification system was moderate for NICHD categories I and II. Agreement for category III tracings was poor, mainly because of disagreement over absent vs minimal variability.


In April 2008, the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), American College of Obstetricians and Gynecologists, and Society for Maternal-Fetal Medicine cosponsored a workshop on electronic fetal monitoring (EFM) that recommended a new 3-Tier Fetal Heart Rate (FHR) classification system for use in the United States. One purpose for creating a new classification system was the potential ability to develop “evidence-based clinical management strategies of intrapartum fetal compromise.”


Another potential benefit of a 3-Tier classification system may be improved agreement of FHR interpretation between different observers. Given that the new FHR classification system (eg, the number of categories and the FHR patterns to be included in each category) was largely based on consensus opinion of the members attending the workshop, this system is untested with respect to its reliability, validity, and effectiveness. The purpose of our study was to assess the interobserver and intraobserver reliability of the NICHD 3-Tier FHR classification system. We hypothesized that the reliability of the 3-Tier system would be greatest with more “normal” and “very abnormal” FHR patterns.


Materials and Methods


A computerized perinatal database was used to identify women who delivered at ≥37 weeks 0 days’ gestation from Jan. 1, 2008, through Dec. 31, 2009, at an institution that performed universal umbilical cord blood gas analysis. Cases with planned cesarean delivery or absence of umbilical artery (UA) blood gas results were excluded. Chart review was performed to obtain relevant clinical and outcome data. FHR tracings were examined (by one of the investigators not involved in FHR interpretation) to ensure there was adequate tracing for review and that the time from last EFM period to delivery was <30 minutes. To evaluate a broad range of FHR patterns, we selected cases from 3 groups of subjects based on UA pH at delivery (UA pH >7.10, 7.00-7.10, and <7.00 with base excess <–12 mEq). Based on review of the sample sizes of prior studies in the literature, as well as the time and effort required to select and deidentify FHR tracing segments, we chose to evaluate 120 FHR tracing segments from 40 women (n = 15 with UA pH >7.10, n = 15 with UA pH 7.00-7.10, and n = 10 with UA pH <7.00 with base excess <–12 mEq).


FHR tracings were printed from an archived EFM system onto 11×8–in paper. Three FHR segments were selected for each case; each segment represented a 15- to 20-minute epoch. One segment was chosen from the last 60 minutes prior to birth and the other 2 segments were randomly selected from the last 180 minutes prior to birth (no segments overlapped). Approximately 1 in 4 FHR segments was randomly chosen for duplication to assess intraobserver reliability. FHR segments (total = 154; n = 120 original, n = 34 duplicate) were deidentified and placed in random order. They were given in bulk at one time to each reviewer.
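The duplication and random ordering described above can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `build_review_packet`, the segment identifiers, the blinded labels, and the fixed random seed are all hypothetical, and only the counts (120 original segments, 34 duplicates) come from the text.

```python
import random


def build_review_packet(segment_ids, n_duplicates, seed=0):
    """Return a shuffled list of (blinded_label, segment_id) pairs.

    Hypothetical helper: duplicates a random subset of segments so that
    intraobserver agreement can be assessed, then randomizes the order.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    # Pick distinct segments that will appear a second time in the packet.
    duplicates = rng.sample(segment_ids, n_duplicates)
    packet = list(segment_ids) + duplicates
    rng.shuffle(packet)
    # Blinded labels hide which case and time window each segment came from.
    return [(f"T{i:03d}", seg) for i, seg in enumerate(packet, start=1)]


# 40 cases x 3 segments = 120 originals; 34 duplicates give 154 in total.
segments = [f"case{c:02d}-seg{s}" for c in range(1, 41) for s in range(1, 4)]
packet = build_review_packet(segments, n_duplicates=34)
print(len(packet))
```

Each reviewer would receive the same packet, and the mapping from blinded label to segment would be held back until the ratings were complete.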


An interactive training session was performed prior to FHR tracing review. The EFM workshop summary was discussed and nonstudy “training” FHR tracings were collaboratively examined to achieve consensus on definitions and criteria for each category. Three maternal-fetal medicine (MFM) board-certified practitioners who participated in the NICHD EFM workshop (S.C.B., W.G., C.G.B.) independently reviewed the FHR tracings without clinical information. The 3 MFM reviewers had completed fellowship training within the last 5-10 years and as part of their routine clinical practice reviewed FHR tracings while caring for intrapartum patients. Laminated cards of the NICHD classification system were used during review. A structured data collection instrument was utilized that included assessment of NICHD category (I, II, III), FHR baseline, presence/absence of accelerations, decelerations (early, variable, late), and FHR variability (absent, minimal, moderate, or marked). FHR variability was assessed by visual interpretation. Examiners could also describe an FHR tracing as “uninterpretable.”


Statistical analysis was performed with SPSS version 19.0 (SPSS, Inc, Chicago, IL). Cohen kappa was used to assess interobserver and intraobserver reliability (ie, to assess level of agreement beyond chance). Predefined criteria for agreement were used: kappa 0.00-0.20 (poor), 0.21-0.40 (fair), 0.41-0.60 (moderate), 0.61-0.80 (substantial), and 0.81-1.00 (almost perfect). A P value < .05 was considered significant. This study was submitted to our local institutional review board for review and was determined to qualify for exempt status.
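For readers unfamiliar with the statistic, the following minimal sketch shows how Cohen kappa is computed for one pair of reviewers and how a value maps onto the predefined agreement criteria. It is an illustration only, not the SPSS analysis used in the study: the function names and the 10 example ratings are invented, and labeling negative kappa values as “poor” is an assumption, since the stated criteria begin at 0.0.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen kappa for two raters who classified the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must classify the same non-empty set")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal category frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa):
    """Map kappa to the predefined criteria (negative values: assumed poor)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Invented NICHD category ratings for 10 tracings (not study data).
reviewer_1 = ["I", "I", "II", "II", "II", "III", "I", "II", "II", "I"]
reviewer_2 = ["I", "II", "II", "II", "I", "II", "I", "II", "II", "I"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(round(kappa, 2), agreement_label(kappa))  # 0.44 moderate
```

In this example the reviewers agree on 7 of 10 tracings (observed agreement 0.70), chance agreement from the marginal frequencies is 0.46, and kappa is (0.70 − 0.46)/(1 − 0.46) ≈ 0.44. Cohen kappa is defined for 2 raters at a time; how the 3 reviewers' pairwise values were combined is not stated in the text above.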
