Objective
To determine the accuracy of changes in transvaginal sonographic cervical length over time in predicting preterm birth in women with singleton and twin gestations.
Data Sources
PubMed, Embase, Cinahl, Lilacs, and Medion (all from inception to June 30, 2015), bibliographies, Google Scholar, and conference proceedings. Eligible studies were cohort or cross-sectional studies reporting on the predictive accuracy for preterm birth of changes in cervical length over time.
Study Appraisal and Synthesis Methods
Two reviewers independently selected studies, assessed the risk of bias, and extracted the data. Summary receiver-operating characteristic curves, pooled sensitivities and specificities, and summary likelihood ratios were generated.
Results
Fourteen studies met the inclusion criteria, of which 7 provided data on singleton gestations (3374 women) and 8 on twin gestations (1024 women). Among women with singleton gestations, the shortening of cervical length over time had a low predictive accuracy for preterm birth at <37 and <35 weeks of gestation with pooled sensitivities and specificities, and summary positive and negative likelihood ratios ranging from 49% to 74%, 44% to 85%, 1.3 to 4.1, and 0.3 to 0.7, respectively. In women with twin gestations, the shortening of cervical length over time had a low to moderate predictive accuracy for preterm birth at <34, <32, <30, and <28 weeks of gestation with pooled sensitivities and specificities, and summary positive and negative likelihood ratios ranging from 47% to 73%, 84% to 89%, 3.8 to 5.3, and 0.3 to 0.6, respectively. There were no statistically significant differences between the predictive accuracies for preterm birth of cervical length shortening over time and the single initial and/or final cervical length measurement in 8 of 11 studies that provided data for making these comparisons. In the largest and highest-quality study, a single measurement of cervical length obtained at 24 or 28 weeks of gestation was significantly more predictive of preterm birth than any decrease in cervical length between these gestational ages.
Conclusions
Change in transvaginal sonographic cervical length over time is not a clinically useful test to predict preterm birth in women with singleton or twin gestations. A single cervical length measurement obtained between 18 and 24 weeks of gestation appears to be a better test to predict preterm birth than changes in cervical length over time.
Preterm birth is one of the “great obstetrical syndromes,” which are characterized by multiple etiologies, a long preclinical stage, frequent fetal involvement, clinical manifestations that are often adaptive in nature, and complex interactions between the fetal and maternal genome and the environment that may predispose to the syndrome.
Transvaginal sonographic measurement of cervical length (CL) provides useful information about one of the mechanisms of disease implicated in the etiology of the preterm parturition syndrome. In 1990, Andersen et al published a seminal study in which a transvaginal sonographic CL below the 50th percentile at 30 weeks of gestation was associated with a 3.7-fold increased risk of preterm birth at <37 weeks of gestation compared with a CL at or above the 50th percentile. Logistic regression analysis showed a progressive and statistically significant trend toward a higher risk of preterm birth with a shorter CL. Moreover, it was reported that a CL <39 mm had a sensitivity of 76% and a specificity of 59% to predict preterm birth at <37 weeks of gestation. Since then, transvaginal sonographic CL has been extensively investigated as a predictor of preterm birth. Several metaanalyses have now provided compelling evidence that a transvaginal sonographic CL measurement at 18–24 weeks of gestation is one of the strongest and most consistent predictors of preterm birth in asymptomatic women with singleton gestations, regardless of whether they have a history of preterm birth, and in asymptomatic women with twin gestations.
More recently, an analysis of serial measurements of transvaginal sonographic CL has shown that assessment of risk for preterm birth can be further refined. Several studies have reported that the shortening of transvaginal sonographic CL over time is associated with an increased risk of preterm birth, whereas other studies have not been able to demonstrate this association. Recently there has been a renewed interest in the relationship between CL changes over time and the risk of preterm birth. The shortening of transvaginal sonographic CL over time has been proposed as a better predictor of spontaneous preterm birth than a single CL measurement. However, to the best of our knowledge, there are no studies that have systematically evaluated the predictive performance of this test.
The primary aim of this study was to determine the accuracy of changes in transvaginal sonographic CL over time to predict preterm birth in women with singleton and twin gestations through the use of formal methods for systematic reviews and metaanalytic techniques.
Materials and methods
This study followed a prospectively prepared protocol and is reported in accordance with recommended methods for systematic reviews of diagnostic test accuracy. The 2 authors independently retrieved and reviewed studies for eligibility, assessed their risk of bias, and extracted data. All disagreements encountered in the review process were resolved through consensus.
Data sources and searches
To identify potentially eligible studies, we searched PubMed, Embase, Cinahl, Lilacs, and Medion (all from inception to June 30, 2015) using an existing literature search strategy for systematic reviews of predictive tests for preterm birth. Google Scholar, proceedings of congresses on preterm birth, ultrasound in obstetrics and maternal-perinatal medicine, and reference lists of identified studies were also searched. No language restrictions were applied.
Eligibility criteria
The systematic review focused on cohort or cross-sectional studies that reported on the accuracy of changes in transvaginal sonographic CL over time to predict preterm birth in asymptomatic pregnant women with a singleton or twin gestation, and that allowed the construction of 2×2 contingency tables. Studies were excluded if they (1) were case-control studies, because there is consistent evidence that such studies are associated with higher diagnostic or predictive accuracy compared with cohort studies; (2) assessed CL changes over time in women with cervical cerclage or pessary, preterm labor, premature rupture of membranes, or those who were receiving progestogens; (3) were reviews, case series or reports, editorials, or letters without original data; or (4) did not publish accuracy test estimates and sufficient information to calculate them could not be retrieved. For studies that resulted in multiple publications, the data from the one with the largest sample size were used and supplemented if additional information appeared in the others.
Reference standard outcomes
The reference standard outcomes included the following: in women with singleton gestations, spontaneous preterm birth at <37 and <35 weeks of gestation; in women with twin gestations, spontaneous preterm birth at <34, <32, <30, and <28 weeks of gestation.
Assessment of risk of bias
Study quality was assessed using a modified version of the Quality Assessment of Diagnostic Accuracy Studies-2 tool. The assessments were judged as low risk, high risk, or unclear risk of bias. The items were evaluated and interpreted as follows:
1. Patient selection. Low risk of bias: women were consecutively or randomly selected; high risk of bias: convenience sampling (arbitrary recruitment or nonconsecutive recruitment).
2. Description of the test. Low risk of bias: the study described sufficient details of the technique used for measuring CL such as the plane in which images were obtained, anatomic references for the determination of CL, and number of measurements; high risk of bias: if this information was not reported.
3. Reference standard. Low risk of bias: spontaneous preterm birth, defined as a preterm delivery after the spontaneous onset of contractions or preterm premature rupture of membranes, regardless of whether the delivery was vaginal, by cesarean delivery, or, in the case of rupture of membranes, induced; high risk of bias: inclusion of both spontaneous and indicated preterm birth in the reference standard.
4. Blinding. Low risk of bias: the study clearly stated that clinicians managing the patient did not have knowledge of the CL measurement results; high risk of bias: unmasking of clinicians to test results.
5. Inclusion of women in the analysis. Low risk of bias: if at least 90% of women recruited into the study were included in the analysis; high risk of bias: if less than 90% of women recruited into the study were included in the analysis.
6. Use of interventions aimed to prevent preterm birth based on the test results. Low risk of bias: clinicians did not use interventions based on the results of the CL measurements; high risk of bias: clinicians used interventions based on the results of the test (eg, cerclage, pessary, vaginal progesterone).
If there was insufficient information available to make a judgment about these items, then they were scored as unclear risk of bias. We did not calculate a summary score estimating the overall quality of each study because of the well-known problems associated with such scores.
Data extraction
Data were extracted from each article using a specifically designed form for capturing information on study characteristics, patient characteristics (inclusion and exclusion criteria, risk classification for preterm birth, sample size, plurality of pregnancy, and demographics), risk of bias, how the test was carried out (technique used for measuring CL, gestational age at testing, and cutoff values used for single CL measurements and CL changes over time), and reference standard outcomes.
For each study, for all reported cutoff values for single CL measurements and CL changes over time, and for all categories of preterm birth, we then extracted the number of true-positive, false-positive, true-negative, and false-negative test results. When predictive accuracy data were not available, we recalculated them from the reported results including scatterplot graphs.
Data were extracted separately for singleton (unselected population and low and high risk for preterm birth) and twin gestations and for each reference standard outcome assessed. Studies that reported preterm birth at <36 weeks of gestation were grouped with those that reported preterm birth at <37 weeks of gestation, and those reporting preterm birth at <35 weeks of gestation were considered alongside studies reporting preterm birth at <34 weeks of gestation.
Data synthesis
Data from individual studies were synthesized separately for singleton and twin gestations and stratified according to the predefined reference standard outcomes, regardless of the gestational ages at which CL was measured and cutoff values used to define shortening of CL. For singleton gestations, we synthesized data for all women, those at high risk for preterm birth, those at high risk for preterm birth with an initial normal CL, those at low risk for preterm birth, and those from unselected populations.
Data extracted from each study were arranged in 2×2 contingency tables. When any single cell in these tables contained a zero, we added 0.5 to each cell to enable calculation of likelihood ratios (LRs) and confidence intervals (CIs). Sensitivity and specificity with 95% CIs were calculated separately for all reported cutoff values and reference standard outcomes. We then constructed summary receiver-operating characteristic (ROC) curves by means of a bivariate random-effects approach and calculated the areas under the summary ROC curves with their corresponding 95% CIs. A 2-sided P < .05 was considered to be statistically significant. We used random-effects bivariate regression models to analyze the logit-transformed sensitivity and specificity to obtain pooled estimates and 95% CIs of these variables. Thereafter we calculated LRs with 95% CIs from the pooled sensitivities and specificities for each reference standard outcome considered. LRs for a positive test result above 10 and LRs for a negative test result below 0.1 are considered to provide strong predictive evidence in most circumstances. Moderate prediction can be achieved with LR values of 5–10 and 0.1–0.2, whereas those less than 5 and greater than 0.2 give only minimal prediction.
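The per-study calculations described above can be illustrated with a minimal Python sketch. It is not the software used in this review (the analyses were run in SAS and RevMan); the function names `accuracy_from_table` and `lr_strength` and the example counts are hypothetical, and the sketch covers only the single-study arithmetic (0.5 continuity correction, sensitivity, specificity, LRs, and the interpretation thresholds quoted above), not the bivariate random-effects pooling or the CIs.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    lr_positive: float
    lr_negative: float


def accuracy_from_table(tp: float, fp: float, fn: float, tn: float) -> Accuracy:
    """Accuracy measures from one 2x2 table (hypothetical helper)."""
    # Continuity correction described in the text: if any cell is zero,
    # add 0.5 to every cell so that the likelihood ratios stay finite.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return Accuracy(
        sensitivity=sensitivity,
        specificity=specificity,
        lr_positive=sensitivity / (1 - specificity),
        lr_negative=(1 - sensitivity) / specificity,
    )


def lr_strength(lr_positive: float, lr_negative: float) -> Tuple[str, str]:
    """Label LRs with the thresholds quoted in the text."""
    if lr_positive > 10:
        pos = "strong"
    elif lr_positive >= 5:
        pos = "moderate"
    else:
        pos = "minimal"
    if lr_negative < 0.1:
        neg = "strong"
    elif lr_negative <= 0.2:
        neg = "moderate"
    else:
        neg = "minimal"
    return pos, neg


# Illustrative counts only; they are not taken from any included study.
example = accuracy_from_table(tp=30, fp=20, fn=10, tn=140)
print(example, lr_strength(example.lr_positive, example.lr_negative))
```

With these made-up counts, the sensitivity is 75%, the specificity is 87.5%, and the positive LR is 6.0, which falls in the moderate range, whereas the negative LR of about 0.29 gives only minimal prediction.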
We assessed the heterogeneity of the results among studies through visual examination of forest plots of sensitivities and specificities and by means of the quantity I². A substantial level of heterogeneity was defined as an I² of 50% or greater. We explored potential sources of heterogeneity by performing a metaregression analysis of subgroups defined a priori (study’s risk of bias, gestational ages at testing, cutoff values used, sample size, prevalence of the reference standard outcome, and setting). We planned to assess publication and location biases, but this was not performed because there were fewer than 10 studies in each metaanalysis. Finally, we compared the predictive accuracy for spontaneous preterm birth of the initial and/or final CL measurements and the changes in CL over time in individual studies that provided this information. When comparing the performance of 2 predictive tests, it is more convenient to summarize the predictive accuracy with one single overall measurement. We calculated the Youden index for the initial and/or final CL measurements and the shortening of CL over time in each study. This index is formally defined as sensitivity plus specificity minus 1, and its value ranges from 0 for a useless test to 1 for an ideal test. A Z-score test was then used to estimate the statistical significance of the difference between the Youden index of shortening of CL over time and that of the initial or final CL measurement. A 2-sided P < .05 was considered to be statistically significant.
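The Youden index comparison can be sketched as follows. The text does not give the exact formula of the Z-score test, so this sketch assumes a common large-sample approximation: the variance of each index is taken as sens(1 − sens)/n₁ + spec(1 − spec)/n₀, where n₁ and n₀ are the numbers of women with and without the outcome, and the 2 indices are treated as independent. That independence assumption is a simplification when both tests are measured in the same women. The function names and the example values are hypothetical.

```python
import math
from typing import Tuple

# (sensitivity, specificity, n with outcome, n without outcome)
TestSummary = Tuple[float, float, int, int]


def youden(sensitivity: float, specificity: float) -> float:
    """Youden index: sensitivity plus specificity minus 1."""
    return sensitivity + specificity - 1.0


def youden_variance(sens: float, spec: float, n_pos: int, n_neg: int) -> float:
    """Assumed binomial large-sample variance of the Youden index."""
    return sens * (1 - sens) / n_pos + spec * (1 - spec) / n_neg


def compare_youden(test_a: TestSummary, test_b: TestSummary) -> Tuple[float, float]:
    """Return (z, two-sided P) for the difference between 2 Youden indices.

    Assumption: the 2 indices are treated as independent estimates.
    """
    j_a = youden(test_a[0], test_a[1])
    j_b = youden(test_b[0], test_b[1])
    variance = youden_variance(*test_a) + youden_variance(*test_b)
    if variance == 0:
        raise ValueError("Z is undefined when both variances are zero")
    z = (j_a - j_b) / math.sqrt(variance)
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative values only; they are not taken from any included study.
single_cl = (0.60, 0.80, 50, 200)
cl_shortening = (0.40, 0.85, 50, 200)
z, p = compare_youden(single_cl, cl_shortening)
print(f"z = {z:.2f}, two-sided P = {p:.3f}")
```

A paired analysis that accounts for the correlation between tests performed in the same cohort would be more appropriate when individual patient data are available.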
All statistical analyses were performed using SAS version 9.2 (SAS Institute Inc, Cary, NC) and Review Manager (RevMan) version 5.3.5 (The Nordic Cochrane Centre, København, Denmark).
Results
Selection, characteristics, and quality of studies
A total of 1732 citations were identified, of which 105 were retrieved for full-text review. Of these, 91 were excluded, mainly because they did not provide data on CL changes over time, were not a test accuracy study, were duplicate publications, or provided insufficient data to construct 2×2 tables (Figure 1). Fourteen studies, including a total of 4398 women, met the inclusion criteria, of which 6 provided data on women with singleton gestations (n = 3236), 7 on women with twin gestations (n = 871), and 1 on women with singleton (n = 138) and twin (n = 153) gestations.
The main characteristics of the included studies are listed in Table 1. Five studies were conducted in the United States, 4 in European countries, 3 in Asia, and 1 each in Canada and Brazil. The sample size ranged from 68 to 2531 in women with singleton gestations and from 20 to 209 in women with twin gestations. Among the 7 studies performed in women with singleton gestations, 5 included exclusively women at high risk for preterm birth, 1 included only women at low risk, and the remainder included an unselected population. Prophylactic cerclage and major fetal anomalies were reported as exclusion criteria in half of the included studies.
Study, year (country) | Number of women | Inclusion criteria | Exclusion criteria | Gestational ages at testing, wks | Abnormal test result | Reference standard outcome |
---|---|---|---|---|---|---|
Iams et al, 1996 (United States) | 2531 | Singleton gestation (unselected population) | Multiple gestation, cerclage, placenta previa, major fetal anomaly | 24 (range, 22–24) and 28 (range, 26–29) | Any decrease in CL; shortening of CL ≥6 mm | Spontaneous preterm birth <35 wks |
Berghella et al, 2003 (United States) | 173 a | Singleton gestation at high risk (≥1 previous spontaneous preterm births between 14 and 34 wks or ≥2 dilatation and curettage procedures, Müllerian anomaly, cone biopsy, diethylstilbestrol exposure) with an initial CL ≥25 mm | Placenta previa, current drug abuse, severe fetal anomalies | 10–13 and then every 2–4 wks up to 23 wks 6 days’ gestation | Shortening of CL to <25 mm at 14–24 wks | Spontaneous preterm birth <35 wks |
Bergelin and Valentin, 2003 (Sweden) | 20 | Twin gestation | Pregnancy complications | 24 and 28 | Shortening of CL ≥20% | Spontaneous preterm birth <34 wks |
Gibson et al, 2004 (United Kingdom) | 91 | Twin gestation | Twin-to-twin transfusion syndrome, fetal anomalies | 18, 24, and 28 | Shortening of CL ≥2.5 mm/wk between 18 and 28 wks’ gestation | Spontaneous preterm birth <35 wks |
Owen et al, 2004 (United States) | 137 b | Singleton gestation at high risk (previous spontaneous preterm birth before 32 wks of gestation) with an initial CL >30 mm | Chronic medical or obstetric conditions, history of substance abuse, uterine anomalies, history-indicated cerclage | 16–18 and then every 2 wks up to 23 wks 6 days’ gestation | Shortening of CL to ≤30 mm at 19–24 wks | Spontaneous preterm birth <35 wks |
Arabin et al, 2006 (The Netherlands) | 291 | Singleton gestation at high risk (previous spontaneous preterm birth or uterine anomaly; n = 138); and twin gestation (n = 153) | Unreported | 15–19 and 20–24 | Shortening of CL >5, >10, and >15 mm c | Spontaneous preterm birth <36 wks |
Fox et al, 2007 (United States) | 68 | Singleton gestation with a CL ≤25 mm at 16–28 wks of gestation and expectant management | Cerclage | 16–28 (median, 22) and within 3 wks of the initial measurement (median, 23) | Any decrease in CL | Preterm birth <34 and <37 wks |
Dilek et al, 2007 (Turkey) | 257 | Singleton gestation (low risk) | History of preterm birth, preterm PROM, cervical incompetence, multiple gestation, previously detected cervical funneling, Müllerian anomalies | 16 and 24 | Shortening of CL ≥6.6 mm | Spontaneous preterm birth <37 wks |
Fox et al, 2010 (United States) | 121 | Twin gestation | Monoamniotic twins, major fetal anomalies, preterm labor, aneuploidy, twin-to-twin transfusion syndrome | 18–24 and within 2–6 wks of the initial measurement (98% at or before 25 wks) | Shortening of CL ≥20% | Spontaneous preterm birth <32, <28, <30, and <34 wks |
Hofmeister et al, 2010 (Brazil) | 124 | Twin gestation | Monoamniotic twins, twin-to-twin transfusion syndrome, polyhydramnios, intrauterine fetal death, fetal malformation, iatrogenic preterm birth | 18–21 and 22–25 | Shortening of CL >2 mm/wk | Spontaneous preterm birth <34, <28, <30, and <32 wks |
Crane and Hutchens, 2011 (Canada) | 70 | Singleton gestation at high risk (previous spontaneous preterm birth or excisional cervical procedure, or uterine anomaly) with a CL <30 mm at 20–28 wks of gestation | Cerclage | 20–28 (mean, ∼26) and within 3 wks of the initial measurement (mean, ∼28) | Shortening of CL >10% | Spontaneous preterm birth <35, <37, <34, and <32 wks |
Oh et al, 2012 (Korea) | 190 | Twin gestation with a CL >25 mm at 20–24 wks of gestation | Prophylactic cerclage, PROM, preterm labor, major fetal anomalies, twin-to-twin transfusion syndrome, placenta previa, monoamniotic placenta | 20–24 (mean, 21.9) and within 4–5 wks of the initial measurement (mean, 26.0) | Shortening of CL ≥13% and ≥20% | Spontaneous preterm birth <32 and <34 wks |
Khalil et al, 2013 (Saudi Arabia) | 209 | Twin gestation with a CL >25 mm at first measurement | CL ≤25 mm on the first ultrasound, monoamniotic twins, preterm labor with or without PROM, abnormal vaginal discharge, severe twin-to-twin transfusion syndrome, aneuploidy, major fetal anomalies, elective cerclage | 20–23 and within 3–5 wks of the initial measurement | Shortening of CL ≥25% | Spontaneous preterm birth <32, <34, <30, and <28 wks |
Levêque et al, 2015 (France) | 116 | Twin gestation | Prophylactic cerclage, placenta previa, major fetal anomalies, twin-to-twin transfusion syndrome, PROM, undetermined gestational age | 22 (range, 21–23) and 27 (range, 26–28) | Shortening of CL ≥20% | Spontaneous preterm birth <34 wks |
a Of the 183 women included in the study, we excluded 10 with an initial CL of <25 mm without data on CL changes over time.
b Of the 183 women included in the study, we excluded 46 with an initial CL of ≤30 mm without data on CL changes over time.
c In the metaanalyses performed, we used predictive values for shortening in CL >10 mm with women in recumbent position.
The gestational ages at initial and final CL measurement ranged from 10 to 28 weeks and 20 to 30 weeks, respectively. The initial CL measurement was carried out at 10–13 weeks in 1 study, at 14–19 weeks in 7 studies, at 20–24 weeks in 5 studies, and at 25–30 weeks in 1 study. The final CL measurement was carried out at 20–24 weeks in 6 studies and at 25–30 weeks in 8 studies. Across studies, an abnormal test result was defined as a shortening of CL over time expressed as a percentage (≥25%, ≥20%, ≥13%, or >10%), as a rate (≥2.5 mm/wk or >2 mm/wk), or as an absolute change (≥6.6 mm, ≥6 mm, or >5, >10, and >15 mm). Two studies that included women with an initial CL ≥25 mm or >30 mm used a shortening of CL to <25 mm or ≤30 mm at follow-up transvaginal sonography to indicate an abnormal test result. Any decrease in CL defined abnormality in 2 studies. Seven studies used ROC curve analysis to determine the optimal cutoff value for defining an abnormal change in CL over time. The remaining studies used arbitrary cutoff values to define an abnormal test result. Twelve studies provided data on the predictive accuracy of shortening in CL over time for preterm birth at <34 or <35 weeks of gestation, 4 each on preterm birth at <37 and <32 weeks of gestation, and 3 each on preterm birth at <30 and <28 weeks of gestation.
The risk of bias in each included study is shown in Figure 2. Five studies (36%) fulfilled all 6 criteria. Nine studies (64%) had 2 or more methodological flaws. The most common deficiencies were related to blinding of clinicians to the test results and the use of interventions aimed to prevent preterm birth based on the test results.
Predictive accuracy for preterm birth in singleton and twin gestations
Figure 3 shows the summary ROC curves of changes in CL over time to predict spontaneous preterm birth. The shortening of CL over time had a higher predictive accuracy for preterm birth among women with twin gestations with areas under the summary ROC curves of 0.81 (95% CI, 0.74–0.89) for preterm birth at <32 weeks of gestation and 0.76 (95% CI, 0.69–0.83) for preterm birth at <34 weeks of gestation. Among women with singleton gestations, the areas under the summary ROC curves to predict preterm birth at <35 and <37 weeks of gestation were 0.64 (95% CI, 0.56–0.72) and 0.71 (95% CI, 0.62–0.80), respectively. The sensitivity and specificity of any shortening of CL to predict preterm birth in the single studies are shown in Figure 4.
Pooled estimates of accuracy of CL changes over time for the prediction of spontaneous preterm birth are presented in Table 2. Overall, regardless of the risk status of women, reference standard outcome assessed, and definition of an abnormal test result, the predictive ability of shortening of CL over time for preterm birth was low among women with singleton gestations (pooled sensitivities and specificities, and summary positive and negative LRs ranging from 49–74%, 44–85%, 1.3–4.1, and 0.3–0.7, respectively) and low to moderate among women with twin gestations (pooled sensitivities and specificities, and summary positive and negative LRs ranging from 47–73%, 84–89%, 3.8–5.3, and 0.3–0.6, respectively). In women with singleton gestations, the summary positive and negative LRs of any shortening of CL to predict preterm birth at <37 and <35 weeks of gestation were 3.2 and 0.6 and 1.3 and 0.7, respectively. In women with twin gestations, the summary positive and negative LRs of any shortening of CL to predict preterm birth at <34 and <32 weeks of gestation were 4.0 and 0.6 and 5.3 and 0.5, respectively.
Population | Outcome | Abnormal test result | No. of studies/total number of women | Sensitivity, % (95% CI) | Specificity, % (95% CI) | Positive likelihood ratio (95% CI) | Negative likelihood ratio (95% CI) | I², % |
---|---|---|---|---|---|---|---|---|
Singleton gestation (all) | Preterm birth <37 wks | Any shortening of CL | 4/532 | 54 (43–65) | 84 (80–87) | 3.2 (2.2–4.4) | 0.6 (0.4–0.9) | 0 |
Singleton gestation (all) | Preterm birth <35 wks | Any shortening of CL | 5/2979 | 65 (57–72) | 48 (46–50) | 1.3 (1.1–1.4) | 0.7 (0.6–0.9) | 25 |
Singleton gestation at high risk | Preterm birth <37 wks | Any shortening of CL | 3/275 | 49 (37–61) | 85 (80–90) | 3.3 (2.2–5.0) | 0.6 (0.5–0.8) | 0 |
Singleton gestation at high risk | Preterm birth <35 wks | Any shortening of CL | 4/448 | 58 (47–68) | 72 (67–76) | 2.0 (1.6–2.6) | 0.6 (0.4–0.8) | 12 |
Singleton gestation at high risk with an initial normal CL | Preterm birth <35 wks | Shortening of CL to <25 or <30 mm at ≤24 wks | 2/310 | 56 (43–68) | 72 (66–77) | 2.0 (1.5–2.7) | 0.6 (0.4–0.8) | 63 |
Singleton gestation (unselected) | Preterm birth <35 wks | Any decrease in CL | 1/2531 | 72 (62–81) | 44 (42–46) | 1.3 (1.1–1.5) | 0.6 (0.4–0.9) | NA |
Singleton gestation at low risk | Preterm birth <37 wks | Shortening of CL ≥6.6 mm | 1/257 | 74 (51–88) | 82 (77–86) | 4.1 (2.8–6.0) | 0.3 (0.1–0.7) | NA |
Twin gestation | Preterm birth <34 wks | Any shortening of CL | 7/852 | 47 (39–55) | 88 (86–91) | 4.0 (2.0–8.3) | 0.6 (0.5–0.8) | 66 |
Twin gestation | Preterm birth <34 wks | Shortening of CL ≥20–25% | 5/651 | 47 (38–57) | 87 (84–90) | 3.8 (2.8–5.1) | 0.6 (0.5–0.7) | 76 |
Twin gestation | Preterm birth <32 wks | Any shortening of CL | 4/644 | 61 (49–73) | 88 (85–91) | 5.3 (3.4–7.2) | 0.5 (0.3–0.6) | 0 |
Twin gestation | Preterm birth <32 wks | Shortening of CL ≥20–25% | 3/520 | 56 (42–69) | 89 (86–92) | 5.2 (3.7–7.5) | 0.5 (0.4–0.7) | 0 |
Twin gestation | Preterm birth <30 wks | Any shortening of CL | 3/454 | 70 (52–83) | 86 (82–89) | 4.8 (3.5–6.8) | 0.4 (0.2–0.6) | 0 |
Twin gestation | Preterm birth <28 wks | Any shortening of CL | 3/454 | 73 (48–89) | 84 (80–88) | 4.5 (3.1–6.6) | 0.3 (0.2–0.7) | 0 |