The short-term prediction of preterm birth: a systematic review and diagnostic metaanalysis




Objective


To assess the diagnostic accuracy of fetal fibronectin (fFN), fetal breathing movements (FBM), and cervical length (CL) for the short-term prediction of preterm birth in symptomatic patients.


Study Design


Diagnostic metaanalysis using bivariate methods.


Results


Pooled sensitivities for fFN, FBM, and CL for delivery within 48 hours of testing were 0.62 (95% confidence interval [CI], 0.43–0.78), 0.75 (95% CI, 0.57–0.87), and 0.77 (95% CI, 0.54–0.90), respectively. Pooled specificities for fFN, FBM, and CL for delivery within 48 hours were 0.81 (95% CI, 0.74–0.86), 0.93 (95% CI, 0.75–0.98), and 0.88 (95% CI, 0.84–0.91). Pooled sensitivities for fFN, FBM, and CL for delivery within 7 days were 0.75 (95% CI, 0.69–0.80), 0.67 (95% CI, 0.43–0.84), and 0.74 (95% CI, 0.58–0.85). Pooled specificities for fFN, FBM, and CL for delivery within 7 days were 0.79 (95% CI, 0.76–0.83), 0.98 (95% CI, 0.83–1.00), and 0.89 (95% CI, 0.85–0.92). Based on a pretest probability of 10% for delivery within 48 hours, posttest probabilities (positive and negative) were 27% and 5% for fFN, 54% and 3% for FBM, and 42% and 3% for CL. For a pretest probability of 20% for delivery within 7 days, posttest probabilities (positive and negative) were 48% and 7% for fFN, 89% and 8% for FBM, and 63% and 7% for CL.


Conclusion


In symptomatic patients, fFN, absence of FBM, and CL have diagnostic use as predictors of delivery within 48 hours and within 7 days of testing. Absence of FBM appears to be the best test for predicting preterm birth.


Preterm birth is the leading cause of neonatal morbidity and mortality in developed countries. Although there have been small decreases during the past 4 years, the incidence of preterm birth in the United States remains at 12%, higher than in many developed countries.


Preterm labor has emerged as one of the most common and costly obstetric indications for hospital admissions and unscheduled visits, a trend that may reflect awareness of our high rate of preterm birth. In a population-based US report, suspected preterm labor accounted for nearly 33% of the admissions to an inpatient prenatal unit. However, the diagnosis of preterm labor is subjective and unreliable and varies substantially among published studies. In a metaanalysis assessing the diagnostic accuracy of cervicovaginal fetal fibronectin (fFN), the prevalence of preterm delivery within 7 days of testing ranged from 1.8% to 29.7% with a median prevalence of 8%. A New York City hospital reported that 80% of women admitted for preterm labor required no treatment. In a multicenter US trial, only 3% of women with suspected preterm labor who were included in studies of fFN testing delivered within 7 days. Randomized clinical trials involving tocolytic therapy found that nearly 50% of the study population delivered at term. Hospital admissions for patients who are not in true preterm labor often result in the use of unnecessary and potentially harmful medications including tocolytics, corticosteroids, and antibiotics. In addition, hospitalization is expensive, an important source of stress, disruptive for the family, and often has a negative psychologic impact.


Cervicovaginal fFN, fetal breathing movements (FBM), and transvaginal sonographic cervical length (CL) measurements (TVS) represent readily available and well-tolerated testing modalities. Although several systematic reviews have assessed fFN, FBM, and TVS for the short-term prediction of preterm birth in women with signs and symptoms of preterm labor, we are not aware of a comprehensive systematic review comparing the accuracy of the 3 techniques. Although a recently published systematic review collated information about numerous tests potentially useful in the short-term prediction of preterm birth in women with signs and symptoms of preterm labor, this publication did not assess the diagnostic accuracy of these tests in depth. There is a need to rationalize the use of testing to predict preterm birth in women with signs and symptoms of preterm labor, and clear summaries of evidence can assist in the generation of practical guidance.


The objective of this systematic review is to provide a comprehensive and up-to-date overview of diagnostic accuracy of cervicovaginal fFN, absence of FBMs, and TVS CL measurement for the short-term prediction of delivery in patients with signs and symptoms of preterm labor. The selected outcomes of delivery within 48 hours and/or within 7 days were chosen as the reference standard.


Materials and Methods


The proposed systematic review with metaanalysis was preceded by a detailed study protocol stating the question to be addressed, the subgroups of interest, the methods, and criteria to be used for identifying and selecting relevant studies and extracting and analyzing information. This systematic review and metaanalysis was conducted according to the Metaanalysis of Observational Studies in Epidemiology (MOOSE) guidelines. We also followed guidelines for metaanalyses and systematic reviews evaluating screening and diagnostic tests and adopted similar standards for the reporting of accuracy studies (STARD).


Using available computerized databases, references of published systematic reviews, syllabi from scientific meetings, and chapters from textbooks, we identified studies that evaluated the diagnostic accuracy of cervicovaginal fFN, absence of FBM, and TVS measurement of CL for the short-term prediction of preterm birth in symptomatic women. We included only retrospective or prospective cohort studies that reported rates of preterm delivery within 48 hours and within 7 days of testing. The computerized databases searched were: MEDLINE (National Library of Medicine, Bethesda, MD), EMBASE (Elsevier Science, New York, NY), Current Contents (Institute for Scientific Information, Philadelphia, PA), Silver Platter (Silver Platter Information Inc, Norwood, MA), and the Cochrane Library (Cochrane). The searches were conducted for literature published between 1966 and April 2013. Non-English language articles were included and translated before data abstraction. Abstracts from relevant scientific meetings that took place since 1981 were hand searched for unpublished studies or abstracts. We cross-referenced articles obtained from the database search and contacted experts in the field to uncover additional unpublished articles or abstracts. If a particular patient population was reported in more than 1 publication, we selected the article that provided the most complete data set. Medical subject heading search terms included “diagnostic accuracy,” “preterm delivery,” “sensitivity and specificity,” “likelihood ratios,” “sonography,” “ultrasound,” “fetal fibronectin,” “fetal breathing movements,” and “cervical length.”


Studies were selected for review if they included symptomatic pregnant women who after spontaneous onset of labor underwent either cervicovaginal fFN testing, assessment of FBM, or TVS measurement of CL before 37 weeks’ gestation. Studies evaluating the performance of cervicovaginal fFN were selected if they included known gestational ages after spontaneous labor and delivery and used preterm delivery within 48 hours and/or within 7 days as the reference standard. Studies describing the presence or absence of FBM were selected if they examined the accuracy of the absence of FBM in short-term prediction of spontaneous preterm birth. The reference standard was delivery within 48 hours and/or within 7 days in women with threatened preterm labor, intact fetal membranes, without antepartum bleeding, and known gestation at spontaneous birth. FBM, in general, were defined as present if sustained for at least 20 seconds and absent if no sustained movements were detected over an observation period of at least 30 minutes. Studies evaluating the diagnostic accuracy of TVS were selected if they assessed the short-term prediction of preterm birth. The reference standard was delivery within 48 hours and/or within 7 days in women with intact membranes and the spontaneous onset of labor for a single or various cutoff values of CL. Studies that included data from singleton and multifetal pregnancies were analyzed separately and double counting was avoided whenever possible. Studies included for review were selected by 2 of the authors (A.B. and L.S.R.) after scrutinizing the electronic searches and obtaining manuscripts of all citations that met the inclusion criteria cited above. Discrepancies were discussed with a third author (A.M.K.) and resolved by consensus. The main outcomes assessed and the data abstraction forms were finalized before analysis.


Study characteristics, quality of study design, and accuracy of results were obtained from each of the selected studies. To assess methodologic quality, we adapted criteria from a tool developed for the quality assessment of studies of diagnostic accuracy included in systematic reviews (QUADAS). The revised criteria include 14 items covering several dimensions of study quality: patient spectrum, reference standard, verification bias, review bias, clinical review bias, incorporation bias, masking, test execution, study withdrawals, and indeterminate results. Studies were of high quality when at least 12 items were met; moderate when at least 10 items were met and low for the remainder (<10).
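The grading rule described above can be expressed as a short function. This is only an illustrative sketch in Python; the function name `quadas_grade` is hypothetical and is not part of the original analysis.

```python
def quadas_grade(items_met: int) -> str:
    """Grade a study from the number of QUADAS items met (0 to 14).

    Thresholds follow the text: high >= 12, moderate >= 10, low < 10.
    """
    if not 0 <= items_met <= 14:
        raise ValueError("QUADAS item count must be between 0 and 14")
    if items_met >= 12:
        return "high"
    if items_met >= 10:
        return "moderate"
    return "low"
```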


We abstracted data for women with signs and symptoms of preterm labor and spontaneous preterm delivery within 48 hours and/or 7 days of testing with each modality. Statistical analyses were performed according to current recommendations. Accuracy data were used to construct 2 × 2 contingency tables of fFN, FBM, and TVS CL results and the diagnosis of preterm delivery within 48 hours and within 7 days of testing. The true positive, false positive, true negative, and false negative values were abstracted and recorded. True positives included those that delivered within 48 hours and/or within 7 days with a positive fFN, absent FBM, or a short CL. In some cases, these data were not provided in the original publications and had to be calculated from the raw data or data obtained after contacting the authors.
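As a minimal sketch of how the accuracy indices follow from one 2 × 2 table, the Python snippet below derives sensitivity, specificity, likelihood ratios, and the diagnostic odds ratio from the four abstracted cell counts. The class name and the example counts are hypothetical (the counts were chosen only so that sensitivity and specificity equal 0.62 and 0.81); they are not data from any included study, and no continuity correction for zero cells is applied.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Cell counts of a 2 x 2 table: index test result vs delivery outcome."""
    tp: int  # test positive, delivered within the time window
    fp: int  # test positive, did not deliver within the window
    fn: int  # test negative, delivered within the window
    tn: int  # test negative, did not deliver within the window

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def lr_positive(self) -> float:
        # LR+ = sensitivity / (1 - specificity)
        return self.sensitivity() / (1 - self.specificity())

    def lr_negative(self) -> float:
        # LR- = (1 - sensitivity) / specificity
        return (1 - self.sensitivity()) / self.specificity()

    def diagnostic_odds_ratio(self) -> float:
        # DOR = (TP x TN) / (FP x FN); undefined if FP or FN is zero
        return (self.tp * self.tn) / (self.fp * self.fn)


# Hypothetical counts for illustration only.
example = TwoByTwo(tp=62, fp=19, fn=38, tn=81)
print(f"sensitivity={example.sensitivity():.2f}, "
      f"specificity={example.specificity():.2f}, "
      f"LR+={example.lr_positive():.1f}, "
      f"LR-={example.lr_negative():.2f}, "
      f"DOR={example.diagnostic_odds_ratio():.1f}")
```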


Study heterogeneity was assessed in several ways. We used paired forest plots of sensitivity and specificity to represent individual studies’ estimates along with their precision, represented by their exact 95% confidence intervals (CIs). As a potential cause of heterogeneity in sensitivity and specificity among the included studies, threshold/cutoff effect was tested with the Spearman correlation coefficient between the logit of sensitivity and the logit of 1-specificity. Heterogeneity induced by factors other than threshold/cutoff effect was evaluated by using Cochran-Q and the inconsistency index statistic (I²). For the latter, a value of 0% indicates no observed heterogeneity, and values greater than 50% may be considered substantial heterogeneity. Statistical significance of heterogeneity testing was assumed when a P value was less than .10.
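For readers unfamiliar with the two heterogeneity statistics, the sketch below shows the standard inverse-variance definitions of Cochran's Q and I² in Python. It assumes each study contributes one effect estimate (for example, a log diagnostic odds ratio) and its variance; the function names are hypothetical, and the exact computation performed by the software used in this review is not described in the text.

```python
def cochran_q(effects, variances):
    """Cochran's Q: weighted sum of squared deviations from the pooled effect."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q, n_studies):
    """Inconsistency index I^2 as a percentage, truncated at 0%."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Two hypothetical studies with equal variances and different effects.
q = cochran_q([0.0, 2.0], [1.0, 1.0])
print(q, i_squared(q, n_studies=2))
```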


Studies that reported the results using different fFN assays or various cutoff values for CL were analyzed initially as a single group. We also conducted separate metaanalyses within predetermined subgroups to assess whether diagnostic accuracy varied according to type of fFN assay and the CL cutoff values used.


Pooled measures of sensitivity and specificity were determined after first estimating the underlying summary receiver operating characteristic (SROC) curve using a bivariate random effects model and then producing averaged estimations of diagnostic accuracy indices along with their corresponding 95% confidence ellipses. This model acknowledges the difference in precision by which sensitivity and specificity have been measured in each study owing to the differences in the number of women with or without preterm deliveries. We aggregated accuracy data of TVS using the cutoff selected by each individual author even though these cutoffs varied. Because we noted a threshold effect when reviewing accuracy estimates, we then fitted an overall SROC curve. This SROC curve was summarized by the area under the curve along with its 95% CI. To provide meaningful sensitivity and specificity estimates for TVS CL, we pooled data from those studies that used a 15 mm cutoff point. Summary estimates of accuracy indices were then obtained. The diagnostic accuracy of the 3 tests was compared by introducing a dummy variable coding for the test modality into the bivariate model. This approach allowed us to check for differences in sensitivity and specificity among the 3 diagnostic modalities.


To analyze the effect of different study characteristics on diagnostic accuracy, we fit several univariate metaregression models as described by Lijmer et al. The diagnostic measure used for metaregression was the relative diagnostic odds ratio (RDOR). The variables assessed by metaregression for the fFN studies included: method of testing (enzyme-linked immunosorbent assay [ELISA] compared with others), blinding of test results, quality of studies (scores of at least 10 points), prevalence of delivery within 7 days, country of origin (United States compared with others), year of publication (before 2002 compared with more recent), and sample size. Variables assessed for metaregression of the TVS CL studies included: CL cutoff, blinding of study, quality of studies (scores of at least 10 points), prevalence of delivery within 7 days, prospective study design, and sample size. Owing to the small number of studies, metaregression was not performed for the FBM studies, nor was multivariate modeling performed for any of the testing modalities.


Posttest probabilities were calculated based on 1 common pretest probability for each outcome: 10% for delivery within 48 hours and 20% for delivery within 7 days. Following current guidelines, we evaluated for small study effects potentially caused by publication bias using Deeks’ funnel plot asymmetry test (P < .10 considered significant) and by visually inspecting the funnel plots. All analyses were performed using Stata 11.0 statistical software (StataCorp, College Station, TX) and MetaDisc 1.4 (www.hrc.es/investigacion/metadisc_en.htm).




Results


The process of study selection identified a total of 1371 studies (Figure 1). In screening titles and abstracts we concluded that 1075 were irrelevant, leaving 296 studies for potential inclusion. After reviewing the full text, 224 studies were excluded, leaving 72 studies for inclusion (38 fFN, 10 FBM, and 24 TVS). Several reports assessed both the diagnostic accuracy of fFN and sonographic measurement of CL.




Figure 1


Flowchart of studies included in the metaanalysis

Boots. Short-term prediction of preterm birth. Am J Obstet Gynecol 2014 .


The 38 fFN studies were published between 1995 and 2013, and the number of patients analyzed ranged from 25 to 725 (6383 women in aggregate). These publications originated from the United States (15), France (4), United Kingdom (2), Germany (2), Japan (2), Netherlands (2), Turkey (2) with single publications from Chile, Canada, South Africa, New Zealand, Italy, Ecuador, Singapore, Mexico, and Australia.


The 10 FBM studies were published between 1983 and 2001, and the number of patients analyzed ranged from 24 to 70 (391 women in aggregate). These publications originated from the United States (3), United Kingdom (3), and Israel (2), with single studies from Ireland and Poland.


The 24 TVS CL studies were published between 2005 and 2013, and the sample sizes ranged from 29 to 559 (5112 women in aggregate). The TVS CL studies originated from Italy (3), France (3), Chile (2), Greece (2), Denmark (2), Turkey (2), Netherlands (2), and Spain (2), with single studies from United Kingdom, South Africa, Tunisia, Japan, India, and Mexico. Overall, authors of 9 reports were contacted and asked for additional data or clarification of their results. The QUADAS scores varied from 9 to 14 (median 12) in fFN studies, 11 to 12 (median 11) in FBM studies, and 8 to 12 (median 10) in TVS CL studies.


Heterogeneity was assessed for each of the 3 testing modalities (delivery within 48 hours and/or within 7 days). Of the 3 testing modalities, TVS, using the cutoff selected by each author, was the only test with evidence of a significant threshold effect (Spearman correlation coefficient, 0.481; P = .02).
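The threshold-effect check can be sketched as a Spearman rank correlation between the logit of sensitivity and the logit of 1-specificity. The Python example below is an illustration only: the helper names are hypothetical, and the inputs are the pooled per-cutoff estimates for TVS CL cutoffs of 5 to 30 mm taken from Table 2, not the per-study values that the actual test was run on. The 35 mm cutoff is omitted because a sensitivity of 1.00 has no finite logit.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1 - p))


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def threshold_effect_rho(sensitivities, specificities):
    """Spearman correlation of logit(sensitivity) with logit(1 - specificity)."""
    x = [logit(s) for s in sensitivities]
    y = [logit(1 - sp) for sp in specificities]
    return pearson(average_ranks(x), average_ranks(y))


# Pooled estimates by cutoff (5, 10, 15, 20, 25, 30 mm) from Table 2.
sens = [0.35, 0.65, 0.74, 0.84, 0.78, 0.95]
spec = [0.99, 0.96, 0.89, 0.82, 0.70, 0.46]
print(f"rho = {threshold_effect_rho(sens, spec):.2f}")
```

A strongly positive coefficient indicates that sensitivity rises as specificity falls across cutoffs, which is the pattern expected when differences between results are driven by the choice of threshold.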


Table 1 depicts the diagnostic accuracy of fFN, FBM, and TVS CL measurements in predicting spontaneous delivery within 48 hours and within 7 days of testing. For the 48 hour analysis, data were available for 4 fFN, 4 FBM, and 9 TVS CL studies. For spontaneous delivery both within 48 hours and within 7 days, metaanalysis was based on the bivariate random-effects model in the presence of significant heterogeneity. For the prediction of delivery within 48 hours, the pooled sensitivity was higher for TVS; however, the pooled specificity was higher for FBM. Both the likelihood ratio for a positive test and the diagnostic odds ratio were higher for FBM than for TVS. However, the likelihood ratios for a negative test were similar for the 2 tests. For the prediction of delivery within 7 days, data were available for 38 fFN, 7 FBM, and 24 TVS CL studies. The pooled sensitivity for fFN was highest and pooled sensitivity was again higher for TVS than for FBM; however, pooled specificity was higher for FBM. Both the likelihood ratio for a positive test and the diagnostic odds ratio were higher for FBM than for fFN and TVS. However, the likelihood ratio for a negative test was lowest for TVS and similar for fFN and FBM.



Table 1

Summary estimates for the short-term prediction of delivery among the tests

| Variable | Sensitivity (95% CI) | Specificity (95% CI) | LR+ (95% CI) | LR− (95% CI) | DOR (95% CI) | AUC (95% CI) |
|---|---|---|---|---|---|---|
| Delivery within 48 hours | | | | | | |
| fFN | 0.62 (0.43–0.78) | 0.81 (0.74–0.86) | 3.3 (2.1–5.0) | 0.47 (0.29–0.76) | 7 (3–17) | 0.74 (0.63–0.83) |
| FBM | 0.75 (0.57–0.87) | 0.93 (0.75–0.98) | 10.4 (2.8–38.4) | 0.27 (0.15–0.49) | 37.8 (9–164) | 0.83 (0.72–0.90) |
| TVS CL^a | 0.77 (0.54–0.90) | 0.88 (0.84–0.91) | 6.4 (4.7–8.7) | 0.26 (0.12–0.58) | 24 (9–65) | 0.90 (0.88–0.93) |
| Delivery within 7 days | | | | | | |
| fFN | 0.75 (0.69–0.80) | 0.79 (0.76–0.83) | 3.6 (3.1–4.3) | 0.31 (0.25–0.39) | 11.5 (8–16) | 0.84 (0.80–0.87) |
| FBM | 0.67 (0.43–0.84) | 0.98 (0.83–1.00) | 31.6 (4.1–244) | 0.34 (0.18–0.64) | 93 (15–592) | 0.91 (0.88–0.93) |
| TVS CL^a | 0.74 (0.58–0.85) | 0.89 (0.85–0.92) | 6.8 (5.1–9.2) | 0.29 (0.17–0.49) | 23 (12–46) | 0.91 (0.67–0.98) |

AUC, area under the curve; CI, confidence interval; CL, cervical length; DOR, diagnostic odds ratio; FBM, fetal breathing movements; fFN, fetal fibronectin; LR, likelihood ratio; TVS, transvaginal sonographic.

^a 15 mm cutoff.



With respect to TVS, measures of diagnostic accuracy were also calculated using several CL cutoffs to determine which of them best predicted delivery within 7 days of testing (Table 2). A cutoff measurement of ≤15 mm appeared most accurate in predicting spontaneous delivery within 7 days and was associated with a sensitivity and specificity of 0.74 (95% CI, 0.58–0.85) and 0.89 (95% CI, 0.85–0.92), respectively. Increasing CL cutoff was positively correlated with sensitivity and negatively correlated with specificity.



Table 2

Diagnostic accuracy of delivery within 7 days based on various TVS CL cutoffs

| Cutoff | Studies, n | Sensitivity (95% CI) | Specificity (95% CI) | LR+ (95% CI) | LR− (95% CI) | DOR (95% CI) |
|---|---|---|---|---|---|---|
| 5 mm | 2 | 0.35 (0.23–0.53) | 0.99 (0.98–1.00) | 21 (4–118) | 0.72 (0.54–0.97) | 29 (4–217) |
| 10 mm | 1 | 0.65 (0.49–0.79) | 0.96 (0.94–0.97) | 15.2 (9.4–24.6) | 0.36 (0.24–0.55) | 42 (19–90) |
| 15 mm | 15 | 0.74 (0.58–0.85) | 0.89 (0.85–0.92) | 6.8 (5.1–9.2) | 0.29 (0.17–0.49) | 23.4 (11.9–45.9) |
| 20 mm | 3 | 0.84 (0.48–0.97) | 0.82 (0.60–0.93) | 4.5 (2.2–9.4) | 0.20 (0.06–0.73) | 22 (7–76) |
| 25 mm | 9 | 0.78 (0.66–0.86) | 0.70 (0.56–0.81) | 2.6 (1.7–3.9) | 0.32 (0.20–0.50) | 8.1 (3.8–17) |
| 30 mm | 5 | 0.95 (0.90–0.97) | 0.46 (0.42–0.50) | 1.8 (1.6–1.9) | 0.11 (0.06–0.21) | 15.8 (8.3–30.1) |
| 35 mm | 1 | 1.00 (0.72–1.00) | 0.36 (0.26–0.48) | 1.5 (1.2–1.9) | 0.11 (0.01–1.8) | 13 (1–232) |
| All | 37 | 0.79 (0.70–0.86) | 0.82 (0.75–0.87) | 4.4 (3.2–6.0) | 0.25 (0.18–0.35) | 17 (11.6–26) |
| Selected | 24 | 0.81 (0.70–0.89) | 0.82 (0.72–0.88) | 4.4 (3.0–6.5) | 0.23 (0.15–0.37) | 19 (11–33) |

All, all reported cutoffs; CI, confidence interval; CL, cervical length; DOR, diagnostic odds ratio; LR, likelihood ratio; Selected, cutoffs selected by the authors of each study; TVS, transvaginal sonographic.


The comparisons of sensitivity and specificity among the 3 tests showed no statistically significant differences overall in predicting delivery within 48 hours and within 7 days. However, FBM at 48 hours showed significantly better specificity than fFN and TVS (P = .024 and P = .046 for fFN and TVS, respectively). Similarly, the specificity of FBM at 7 days was significantly greater than that of the other 2 modalities (P < .0001 and P = .0005 for fFN and TVS, respectively).


We plotted hierarchical SROC curves to graphically present the results of all 3 testing modalities. Figure 2, A-D, representing fFN, FBM, and TVS (select cutoffs and ≤15 mm), shows the ROC space with the fitted SROC curve and summary accuracy estimates along with the corresponding 95% confidence and prediction ellipses. In hierarchical SROC curves, the index test’s sensitivity (true positive rate) was plotted on the y axis against 1-specificity (false positive rate) on the x axis. In addition, the 95% confidence region and a 95% prediction region around the pooled estimates were plotted to illustrate the precision with which the pooled values were estimated (confidence ellipse of a mean) and to show the amount of between-study variation (prediction ellipse; the likely range of values for a new study). The area under the curve (AUC) and 95% CI for delivery within 48 hours and within 7 days were also calculated and the results are shown in Table 1. The AUC for FBM was 0.91 (95% CI, 0.88–0.93) for delivery within 7 days. There were no differences in the overall accuracy of the 3 tests.




Figure 2


HSROC curves for the testing modalities

A , HSROC curve of fFN delivery within 7 days. AUC, 0.84 and 95% CI, 0.80–0.87. B , HSROC curve of FBM delivery within 7 days. AUC, 0.91 and 95% CI, 0.88–0.93. C , HSROC curve of TVS (select cutoff by each author) delivery within 7 days. AUC, 0.88 and 95% CI, 0.70–0.96. D , HSROC curve of TVS (15 mm cutoff) delivery within 7 days. AUC, 0.91 and 95% CI, 0.67–0.98.

AUC , area under the curve; CI , confidence interval; FBM , fetal breathing movements; fFN , fetal fibronectin; HSROC , hierarchical summary receiver operator characteristic; TVS , transvaginal sonographic cervical length measurements.



Pre- and posttest probabilities for the prediction of delivery within 48 hours and/or within 7 days were reported for cervicovaginal fFN, absence of FBM and TVS CL measurement using 1 common pretest probability of delivery within 48 hours (10%) and within 7 days (20%), and are represented in Table 3 . Because of low negative LRs, presence of FBM and normal CL are associated with substantial decreases in posttest probabilities. For example, the presence of FBM is associated with a reduction in the probability of delivery within 48 hours from 10% to a posttest probability of 3%. Similarly, the absence of FBM is associated with an increase in the probability from 20% to a posttest probability of 89% for delivery within 7 days.
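The conversion from pretest to posttest probability follows from Bayes' theorem in odds form: posttest odds equal pretest odds multiplied by the likelihood ratio. The Python sketch below applies it to the pooled likelihood ratios for delivery within 48 hours; the function and variable names are hypothetical. Because the inputs are the rounded likelihood ratios as published, results may differ from the tabulated percentages by about 1 percentage point.

```python
def posttest_probability(pretest: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a posttest probability.

    posttest odds = pretest odds x likelihood ratio
    """
    pretest_odds = pretest / (1 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Pooled (LR+, LR-) for delivery within 48 hours, as reported in the text.
LIKELIHOOD_RATIOS_48H = {
    "fFN": (3.3, 0.47),
    "FBM": (10.4, 0.27),
    "TVS CL": (6.4, 0.26),
}

PRETEST_48H = 0.10  # common pretest probability used for the 48 hour outcome

for test_name, (lr_pos, lr_neg) in LIKELIHOOD_RATIOS_48H.items():
    after_positive = posttest_probability(PRETEST_48H, lr_pos)
    after_negative = posttest_probability(PRETEST_48H, lr_neg)
    print(f"{test_name}: positive {after_positive:.0%}, negative {after_negative:.0%}")
```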



Table 3

Posttest probabilities based on common pretest probabilities

| Variable | LR+ (95% CI) | LR− (95% CI) | Probability before test | Probability after positive test | Probability after negative test |
|---|---|---|---|---|---|
| Delivery within 48 hours: pretest probability (10%) | | | | | |
| fFN | 3.3 (2.1–5.0) | 0.47 (0.29–0.76) | 10% | 27% | 5% |
| FBM | 10.4 (2.8–38.4) | 0.27 (0.15–0.49) | 10% | 54% | 3% |
| TVS CL^a | 6.4 (4.7–8.7) | 0.26 (0.12–0.58) | 10% | 42% | 3% |
| Delivery within 7 days: pretest probability (20%) | | | | | |
| fFN | 3.6 (3.1–4.3) | 0.31 (0.25–0.39) | 20% | 48% | 7% |
| FBM | 31.6 (4.1–244) | 0.34 (0.18–0.64) | 20% | 89% | 8% |
| TVS CL^a | 6.8 (5.1–9.2) | 0.29 (0.17–0.49) | 20% | 63% | 7% |

CI, confidence interval; CL, cervical length; FBM, fetal breathing movements; fFN, fetal fibronectin; LR, likelihood ratio; TVS, transvaginal sonographic.

^a 15 mm cutoff.



Metaregression analysis for fFN found that test accuracy was affected by prevalence of delivery within 7 days of testing, study quality score, year of publication, and blinding, with blinded studies observing greater test accuracy than unblinded studies. The analysis also found that studies published after 2002 were less accurate than studies published before 2002 (RDOR, 0.37; 95% CI, 0.19–0.69). Metaregression analysis for the TVS CL studies found that test accuracy was affected by the type of study, with prospective studies finding greater test accuracy than retrospective studies (RDOR, 6.28; 95% CI, 1.65–23.92). Visual inspection of the funnel plots for the 3 testing modalities suggested evidence of small study effects for fFN potentially resulting from publication bias, an observation also supported by the results of Deeks’ funnel plot asymmetry test (P = .05).





