Objective
We sought to identify serum markers of subsequent spontaneous preterm birth (SPTB) in asymptomatic women prior to labor.
Study design
Serum proteomics was applied to sera from 80 pregnant women sampled at 24 weeks and an additional 80 pregnant women sampled at 28 weeks. Half had uncomplicated pregnancies and half had SPTB.
Results
Three specific peptides arising from inter-alpha-trypsin inhibitor heavy chain 4 protein were significantly reduced in women at 24 and 28 weeks having subsequent SPTB. The most discriminating peptide had a sensitivity of 65.0% and specificity of 82.5%; odds ratio, 8.8; and 95% confidence interval, 3.1–24.8. A combination of the 3 new biomarkers and 6 previously studied biomarkers increased sensitivity to 86.5%, with a specificity of 80.6% at 28 weeks.
Conclusion
Three novel serum markers of SPTB have been identified using serum proteomics. Using a combination of these new markers with additional markers, women at risk of SPTB can be identified weeks prior to SPTB.
Spontaneous preterm birth (SPTB) is the leading cause of perinatal morbidity and mortality in the United States. Despite the magnitude of the problem and the substantial research efforts of many investigators, completely efficacious therapies for the treatment or prevention of SPTB have yet to be developed. Indeed, the rate of SPTB has not changed in decades. A major obstacle to the development of an effective treatment for preterm labor is a limited understanding of the molecular events required to initiate and maintain term and preterm labor.
Several proteins present in maternal serum or cervical secretions have been proposed as markers that may predict SPTB. We have previously evaluated a large number of potential markers in a prospectively collected cohort and shown that a screening test consisting of 3 serum markers (corticotropin releasing hormone, alpha fetoprotein, alkaline phosphatase) and 2 cervical secretion markers (fetal fibronectin and ferritin) provided increased sensitivity, specificity, and odds ratio (OR). However, none of the current SPTB markers alone or in combination provides adequate specificity or sensitivity to be used in clinical prediction.
Recent advances in technology allow for the evaluation of a large, unbiased portion of the complement of peptides and/or proteins present in maternal serum. Serum proteomic analysis, consisting of chromatographic separation followed by mass spectrometry to identify peptides and proteins by mass, can provide an extensive inventory of peptides and/or proteins present at any given time. Previous studies have attempted to use proteomic patterns to identify patients with early ovarian, breast, and prostate cancers. Proteomic analysis to identify phenotypic molecular characteristics of women who experience SPTB or infection has been attempted in amniotic fluid and cervical secretions, but serum proteomic analysis has not been reported.
SPTB is well suited for a proteomic approach given likely serologic changes that precede its clinical manifestations by weeks. We hypothesize that proteomic differences exist in maternal serum several weeks prior to the onset of clinical symptoms in women destined to develop SPTB. Our aim was to use serum proteomics to differentiate women having a subsequent SPTB from those having term deliveries. Moreover, we hoped to identify all peptides that are found to be increased or decreased in the serum of women who go on to have an SPTB compared with those who deliver at term.
Materials and methods
Patient population
This study represents a nested case-control study that used samples and data that were collected during the National Institute of Child Health and Human Development Maternal-Fetal Medicine Units Network Preterm Prediction Study. The Preterm Prediction Study, conducted from 1992 through 1994, was a multicenter observational investigation of 2929 symptom-free women evaluated longitudinally to determine risk factors for SPTB. Women were enrolled in this study without regard to specific risk factors for SPTB. Extensive information and/or biologic specimens were collected at each of 4 study visits, beginning at approximately 22-24 weeks’ gestation and occurring at 2-week intervals. The overall study population and the methods used in the Preterm Prediction Study have been previously described in detail. Gestational age was based on the last menstrual period if the last menstrual period–derived gestational age was confirmed within 10 days by the earliest ultrasonographic evaluation. An SPTB was defined as a preterm birth (PTB) <35 weeks’ gestation occurring as the result of the spontaneous onset of labor or spontaneous rupture of membranes.
Serum was collected at 24 and 28 weeks’ gestation and pregnancy outcomes were obtained. Participating women provided voluntary, informed consent. The original study protocols as well as these secondary analyses were approved by the representative institutional review boards. For this study, serum from 40 subjects who experienced a subsequent SPTB and 40 subjects having uncomplicated pregnancies was obtained at 24 weeks’ gestation and submitted to proteomic analysis. Additionally, serum from a separate group of 40 subjects who experienced a subsequent SPTB and 40 subjects having uncomplicated pregnancies who ultimately delivered at term after spontaneous onset of labor was obtained at a 28 weeks’ gestation visit and was likewise analyzed. Pregnancies complicated by indicated PTBs were excluded from both cases and controls. Cases and controls were randomly selected by the Maternal-Fetal Medicine Units Network independent statistical core to produce a representative group from among the cohort. Researchers were given 2 groups of subjects for evaluation but were blinded to case or control status of these groups during the proteomic analysis and data evaluation.
Specimen preparation
Blood was collected and rapidly processed to serum using a serum separator tube. Once obtained, the specimen was quickly frozen and maintained at −80°C until analysis. The specimens had not undergone repeated freeze-thaw cycles.
Given our interest in the low-molecular-weight proteome, high-molecular-weight (and typically uninformative) proteins were removed by acetonitrile precipitation following an established protocol.
In brief, specimens were thawed on ice and kept at 0°C until the initiation of precipitation. Two volumes of high performance liquid chromatography (HPLC) grade acetonitrile (400 μL) were added to 200 μL serum; the sample was vortexed vigorously and allowed to stand at room temperature for 30 minutes. Samples were centrifuged for 10 minutes at 13,400 g at room temperature. An aliquot of the supernatant (∼550 μL) was transferred to a microcentrifuge tube containing 300 μL HPLC grade water. The sample was vortexed and lyophilized to ∼200 μL in a vacuum centrifuge (Labconco CentriVap Concentrator, Labconco Corp, Kansas City, MO). Acetonitrile was completely removed. Supernatant protein concentration was determined using a microtiter plate protein assay performed according to manufacturer instructions (Bio-Rad, Hercules, CA). An aliquot of the same supernatant containing 4 μg of apparent total protein was transferred to a new microcentrifuge tube and lyophilized to a volume <20 μL. Lyophilized samples were brought to 20 μL with HPLC water and acidified by addition of 20 μL 88% formic acid; 0.50 μg of protein was loaded on the column. If not used immediately, the protein-depleted specimen was kept at −80°C; for the capillary liquid chromatography (cLC) step, the specimen was held at 4°C until introduced onto the column.
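The loading arithmetic in this step can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the supernatant concentration of 0.05 μg/μL is an invented example value (the text reports none), and it assumes the 0.50 μg load is drawn from the 40 μL acidified sample (20 μL reconstituted sample plus 20 μL formic acid).

```python
def aliquot_volume_ul(target_ug, conc_ug_per_ul):
    """Volume of supernatant that contains the target protein mass."""
    if conc_ug_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ug / conc_ug_per_ul


def injection_volume_ul(load_ug, total_ug, final_volume_ul):
    """Volume of the final sample that carries the desired column load."""
    conc = total_ug / final_volume_ul  # ug per uL in the acidified sample
    return load_ug / conc


# Hypothetical assay result: 0.05 ug/uL in the concentrated supernatant.
aliquot = aliquot_volume_ul(target_ug=4.0, conc_ug_per_ul=0.05)

# 4 ug in 20 uL water + 20 uL 88% formic acid = 40 uL total (assumed).
injection = injection_volume_ul(load_ug=0.50, total_ug=4.0, final_volume_ul=40.0)

print(f"aliquot for 4 ug: {aliquot:.1f} uL")
print(f"injection for 0.50 ug: {injection:.1f} uL")
```

Under these assumptions, the 0.50 μg load corresponds to one eighth of the prepared sample.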
Capillary liquid chromatography
The cLC was interfaced with a mass spectrometer, allowing for the continuous direct delivery of fractionated, protein-depleted serum to the mass detector. Additional details of the method have been published.
Electrospray-ionization, time-of-flight mass spectrometry (ESI-TOFMS)
Effluent from the cLC was directed into a QSTAR Pulsar i quadrupole orthogonal time-of-flight mass spectrometer through an IonSpray source (Applied Biosystems, Carlsbad, CA). Mass spectra were collected every second for m/z 500-2500 from 5- to 55-minute elution. Data collection and preliminary formatting were accomplished using Analyst QS software with BioAnalyst add-ons (Applied Biosystems). Specimens from cases and controls were analyzed together in random order.
To reduce data file size, each mass chromatogram was divided into ten 2-minute elution intervals. One reference peak, observable in all specimens, near the center of each interval, and that did not demonstrate differences in abundance between the 2 groups, was used to align time in that elution region. Of the 10 elution intervals, the first to be analyzed (and the only one reported here) was the second 2-minute window, chosen because more peptides were present. Almost all biological mass spectrometers have some ability to statistically compare groups of specimens for quantitative differences. However, that software was developed for comparisons of at most a few hundred species. Several attempts were made to use software to compare candidate peaks between cases and controls across all spectra. This was attempted using software from BioAnalyst (Applied Biosystems), Agilent (Agilent Technologies, Santa Clara, CA), Mass Finder (MassFinder 4, Hamburg, Germany), Nonlinear (Nonlinear Dynamics, Newcastle upon Tyne, UK), and others. All of the software experienced ≥1 of the following problems: artifactual peaks, artifactual peak differences, or loss of >50% of data. The problems appeared to be due to the observation of 4000-5000 peaks, many having very close m/z values, partial overlap of peaks, and many multiply charged ion envelopes. Consequently, the initial review of mass spectra was accomplished by overlaying 2-minute summary spectra from cases and controls, distinguished by color, and reviewing them visually.
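The windowing and time alignment described above can be sketched as follows. This is a minimal illustration, not the instrument software's procedure: the function names, the window start time, and the choice to build a summary spectrum by summing scan intensities are all assumptions, and real data would need m/z binning rather than exact dictionary keys.

```python
def window_index(t_min, start_min, width_min=2.0):
    """0-based index of the elution window that contains time t_min."""
    return int((t_min - start_min) // width_min)


def align_times(scan_times, ref_apex_time, target_time):
    """Shift scan times so the reference peak apex lands on target_time."""
    shift = target_time - ref_apex_time
    return [t + shift for t in scan_times]


def summary_spectrum(scans, start_min, width_min, index):
    """Sum intensities of all scans inside one window (assumed definition).

    scans is a list of (time_min, {mz: intensity}) pairs.
    """
    lo = start_min + index * width_min
    hi = lo + width_min
    total = {}
    for t, spectrum in scans:
        if lo <= t < hi:
            for mz, intensity in spectrum.items():
                total[mz] = total.get(mz, 0.0) + intensity
    return total


# Toy scans; the 5.0-minute start and all intensities are hypothetical.
scans = [
    (5.5, {677.0: 10.0}),
    (7.2, {677.0: 4.0, 857.0: 1.0}),
    (8.9, {677.0: 6.0}),
    (9.1, {677.0: 99.0}),
]
second_window = summary_spectrum(scans, start_min=5.0, width_min=2.0, index=1)
print(second_window)
```

Summary spectra built this way for a case and a control could then be overlaid in different colors for visual review.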
Evaluation of candidate markers
The candidate markers were not defined a priori but were identified only after visual inspection of the actual spectra generated in both cases and controls. Each candidate marker, appearing quantitatively different between groups, was further evaluated. See Figure 1 for an example of actual mass spectra representing apparent differences between cases and controls for a candidate marker. In addition, to reduce nonbiological variation, a second peak was also chosen. To be considered the reference peak, the peak had to elute in the same time interval, be present in each specimen, and have a mass-to-charge ratio very near the candidate peak, but the reference peak had to be consistently quantitatively comparable between cases and controls as determined initially by visual inspection and then by machine software extraction. This reference peak was then used to normalize the candidate peak of interest, correcting for variability in specimen processing, specimen loading, ionization efficiency, and instrument performance and allowing comparison across runs performed on different days. Thereafter, the candidate markers were “extracted” by the Analyst software to determine a quantitative peak height of both candidate and reference peaks in each specimen for all subjects.
Mass spectral data analysis
The abundance (peak height) of each candidate and its relevant reference peak was quantified (extracted) by the instrument’s software and tabulated, as was the calculated ratio of each candidate marker abundance relative to the abundance of the reference within each patient. The log of that ratio was also determined because abundance varied substantially. The data were submitted to statistical analysis. Attempts at statistical analysis of data without visual preselection failed both when using the instrument software and when using other software for the analysis. This was likely due to the large number of peaks creating many overlapping envelopes, in particular the presence of multiply charged ions. Problems included overlooking 80% of data in some analyses and generating artifactual differences that did not exist when single peak analyses were carried out.
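The normalization described here reduces to a per-specimen ratio and its logarithm. The following is a minimal sketch; the function name and the peak heights are hypothetical, and base 10 is an assumption because the text does not state which logarithm was used.

```python
import math


def log_ratio(candidate_height, reference_height, base=10.0):
    """Log of candidate peak height relative to the reference peak height."""
    if candidate_height <= 0 or reference_height <= 0:
        raise ValueError("peak heights must be positive")
    return math.log(candidate_height / reference_height, base)


# Hypothetical peak heights (arbitrary units) for three specimens.
specimens = {
    "A": (120.0, 240.0),
    "B": (300.0, 300.0),
    "C": (45.0, 180.0),
}
for name, (candidate, reference) in specimens.items():
    print(name, round(log_ratio(candidate, reference), 3))
```

Because candidate and reference are measured in the same run, run-level factors such as loading and ionization efficiency largely cancel in the ratio, which is what allows comparison across days.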
MS-MS amino acid sequencing
Candidate markers demonstrating statistically different abundance between cases and controls were further analyzed in an effort to chemically identify the candidate molecule. The approach employed here has been described in detail elsewhere. Frozen supernatant (from the protein reduction step) was thawed and injected for tandem mass spectrometry (MS-MS) analysis of the ion of interest. The selected candidate marker was fragmented using nitrogen gas collision and the daughter fragments were recorded in the second MS detector. This collision fragment peak list was submitted to Mascot ( www.matrixscience.com ; Matrix Science, Boston, MA), a searchable MS database allowing protein/peptide identification. Amino acid sequences were also independently submitted to the short homologous or near homologous protein BLAST search available through the National Center for Biotechnology Information World Wide Web site as a confirmation.
Previous biomarker analysis
Plasma corticotropin-releasing factor, defensin, ferritin, lactoferrin, thrombin antithrombin complex, and tumor necrosis factor-α receptor type 1 have been previously measured by immunoassay and reported. Most of these assays were research assays developed in the laboratory of the relevant investigator. The results from those previous assays were reevaluated statistically for the subjects included in this study.
Statistical analysis
Data are expressed as means ± 1 SD for demographic measures and means ± 1 SE for biomarkers. Species that appeared to be quantitatively different were considered. Only 4 species were numerically and statistically evaluated. Comparisons of the abundance of a single species for the 2 study populations were carried out by the Wilcoxon rank sum test. Fisher’s exact test was used for categorical analysis. Comparisons were carried out for each candidate at both 24 and 28 weeks’ gestation. No corrections were made for multiple comparisons. In all, 23 additional serum markers were previously assayed in these samples as part of other studies, and comparisons of the abundance of each individual marker were calculated using the Wilcoxon rank sum test for the subset of subjects considered in this study. Logistic regression analyses were performed for the 3 novel biomarkers in combination with the best 6 of the previously tested markers. Classification performance of the combination was assessed by means of receiver operating characteristic curves. For all statistical tests, nominal 2-sided P values are reported with statistical significance defined as a P value < .05. Software (SAS 8.2; SAS Institute, Cary, NC) was used for these analyses.
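For readers unfamiliar with the Wilcoxon rank sum test, the sketch below shows one common form of it: midranks for ties and a normal approximation without continuity correction. It is not a reproduction of the SAS procedure used in the study, whose exact options are not stated; the function name and the sample values are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Wilcoxon rank sum test, normal approximation, 2-sided P value.

    Returns (W, z, p), where W is the sum of ranks in group x.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(combined)
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # 2-sided tail area
    return w, z, p


# Hypothetical log ratios for a reduced peptide.
cases = [-0.61, -0.48, -0.40, -0.35, -0.22]
controls = [-0.30, -0.12, -0.05, 0.03, 0.10]
w, z, p = rank_sum_test(cases, controls)
print(f"W={w}, z={z:.3f}, P={p:.4f}")
```

With only a handful of observations per group an exact test would be preferable; the normal approximation is more reasonable at the group sizes used in this study (40 per group).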
In calculating the sensitivity and specificity of the 3 peptide biomarkers, a threshold was established. The requirement was that the numeric threshold chosen provided at least 80% sensitivity. These thresholds at the 28-week gestation sampling, using the log ratio (biomarker/reference) data, were as follows: biomarker peptide m/z 677 = 0.00, biomarker peptide m/z 857 = −0.347, and biomarker peptide m/z = −0.222.
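How a threshold translates into sensitivity, specificity, and an odds ratio can be sketched as below. Several things here are assumptions rather than statements from the text: that a specimen is called positive when its log ratio falls below the threshold (the peptides were reduced in SPTB), that the confidence interval uses the Woolf (log odds) method, and that the abstract's figures for the most discriminating peptide refer to 40 cases and 40 controls, which gives back-calculated counts of 26 true positives and 33 true negatives. Function names are hypothetical.

```python
import math


def count_calls(case_values, control_values, threshold):
    """Count test results, calling values below threshold positive (assumed)."""
    tp = sum(v < threshold for v in case_values)
    fn = len(case_values) - tp
    fp = sum(v < threshold for v in control_values)
    tn = len(control_values) - fp
    return tp, fn, tn, fp


def diagnostic_summary(tp, fn, tn, fp, z=1.96):
    """Sensitivity, specificity, odds ratio, and Woolf 95% CI bounds."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    odds_ratio = (tp * tn) / (fn * fp)
    se_log_or = math.sqrt(1 / tp + 1 / fn + 1 / tn + 1 / fp)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return sensitivity, specificity, odds_ratio, lower, upper


# Back-calculated counts (assumption): 65.0% of 40 cases, 82.5% of 40 controls.
sens, spec, or_, lo, hi = diagnostic_summary(tp=26, fn=14, tn=33, fp=7)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
print(f"OR={or_:.1f} (95% CI {lo:.1f}-{hi:.1f})")
```

Under these assumptions the computed odds ratio and interval round to the values given in the abstract (8.8; 3.1 to 24.8), which suggests, but does not confirm, that a calculation of this kind was used.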