Background
Preterm birth (PTB) (< 37 completed weeks’ gestation) is a pathological outcome of pregnancy and a major global health problem. Babies born preterm have an elevated risk for long-term adverse medical and neurodevelopmental sequelae. Substantial evidence implicates intrauterine infection and/or inflammation in PTB. However, these are often relatively late findings in the process, when PTB is inevitable. Identification of earlier markers of PTB may make successful intervention possible. Although select proteins, notably those related to the inflammatory pathways, have been associated with PTB, there has been a lack of research into the role of other protein pathways in the development of PTB. The purpose of this study was to investigate, using a previously described biomarker discovery approach, a subset of circulating proteins and their association with PTB focusing on samples from early pregnancy.
Objectives
The objectives of the study were as follows: (1) to perform a large-scale biomarker discovery, utilizing an innovative platform to identify proteins associated with preterm birth in plasma taken between 10 and 15 weeks’ gestation, and (2) to determine which protein pathways are most strongly associated with preterm birth. To address these aims, we measured 1129 proteins in a plasma sample from early pregnancy using a multiplexed aptamer-based proteomic technology developed in Colorado by SomaLogic.
Study Design
Using a nested case-control approach, we measured proteins at a single time point in early pregnancy in 41 women who subsequently delivered preterm and 88 women who had term uncomplicated deliveries. We measured 1129 proteins using a multiplexed aptamer–based proteomic technology developed by SomaLogic. Logistic regressions and random forests were used to compare protein levels.
Results
The complement factors B and H and the coagulation factors IX and IX ab were the highest-ranking proteins distinguishing cases of preterm birth from term controls. The top 3 pathways associated with preterm birth were the complement cascade, the immune system, and the clotting cascade.
Conclusion
Using a discovery approach, these data provide further confirmation that there is an association of immune- and coagulation-related events in early pregnancy with preterm birth. Thus, plasma protein profiles at 10–15 weeks of gestation are related to the development of preterm birth later in pregnancy.
Preterm birth (PTB) (< 37 completed weeks’ gestation) is a pathological outcome of pregnancy and a major global health problem. Babies born preterm have an elevated risk for long-term adverse medical and neurodevelopmental sequelae. The pathological mechanisms leading to PTB are complex with multiple pathways involved.
Substantial evidence implicates intrauterine infection and/or inflammation in PTB. However, these are often relatively late findings in the process, when PTB is inevitable. Identification of earlier markers of PTB may make successful intervention possible. Although select proteins, notably those related to the inflammatory pathways, have been associated with PTB, there has been a lack of research into the role of other protein pathways in the development of PTB.
The purpose of this study was to investigate, using a previously described biomarker discovery approach, a subset of circulating proteins and their association with PTB, focusing on samples from early pregnancy. The goals of this study, using a broad-based proteomic screening platform, were as follows: (1) to identify top-ranked proteins associated with PTB in early pregnancy (10–15 weeks’ gestation) and (2) to determine the top-ranked pathways associated with PTB. To address these aims, we measured 1129 proteins in a plasma sample from early pregnancy using a multiplexed aptamer-based proteomic technology developed by SomaLogic Inc (Boulder, CO).
Material and Methods
This is an analysis of data and samples collected as part of the Denver Complement Study. This prospective cohort study was approved by the Colorado Multiple Institutional Review Board. In brief, 1287 women were recruited in the first half of pregnancy from the University of Colorado Hospital prenatal clinics and 2 affiliated sites. Informed consent was obtained, and an additional EDTA-plasma tube was obtained with the routine prenatal laboratory tests. Data were gathered on the maternal medical and obstetrical history. After delivery, outcome data were collected, and the gestational age at blood draw was assigned based on the best overall obstetrical estimate, incorporating the assessment at the first visit and, in the great majority, an early ultrasound examination.
We conducted a nested case-control study within this longitudinal cohort. We made an a priori decision to concentrate the analysis on women who had a blood sample taken between 10 and 15 weeks’ gestation (n = 856) to reduce the potential confounding effect of gestational age at blood draw and to consider early predictors of PTB. For women who had more than 1 delivery during the study period, only the first delivery was included in the analysis (6 records deleted).
The following records were also removed: multiple births (n = 34), congenital or chromosomal anomalies or deliveries less than 20 weeks’ gestation (n = 24), loss to follow-up (n = 32), chronic medical disease (cardiac disease, chronic hypertension, coagulation disorders, uterine anomalies, type 1 diabetes, and autoimmune disease, n = 111), consent not provided to research outside of the primary study (n = 22), and deviation in study protocol (missing or inadequate sample, n = 24).
Following these exclusions, 603 records remained in the analytic data set (41 cases of PTB and 562 term controls). Because we were interested in the pathogenic mechanisms of PTB, we excluded from the term deliveries records of patients with preeclampsia, gestational hypertension, disorders of placentation, intrauterine growth restriction, and induction of labor or cesarean delivery (n = 474). Therefore, all term controls had an uncomplicated spontaneous vaginal delivery. The exclusion of these comorbidities allowed a more systematic comparison with PTB, thereby increasing power to find differences attributed to PTB without being obscured by additional complications. Ultimately, 41 cases of PTB and 88 term controls were included in the analytic data set.
The primary outcome was PTB (between 20 and less than 37 completed weeks of gestation) resulting from spontaneous PTB (spontaneous preterm labor or preterm premature rupture of the membranes [n = 22]) or from a medical indication (n = 19). Ten of the medically indicated PTBs resulted from hypertensive diseases of pregnancy, 5 from placental problems, and 4 from intrauterine growth restriction.
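The participant flow described above can be checked with simple arithmetic. The following is a minimal sketch that uses only the counts stated in the text; the variable names and dictionary labels are illustrative and are not taken from the study's data files.

```python
# Counts are copied from the text; names and labels are illustrative only.
sampled_10_to_15_weeks = 856
repeat_deliveries_removed = 6

exclusions = {
    "multiple births": 34,
    "anomalies or delivery < 20 weeks": 24,
    "loss to follow-up": 32,
    "chronic medical disease": 111,
    "no consent for research outside primary study": 22,
    "protocol deviation (missing or inadequate sample)": 24,
}

# Records remaining after all record-level exclusions.
analytic = (sampled_10_to_15_weeks
            - repeat_deliveries_removed
            - sum(exclusions.values()))

ptb_cases = 41
term_before_restriction = analytic - ptb_cases

# Term deliveries with complications, induction, or cesarean delivery.
complicated_term_removed = 474
term_controls = term_before_restriction - complicated_term_removed

# Breakdown of the primary outcome.
spontaneous_ptb, indicated_ptb = 22, 19
indicated_by_cause = {
    "hypertensive disease": 10,
    "placental problems": 5,
    "growth restriction": 4,
}

print(analytic, term_before_restriction, term_controls)  # 603 562 88
```

The totals reproduce the 603 records, 562 term deliveries, and 88 term controls reported in the text, and the subtype counts sum to the 41 cases.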
Sample preparation
Each sample was centrifuged within an average time of 9 minutes from phlebotomy (median, 6 minutes; range, 1–38 minutes). The supernatant was removed, aliquoted, placed on dry ice, and placed in a freezer at –80 °C. All samples had 1 freeze-thaw cycle prior to proteomic analysis in this study.
SomaLogic technology
The SOMAscan proteomic assay is supported by a new generation of protein-capture slow off-rate modified aptamer (SOMAmer) reagents. SOMAmers bind to preselected targets, including proteins and peptides, with high affinity and specificity. The SOMAscan multiplex assay consists of 1129 individual affinity molecules, the SOMAmer reagents (described in detail elsewhere). In brief, a biological sample in each well of a 96-well plate is incubated with a mixture of the 1129 SOMAmer reagents. Two sequential bead-based immobilization and washing steps eliminate unbound or nonspecifically bound proteins and the unbound SOMAmer reagents, leaving only protein target-bound SOMAmer reagents. These remaining SOMAmer reagents are isolated, and each reagent is quantified simultaneously on a custom Agilent hybridization array (Agilent, Santa Clara, CA). The amount of each SOMAmer measured is quantitatively proportional to the protein concentration in the original sample.
Statistical analysis
Descriptive statistics were calculated and compared across groups using t tests or χ² tests. Concentrations for each of the 1129 proteins were log (base 2) transformed and compared between PTB and term groups using logistic regression. P values were adjusted for multiple comparisons using the false discovery rate. Proteins that were significant in this analysis were evaluated across the 3 subgroups (term, medically indicated PTB, and spontaneous PTB) using an analysis of variance. Random forests consisting of 5000 classification trees were used to evaluate proteins multivariately; the mean decrease in the Gini index was used to rank proteins, and the out-of-bag error rate was used to describe classification error. Proteins were classified into pathways using Reactome, and pathways were ranked using the Fisher combined probability test, which assesses the association for the pathway by combining test statistics from the individual proteins contained within that pathway.
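The text does not state which false discovery rate procedure was used. As an illustration only, the following minimal sketch assumes the Benjamini-Hochberg step-up procedure, which is a common choice; the function name and the toy P values are hypothetical and are not study data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: the study's 'false discovery rate' adjustment is of this
    type; the source does not name the exact procedure.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up P values for 5 hypothetical proteins (not study data).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.60]
toy_adjusted = benjamini_hochberg(toy_p)
for raw, adj in zip(toy_p, toy_adjusted):
    print(f"unadjusted {raw:.3f} -> adjusted {adj:.3f}")
```

Under this procedure each adjusted value scales with the number of tests divided by the rank of the P value, which is why, with 1129 proteins tested, only very small unadjusted P values remain near .05 after adjustment.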
Results
Baseline statistics on select variables (maternal age, race, parity, and gestational age at the first prenatal visit) from the 41 cases of preterm birth (22 spontaneous and 19 medically indicated) and 88 uncomplicated term deliveries from the nested case-control study show comparable demographics between groups, with the exception of race (Table 1).
Characteristics | Term deliveries (n = 88) | All PTB (n = 41) | P value |
---|---|---|---|
Mean maternal age (SD), y | 34.2 (4.0) | 33.7 (7.2) | .60 |
Mean gestational age at blood draw (SD), wk | 12.4 (1.0) | 12.3 (1.1) | .62 |
Race/ethnicity, n (%) | | | < .01 |
Non-Hispanic white | 78 (89%) | 24 (59%) | |
Hispanic | 7 (8%) | 12 (29%) | |
African American | 1 (1%) | 3 (7%) | |
Asian and other race/ethnicities | 2 (2%) | 2 (5%) | |
Nulliparous, n (%) | 43 (49%) | 16 (39%) | .30 |
Protein levels measured with the SOMAmer reagents were compared between women with PTB and women with term delivery. The significance (−log10 P value) and the magnitude (odds ratio [OR]) for all the SOMAmer proteins are displayed graphically (Figure 1). Proteins with the largest magnitude of difference between groups and the smallest P values are named in the figure and appear in the upper-left and upper-right corners of the plot. Several proteins related to the clotting pathway (coagulation factors IX and IX ab) and the complement pathway (factor B, factor H, and ficolin-3) were higher in PTB cases compared with controls.
In addition, we found associations of platelet endothelial cell adhesion molecule-1 (PECAM-1), serum amyloid P-component (SAP), vascular endothelial growth factor receptor-2 (VEGF sR2), cathepsin Z, and growth hormone receptor (GHR) with PTB. Other proteins, such as protein jagged-1 (JAG1), matrix metalloproteinase-2 (MMP-2), neurogenic locus notch homolog protein 3 (Notch-3), and angiopoietin-2, were lower in the cases compared with the controls.
Table 2 shows the magnitude of the association (OR with 95% confidence interval) of the top-ranked proteins (n = 34) with PTB, along with the unadjusted P value and the P value adjusted for multiple comparisons. Coagulation factors IX and IX ab, SAP, and leptin remained significantly related to PTB following adjustment for multiple comparisons. Factor B, GHR, and insulin-like growth factor-binding protein-1 (IGFBP-1) were of borderline significance after adjustment for multiple comparisons.
UniProt ID c | Target | OR | 95% CI, lower | 95% CI, upper | Unadjusted P value | FDR P value b |
---|---|---|---|---|---|---|
P00740 | Coagulation factor IX | 280.7 | 17.04 | 4623 | .00 | .05 |
P00740 | Coagulation factor IX ab | 158.4 | 11.41 | 2201 | .00 | .05 |
P00751 | Factor B | 40.66 | 5.59 | 295.8 | .00 | .06 |
P16284 | PECAM-1 | 40.41 | 2.89 | 564.7 | .01 | .29 |
P08603 | Factor H | 26.16 | 2.96 | 231.1 | .00 | .26 |
P02743 | SAP | 17.81 | 4.08 | 77.78 | .00 | .05 |
P35968 | VEGF sR2 | 10.51 | 1.81 | 61.01 | .01 | .31 |
Q9UBR2 | CATZ | 6.61 | 1.76 | 24.84 | .01 | .29 |
P10912 | GHR | 6.42 | 2.23 | 18.47 | .00 | .09 |
O75636 | Ficolin-3 | 5.64 | 1.63 | 19.52 | .01 | .29 |
Q8TE58 | ATS15 | 4.93 | 1.89 | 12.86 | .00 | .15 |
P22223 | P-cadherin | 4.80 | 1.57 | 14.73 | .01 | .29 |
P36507 | MP2K2 | 3.82 | 1.39 | 10.52 | .01 | .32 |
P10619 | Cathepsin A | 3.47 | 1.55 | 7.77 | .00 | .23 |
P41159 | Leptin | 3.30 | 1.83 | 5.93 | .00 | .05 |
P01031 | C5a | 3.06 | 1.43 | 6.53 | .00 | .28 |
P01833 | PIGR | 2.98 | 1.44 | 6.17 | .00 | .26 |
P63098 | Calcineurin Ba | 2.94 | 1.36 | 6.37 | .01 | .29 |
O00408 | cGMP-stimulated PDE | 2.77 | 1.29 | 5.95 | .01 | .31 |
O60603 | TLR2 | 2.65 | 1.30 | 5.41 | .01 | .29 |
Q9UK85 | Soggy-1 | 2.46 | 1.35 | 4.50 | .00 | .26 |
P07949 | RET | 2.44 | 1.43 | 4.19 | .00 | .15 |
P78504 | JAG1 | 0.05 | 0.01 | 0.46 | .01 | .31 |
P08253 | MMP-2 | 0.06 | 0.01 | 0.50 | .01 | .31 |
Q9UM47 | Notch-3 | 0.12 | 0.03 | 0.55 | .01 | .29 |
O15123 | Angiopoietin-2 | 0.29 | 0.13 | 0.64 | .00 | .23 |
Q16644 | MAPKAPK3 | 0.33 | 0.14 | 0.73 | .01 | .29 |
Q13219 | PAPP-A | 0.39 | 0.21 | 0.75 | .00 | .29 |
P02751 | FN1.3 | 0.40 | 0.21 | 0.76 | .01 | .29 |
P18065 | IGFBP-2 | 0.40 | 0.21 | 0.78 | .01 | .29 |
P08833 | IGFBP-1 | 0.41 | 0.25 | 0.66 | .00 | .06 |
P02751 | Fibronectin | 0.41 | 0.23 | 0.72 | .00 | .20 |
P45985 | MP2K4 | 0.41 | 0.21 | 0.79 | .01 | .30 |
P02790 | Hemopexin | 0.47 | 0.27 | 0.80 | .01 | .29 |
a This table includes only the proteins significantly related to PTB from the univariate analysis
b Adjusted for multiple comparisons
c UniProt ID: the protein-specific identifier from the Universal Protein Resource (UniProt), a well-curated, centralized, and freely accessible database.
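Because concentrations were log (base 2) transformed, each OR in Table 2 presumably corresponds to a doubling of the protein concentration. The unadjusted P values are rounded to 2 decimals, so several appear as .00. The following minimal sketch shows how an approximate test statistic can be recovered from a reported OR and its 95% confidence interval. It assumes a Wald-type interval on the log-odds scale, which the text does not confirm; the function name is hypothetical, and the result indicates only an order of magnitude, not the study's exact P value.

```python
import math


def wald_from_or(odds_ratio, ci_lower, ci_upper, z_crit=1.96):
    """Approximate log-odds coefficient, SE, z, and two-sided P value.

    Assumption: the 95% CI is a Wald interval, exp(beta +/- 1.96 * SE).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)
    z = beta / se
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# Coagulation factor IX row of Table 2: OR 280.7, 95% CI 17.04 to 4623.
beta, se, z, p = wald_from_or(280.7, 17.04, 4623)
print(f"beta = {beta:.2f}, SE = {se:.2f}, z = {z:.2f}, P = {p:.1e}")
```

For this row the interval is nearly symmetric around the log OR, which is consistent with the Wald assumption. The very wide interval reflects the small number of cases relative to the size of the estimated effect.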
The results from the random forest reiterate those from the univariate logistic regressions: coagulation factors IX and IX ab, factor B, and factor H were the highest-ranking proteins distinguishing cases of PTB from term controls (Figure 2). The random forest had a classification error rate (the proportion of women misclassified as PTB or full term based only on their protein profiles) of 31%, which was reduced to 26.4% (17% for full term and 46% for PTB) after dimension reduction.
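The overall error rate after dimension reduction is the size-weighted combination of the two class-specific error rates. The following minimal sketch checks this with the group sizes from the text; the misclassified counts are inferred by rounding the reported percentages and are not stated in the source.

```python
# Group sizes and class-specific error rates as reported in the text.
n_term, n_ptb = 88, 41
err_term, err_ptb = 0.17, 0.46

# Inferred counts (an assumption): nearest whole number of women.
misclassified_term = round(err_term * n_term)
misclassified_ptb = round(err_ptb * n_ptb)

overall_error = (misclassified_term + misclassified_ptb) / (n_term + n_ptb)
print(misclassified_term, misclassified_ptb, f"{overall_error:.1%}")
```

Under this rounding, about 15 of 88 term deliveries and 19 of 41 PTB cases are misclassified, which gives the reported overall rate of 26.4%. The asymmetry shows that the forest classified term deliveries considerably more accurately than PTB cases.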
We show in Table 3 the pathways that are potentially involved in PTB. Proteins can be classified into multiple pathways, and pathways are hierarchical, resulting in pathways that are not mutually exclusive. Smaller, more specific pathways are nested into larger, more general pathways (see the legend for Table 3). In this interactive context, the complement system ranked highest, followed by the immune system and the clotting cascade.
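The pathway ranking relies on the Fisher combined probability test named in the statistical analysis. The following is a minimal sketch of the standard form of that test, which combines the P values of the k proteins in a pathway into a chi-square statistic with 2k degrees of freedom. The function name and the toy P values are hypothetical, and the study's exact implementation may differ.

```python
import math


def fisher_combined(p_values):
    """Fisher combined probability test for independent P values.

    Returns the statistic X = -2 * sum(ln p) and its P value from a
    chi-square distribution with 2k degrees of freedom. All P values
    must be greater than 0.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Closed-form chi-square survival function for even degrees of freedom:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Made-up P values for 3 proteins in a hypothetical pathway.
stat, combined_p = fisher_combined([0.01, 0.02, 0.03])
print(f"X = {stat:.2f}, combined P = {combined_p:.1e}")
```

The test assumes independent P values. Because proteins within a pathway are likely correlated and pathways are nested, the combined values are better read as a ranking than as exact significance levels.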