Background
Current cell-free DNA assessment of fetal chromosomes does not analyze and report on all chromosomes. Hence, a significant proportion of fetal chromosomal abnormalities are not detectable by current noninvasive methods. Here we report the clinical validation of a novel noninvasive prenatal test (NIPT) designed to detect genomewide gains and losses of chromosomal material ≥7 Mb and losses associated with specific deletions <7 Mb.
Objective
The objective of this study is to provide a clinical validation of the sensitivity and specificity of a novel NIPT for detection of genomewide abnormalities.
Study Design
This retrospective, blinded study included maternal plasma collected from 1222 study subjects with pregnancies at increased risk for fetal chromosomal abnormalities that were assessed for trisomy 21 (T21), trisomy 18 (T18), trisomy 13 (T13), sex chromosome aneuploidies (SCAs), fetal sex, genomewide copy number variants (CNVs) ≥7 Mb, and select deletions <7 Mb. Performance was assessed by comparing test results with findings from G-band karyotyping, microarray data, or high coverage sequencing.
Results
Clinical sensitivity within this study was determined to be 100% for T21 (95% confidence interval [CI], 94.6–100%), T18 (95% CI, 84.4–100%), T13 (95% CI, 74.7–100%), and SCAs (95% CI, 84–100%), and 97.7% for genomewide CNVs (95% CI, 86.2–99.9%). Clinical specificity within this study was determined to be 100% for T21 (95% CI, 99.6–100%), T18 (95% CI, 99.6–100%), and T13 (95% CI, 99.6–100%), and 99.9% for SCAs and CNVs (95% CI, 99.4–100% for both). Fetal sex classification had an accuracy of 99.6% (95% CI, 98.9–99.8%).
Conclusion
This study has demonstrated that genomewide NIPT for fetal chromosomal abnormalities can provide high resolution, sensitive, and specific detection of a wide range of subchromosomal and whole chromosomal abnormalities that were previously only detectable by invasive karyotype analysis. In some instances, this NIPT also provided additional clarification about the origin of genetic material that had not been identified by invasive karyotype analysis.
Introduction
Since their introduction in 2011, noninvasive prenatal tests (NIPT) have had a significant impact on prenatal care. In only 4 years, NIPT has evolved into a standard option for high-risk pregnancies. Content has also evolved from exclusive trisomy 21 (T21) testing to include trisomy 18 (T18), trisomy 13 (T13), sex chromosome aneuploidies (SCAs), and select microdeletions. This standard content can be expected to detect 80–83% of the chromosomal abnormalities detected by karyotyping in a general screening population; however, this leaves approximately 17–20% of other chromosomal and subchromosomal abnormalities undetected. Consequently, obtaining comprehensive information about the genetic makeup of the fetus requires an invasive procedure. To overcome these limitations, NIPT could be extended to cover the entire genome. However, it is challenging to maintain a very high specificity and positive predictive value (PPV) when interrogating all accessible regions in the genome. In previous reports, we have overcome these technical hurdles. Furthermore, a recent study by Yin et al demonstrated feasibility for noninvasive genomewide detection of subchromosomal abnormalities. In this report, we have improved the assay and statistical methods to enable comprehensive genomewide detection of copy number variants (CNVs) ≥7 Mb. We present results of a large blinded clinical study of >1200 samples, including >100 samples with common aneuploidies detectable by traditional NIPT and >30 samples affected by subchromosomal CNVs.
Materials and Methods
Study design
This blinded, retrospective clinical study included samples from women considered at increased risk for fetal aneuploidy based on advanced maternal age ≥35 years, a positive serum screen, an abnormal ultrasound finding, and/or a history of aneuploidy. Archived samples were selected for inclusion in the study by an unblinded internal third party according to the requirements documented in the study plan. Samples were then blind-coded to all operators and the analysts who processed the samples. After sequencing, an automated bioinformatics analysis was performed to detect whole chromosome aneuploidies and subchromosomal CNVs. Results were compiled electronically and were reviewed by a subject matter expert who assigned the final classification. This manual review mimics the process in the clinical laboratory, where cases are reviewed by a laboratory director before a result is signed out. Completed classification results were provided to the internal third party for determination of concordance. Analyzed samples had confirmation of positive or negative events by either G-band karyotype or microarray findings from samples collected through either chorionic villus sampling (CVS) or amniocentesis. Circulating cell-free “fetal” DNA is believed to originate largely from placental trophoblasts. Genetic differences between the fetus and the placenta can occur (eg, confined placental mosaicism), leading to discordance between NIPT results and cytogenetic studies on amniocytes or postnatally obtained samples. Results from CVS by chromosomal microarray were thus considered the most accurate ground truth. Therefore, discordant results originating from amniocytes (karyotype or microarray) were resolved by sequencing at high coverage (an average of 226 million reads per sample). 
Sequencing depth has been shown to be the limiting factor in NIPT methods, with increased depth allowing improved detection of events in samples with lower fetal fractions and improved detection of smaller events. High-coverage sequencing has been used in multiple studies to unambiguously identify subchromosomal events and was used here as a reference for performance evaluation in discrepant amniocentesis samples.
Details of the sample demographics are described in Table 1. Indications for invasive testing are described in Table 2.
Table 1. Sample demographics

Demographic | Median | Range |
---|---|---|
Maternal age, y (Ntotal = 1177) | 36.0 | 17.8–47 |
Gestational age, wk (Ntotal = 1183) | 17 | 8–38 |
Maternal weight, lb (Ntotal = 1168) | 150 | 93–366 |

Procedure | Percent | (Naffected/Ntotal) |
---|---|---|
CVS | 14.5 | (175/1203) |
Amniocentesis | 85.2 | (1025/1203) |
Both | 0.2 | (3/1203) |

Confirmation | Percent | (Naffected/Ntotal) |
---|---|---|
Karyotype | 90.4 | (1089/1205) |
Microarray | 5.8 | (70/1205) |
Both | 3.8 | (46/1205) |

Table 2. Indications for invasive testing

Indication | Euploid, n = 1009 | T21, n = 87 | T18, n = 29 | T13, n = 15 | SCA, n = 26 | CNV, n = 44 | Total, n = 1210 |
---|---|---|---|---|---|---|---|
Positive serum screening | 404 (40%) | 36 (41.4%) | 13 (44.8%) | 4 (26.7%) | 3 (11.5%) | 16 (36.4%) | 476 (39.3%) |
Maternal age >35 y | 418 (41.4%) | 28 (32.2%) | 10 (34.5%) | 4 (26.7%) | 2 (7.7%) | 10 (22.7%) | 472 (39%) |
Ultrasound abnormality | 152 (15.1%) | 34 (39.1%) | 6 (20.7%) | 10 (66.7%) | 16 (61.5%) | 22 (50%) | 239 (19.8%) |
Family history | 33 (3.3%) | 4 (4.6%) | 1 (3.4%) | 0 (0%) | 1 (3.8%) | 1 (2.3%) | 40 (3.3%) |
Not specified | 85 (8.4%) | 11 (12.6%) | 5 (17.2%) | 2 (13.3%) | 4 (15.4%) | 2 (4.5%) | 109 (9%) |
Sample collection
In total, 1222 maternal plasma samples were previously collected using 4 institutional review board (IRB)-approved protocols; a small subset (9 samples) comprised remnant plasma samples collected from previously consented patients in accordance with the Food and Drug Administration guidance on informed consent for in vitro diagnostic devices using leftover human specimens that are not individually identifiable. Samples from 2 of the protocols (Compass IRB no. 00508 and Western IRB no. 20120148) were collected from high-risk pregnant subjects prior to undergoing a confirmatory invasive procedure (1189 samples). Samples from the other 2 protocols (Compass IRB no. 00351 and Columbia University IRB no. AAAN9002) came from subjects who were enrolled in the studies after receiving the fetal karyotype and/or microarray results of a confirmatory invasive procedure (24 samples). All subjects provided written informed consent prior to undergoing any study-related procedures. A total of 5321 high-risk subjects had been recruited into the 4 clinical studies indicated above at the time of sample selection. To be eligible for inclusion in this study, subjects had to have met all protocol inclusion criteria and no exclusion criteria, have fetal outcome determined by karyotype and/or microarray, and have at least 1 plasma aliquot of ≥3.5 mL obtained from whole blood collected in a BCT tube (Streck, Omaha, NE). There was no sample selection preference based on high-risk indication. All subjects meeting these selection criteria with an abnormal fetal outcome as needed for the study were identified and pulled from inventory. These were then supplemented with randomly selected subjects with samples meeting the same selection criteria but with a normal karyotype to reach the total of 1222 samples for testing.
Library preparation, sequencing, and analytical methods
Libraries were prepared and quantified as described by Tynan et al. To reduce noise and increase signal, sequencing depth for this analysis was increased to target 32 million reads per sample. Sequencing reads were aligned to hg19 using Bowtie 2. The genome was then partitioned into 50-kbp nonoverlapping segments, and the total number of reads per segment was determined by counting the reads whose 5′ ends fell within that segment. Segments with high read count variability or low mappability were excluded. The 50-kbp read counts were then normalized to remove coverage and guanine/cytosine biases and other higher-order artifacts using the methods previously described in Zhao et al.
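The segment-counting step can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the tuple layout for alignments, the function names, and the `excluded` set are assumptions made for illustration, and normalization is omitted.

```python
from collections import Counter

BIN_SIZE = 50_000  # 50-kbp nonoverlapping segments, as stated in the text


def five_prime_position(start, end, strand):
    """Return the 0-based reference coordinate of a read's 5' end.

    start/end are 0-based, half-open alignment coordinates (an assumed
    convention); for reverse-strand reads the 5' end is the rightmost base.
    """
    return start if strand == "+" else end - 1


def count_reads_per_bin(alignments, bin_size=BIN_SIZE):
    """Count reads per (chromosome, bin index) by the location of the 5' end."""
    counts = Counter()
    for chrom, start, end, strand in alignments:
        pos = five_prime_position(start, end, strand)
        counts[(chrom, pos // bin_size)] += 1
    return counts


def drop_excluded_bins(counts, excluded):
    """Remove bins flagged elsewhere as highly variable or poorly mappable."""
    return {key: n for key, n in counts.items() if key not in excluded}


# Toy alignments: (chromosome, start, end, strand)
alignments = [
    ("chr1", 49_990, 50_026, "+"),    # 5' end at 49,990, bin 0
    ("chr1", 49_990, 50_026, "-"),    # 5' end at 50,025, bin 1
    ("chr1", 120_000, 120_036, "+"),  # bin 2
    ("chr2", 10, 46, "+"),            # bin 0 of chr2
]
counts = count_reads_per_bin(alignments)
kept = drop_excluded_bins(counts, excluded={("chr1", 2)})
```

The first two toy reads span the same bases but land in different segments, because only the 5′ end determines the assignment.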
The presence of fetal DNA was quantified using the regional counts of whole genome single-end sequencing data as described by Kim et al.
Genomewide detection of abnormalities
Circular binary segmentation (CBS) was used to identify CNVs throughout the entire genome by segmenting each chromosome into contiguous regions of equal copy number. A segment-merging algorithm was then used to compensate for oversegmentation by CBS when the signal-to-noise ratio was low. Z-scores were calculated for both CBS-identified CNVs and whole chromosome variants by comparing the signal amplitude with that of a reference set of samples in the same region. The measured Z-scores form part of an enhanced version of the Chromosomal Aberration Decision Tree previously described in detail in Zhao et al.
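As a rough illustration of the Z-score comparison, the sketch below scores one sample's signal in a region against reference samples in the same region. The text does not state which location and scale estimators were used, so the mean and sample standard deviation here are assumptions, as are the function name and the toy values.

```python
import statistics


def region_z_score(sample_value, reference_values):
    """Z-score of one sample's regional signal against a reference set.

    Assumption: mean and sample standard deviation are used as the
    location and scale estimates; the study may have used other estimators.
    """
    mean = statistics.mean(reference_values)
    sd = statistics.stdev(reference_values)
    return (sample_value - mean) / sd


# Toy fractions of reads falling in one region (hypothetical values)
reference = [0.0100, 0.0102, 0.0098, 0.0101, 0.0099]
z_gain = region_z_score(0.0110, reference)     # elevated signal
z_typical = region_z_score(0.0100, reference)  # signal at the reference mean
```

With these toy values, the elevated sample lies a little over 6 standard deviations above the reference mean, whereas the typical sample scores essentially zero.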
To further improve the specificity of CNV detection, bootstrap analysis was performed as an additional measure of confidence in the candidate CNVs. The within-sample read count variability was compared with that of a normal population (represented by 371 euploid samples) and quantified by a bootstrap confidence level. To assess within-sample variability, the bootstrap resampling described below was applied to every candidate CNV.
For each identified segment within the CNV, the median shift of segment fraction from the normal level across the chromosome was calculated. This median shift was then corrected to create a read count baseline for bootstrapping. Next, a bootstrapped segment of the same segment length as the candidate CNV was randomly sampled with replacement from the baseline read counts. The median shift was then applied to this bootstrapped fragment. The segment fraction of the bootstrapped fragment was calculated as follows:
This process was repeated 1000 times to generate a bootstrap distribution of segment fractions for an affected population. A normal reference distribution was created based on the segment fraction of the same location as the candidate CNV in 371 euploid samples. A threshold was then calculated as the segment fraction that was at least 3.95 median absolute deviations away from the median segment fraction of the reference distribution. Lastly, the bootstrap confidence level was calculated as the proportion of bootstrap segments whose fractions had absolute z-statistics above the significance threshold.
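The resampling and thresholding steps can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' implementation: the segment fraction is assumed to be the reads in the segment divided by the total reads, the median shift is assumed to be an additive per-segment read count offset, and the toy reference set holds 9 values rather than 371 euploid samples.

```python
import random
import statistics


def median_abs_deviation(values):
    """Median absolute deviation around the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def bootstrap_confidence(baseline_counts, n_bins, per_bin_shift, total_reads,
                         reference_fractions, n_boot=1000, n_mads=3.95, seed=0):
    """Proportion of bootstrap replicates lying beyond the reference threshold.

    baseline_counts: per-bin read counts with the median shift removed.
    n_bins: number of bins in the candidate CNV.
    per_bin_shift: shift re-applied to each resampled bin (assumed additive).
    """
    ref_median = statistics.median(reference_fractions)
    ref_mad = median_abs_deviation(reference_fractions)
    if ref_mad == 0:
        raise ValueError("reference distribution has zero spread")
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_boot):
        # Sample bins with replacement from the shift-corrected baseline.
        resampled = rng.choices(baseline_counts, k=n_bins)
        # Assumed definition: reads in the segment over total reads.
        fraction = (sum(resampled) + n_bins * per_bin_shift) / total_reads
        if abs(fraction - ref_median) / ref_mad > n_mads:
            hits += 1
    return hits / n_boot


# Toy data: a 7-Mb candidate spans 140 bins of 50 kbp each.
baseline = [990, 1005, 1010, 995, 1000, 1002, 998, 1003]
reference = [0.004375 + k * 1e-5 for k in (-2, -1, -1, 0, 0, 0, 1, 1, 2)]
gain_conf = bootstrap_confidence(baseline, 140, 50, 32_000_000, reference)
null_conf = bootstrap_confidence(baseline, 140, 0, 32_000_000, reference)
```

In this toy setting, a shift of 50 reads per bin pushes every replicate beyond the threshold (confidence 1.0), whereas a zero shift leaves every replicate inside it (confidence 0.0). Real data would typically give intermediate values for borderline candidates.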
A whole chromosome or subchromosomal abnormality is detected as described in Zhao et al.