The Technology and Bioinformatics of Cell-Free DNA-Based NIPT




Abstract


The primary challenge of cell-free DNA (“cfDNA”) based prenatal testing (commonly called noninvasive prenatal testing or NIPT) is to identify fetal chromosomal anomalies from maternal plasma samples, where maternal cfDNA molecules far outnumber their fetal counterparts. Further complicating this task is the fact that the relative amount of fetal versus maternal fragments (i.e., the fetal fraction) varies across pregnancies and gestational ages, reaching as high as 40% and as low as 1%. Therefore, to achieve maximal sensitivity and specificity, a cfDNA test’s analysis pipeline must determine both the fetal fraction (“FF”) of a sample and the likelihood of an aneuploid fetus. Several elegant molecular and bioinformatic strategies have emerged to infer FF and ploidy status reliably and at low cost, yielding a range of cfDNA test offerings that hold in common a remarkably high clinical sensitivity that has driven the rapid adoption of this screening modality. In this chapter, the analysis fundamentals are described for three cfDNA test platforms: whole-genome sequencing (WGS), single-nucleotide polymorphism (SNP), and microarray. The methodologies of aneuploidy identification, FF inference, and microdeletion detection are discussed in turn. These topics provide a basic understanding of how cell-free DNA-based prenatal testing works in routine pregnancies and lay the groundwork for a discussion at the end of the chapter about edge cases that will themselves become common as use of the screening grows.




Keywords

Whole-genome sequencing, Single-nucleotide polymorphism, Fetal fraction, z-score, Sex-chromosome aneuploidies

 




Acknowledgments


Carrie Haverty provided helpful comments on the manuscript.




Introduction


The primary challenge of cell-free DNA (“cfDNA”) based prenatal testing (commonly called noninvasive prenatal testing or NIPT) is to identify fetal chromosomal anomalies from maternal plasma samples, where maternal cfDNA molecules far outnumber their fetal counterparts. Further complicating this task is the fact that the relative amount of fetal versus maternal fragments (i.e., the fetal fraction) varies across pregnancies and gestational ages, reaching as high as 40% and as low as 1%. Therefore, to achieve maximal sensitivity and specificity, a cfDNA test’s analysis pipeline must determine both the fetal fraction (“FF”) of a sample and the likelihood of an aneuploid fetus. Several elegant molecular and bioinformatic strategies have emerged to infer FF and ploidy status reliably and at low cost, yielding a range of cfDNA test offerings that hold in common a remarkably high clinical sensitivity that has driven the rapid adoption of this screening modality. In this chapter, the analysis fundamentals are described for three cfDNA test platforms: whole-genome sequencing (WGS), single-nucleotide polymorphism (SNP), and microarray. The methodologies of aneuploidy identification, FF inference, and microdeletion detection are discussed in turn. These topics provide a basic understanding of how cell-free DNA-based prenatal testing works in routine pregnancies and lay the groundwork for a discussion at the end of the chapter about edge cases that will themselves become common as use of the screening grows.




Aneuploidy Identification


WGS-Based NIPT


To help guide understanding of the algorithm underlying whole-genome sequencing (WGS)-based NIPT analysis, it is useful first to consider the ultimate goal of NIPT: robust identification of fetal copy-number anomalies in large genomic regions at low FF. The “copy-number anomalies” include trisomies and monosomies in the fetus; “large genomic regions” include whole chromosomes and relatively smaller variants like microdeletions; and we stipulate “low FF” because if the test is sensitive at low FF, then it will surely work well at high FF. This goal provides a framework to evaluate the challenges the analysis algorithm must overcome. For instance, in a pregnancy with 2% FF in which the fetus has trisomy 21—chr21 comprises 1.7% of the genome—only about 1 in 3000 cfDNA fragments actually derives from the fetal copies of chr21 (i.e., (1.7% × 2%)^−1 ≈ 2900). Therefore, to detect a 50% increase in the number of fetal copies of chr21 (i.e., from disomic to trisomic) with high statistical confidence, it is not sufficient to sequence thousands or even hundreds of thousands of random cfDNA fragments. Indeed, millions of sequenced cfDNA fragments are required.
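To make the scale of the problem concrete, the following back-of-the-envelope calculation (a sketch in Python; the 20-million-read total is an illustrative assumption, not a property of any particular assay) compares the excess chr21 reads expected from a 2% FF trisomy against simple counting noise:

# Back-of-the-envelope check of why millions of reads are needed at low fetal
# fraction. Assumed, illustrative values: 2% FF, chr21 = 1.7% of the genome,
# and 20 million total mapped reads.

import math

ff = 0.02                 # fetal fraction
chr21_share = 0.017       # chr21 as a fraction of the genome
total_reads = 20_000_000  # hypothetical sequencing depth

# Fraction of all cfDNA fragments deriving from the fetal copies of chr21
fetal_chr21_fraction = ff * chr21_share
print(f"1 in {1 / fetal_chr21_fraction:.0f} fragments is fetal chr21")  # ~1 in 2900

# Expected chr21 reads under disomy vs trisomy; trisomy adds 50% to the fetal share
disomic_chr21 = total_reads * chr21_share
trisomic_chr21 = total_reads * chr21_share * (1 + 0.5 * ff)
excess = trisomic_chr21 - disomic_chr21      # ~3400 extra reads
counting_noise = math.sqrt(disomic_chr21)    # ~580 reads of Poisson noise
print(f"excess {excess:.0f} vs noise {counting_noise:.0f} "
      f"(ratio ~ {excess / counting_noise:.1f})")

Repeating the same calculation with only 100,000 total reads gives an excess of roughly 17 reads against roughly 41 reads of counting noise, which is why thousands or even hundreds of thousands of fragments are not enough.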


The first step of the WGS analysis pipeline is to align the millions of sequenced cfDNA fragments (“reads”) to the human reference genome. Off-the-shelf software packages can map millions of reads to the 3-billion-base human reference genome on the order of minutes. Considering the goal of detecting a low-FF T21 sample, the key question is whether the number of reads mapping to chr21 is higher than expected. A naive approach would be to compare the level of chr21 reads to the reads mapped to a known disomic chromosome: for example, assume that there are two fetal copies of chr1 and use the number of reads mapped to chr1 as an estimate for the number of reads mapped to a disomic chr21 (Fig. 1A). However, for an ordinary euploid sample, chr1 may have 1,000,000 reads while chr21 may have only 200,000, illustrating the folly of comparing read totals across chromosomes: chr1 trivially has ~ 10 × more reads than chr21 because it is ~ 10 × longer. Instead, for the comparisons of cfDNA abundance that underlie WGS-based analyses, it is best to use a length-agnostic measurement, such as the density of reads (Fig. 1B).




Fig. 1


Read density is preferable to total reads when comparing chromosome dosage. (A) Each box depicts a human chromosome with the width of the box proportional to the chromosome’s size. Even in the case of disomy of chr1 and chr21, the number of reads mapped to each chromosome can differ substantially. (B) Schematic of equal-size tiled bins across the genome. Bins are typically tens of kilobases in length, much smaller than shown. Comparisons of chromosome dosage are more straightforward after calculating bin density.


Read density in WGS-based NIPT algorithms is typically assessed by tiling each chromosome with nonoverlapping bins of equal size, counting the number of reads per bin, and averaging the reads per bin over a region of interest (e.g., a microdeletion or whole chromosome). This process of averaging over bins requires two important choices when implementing the algorithm: the bin size to use and the method for calculating an average.
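As a minimal sketch of the binning step (the function name, the 50-kb bin size, and the simulated read positions are illustrative assumptions), the following Python snippet tiles a chromosome with equal, nonoverlapping bins and counts reads per bin:

# A minimal sketch of binning: tile a chromosome with equal, nonoverlapping bins,
# count the reads whose start position falls in each bin, and summarize a region
# by an average (here, the median) bin count. Inputs are simulated for illustration.

import numpy as np

def bin_counts(read_starts, chrom_length, bin_size=50_000):
    """Count reads per equal-size bin; the partial trailing bin is dropped."""
    n_bins = chrom_length // bin_size
    positions = np.asarray(read_starts)
    positions = positions[positions < n_bins * bin_size]
    return np.bincount(positions // bin_size, minlength=n_bins)

# Toy usage: 500,000 uniformly placed reads on a roughly chr21-sized chromosome
rng = np.random.default_rng(0)
reads = rng.integers(0, 46_700_000, size=500_000)
counts = bin_counts(reads, 46_700_000)
print(counts.mean(), np.median(counts))  # average reads per bin over the region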


Bin size choice


There is no strictly optimal bin size, yet noteworthy factors inform the choice. One key factor in bin size choice is the total number of reads the sample receives, which was asserted above to be in the tens of millions. In general, total reads and bin size are inversely proportional, with deeper sequencing permitting smaller bins. Ideally, bins should be large enough to allow clear differentiation of aneuploidy-scale deviations in signal from large, spurious deviations that should be omitted from the dataset (e.g., biological events like maternal CNVs or analytical artifacts like alignment mistakes; Fig. 2). For instance, if bins are so small that the average bin count is only two reads, the discreteness of NGS reads means that it is not straightforward to interpret a bin with four reads as coming from a normal, aneuploid, or maternal-CNV-harboring chromosome. However, with larger bins that have 50 reads on average, it is much simpler to distinguish a deflection consistent with aneuploidy (e.g., 55 reads) from a maternal-CNV-caused deviation (e.g., 100 reads).




Fig. 2


Bin-size choice facilitates identification of outliers. (A) There will naturally be bin-to-bin variability in the number of reads observed per bin, and this variability in observed reads scales as the square root of the average. With an average of 50, the noise is ~ 7 reads per bin, making it straightforward—even at the level of a single bin—to observe and remove large outliers (B), e.g., caused by maternal CNVs.


While the discretization of individual reads described previously provides an upward pressure on bin size, the existence of localized, anomalous genomic regions exerts a downward pressure that supports the use of smaller bins. These anomalous regions could be bins or portions of bins that have stochastic, idiosyncratic, and/or highly skewed levels of mapped reads; even very high-quality samples contain such regions. Fortunately, such regions are rare across the entire genome, yet they deviate greatly from one of the key underlying assumptions of WGS, that is, that cfDNA fragments are uniformly sampled. In practice, the few anomalous bins are discarded from the analysis, demonstrating one reason why a large number of relatively small bins is beneficial. To see another reason, consider a 5-kb segment that is anomalous and rightfully caused its encompassing bin to be discarded. If bins were 20 kb, only 15 kb worth of nonanomalous signal would effectively be lost due to the anomaly. However, if bins were 100 kb, then 95 kb of valid sequence would be lost, illustrating the potentially substantial loss of signal resulting from using large bins. To see this argument from another perspective, suppose small anomalous regions occur every ~ 100 kb. If bin size were set to 200 kb, then nearly every bin would harbor signal-compromising anomalies that jeopardize the ability to accurately identify aneuploidy.


Consistent with the two preceding arguments—which exert upward and downward pressures on bin size, respectively—most WGS-based NIPT algorithms described in the literature use 50 kb bins.


Calculating an average bin value for a region and sources of error


Though it sounds straightforward to calculate an average of reads per bin across the many bins that tile a genomic region of interest, considerable algorithmic nuance is required to maximize the screen’s sensitivity and specificity given a sample’s sequenced cfDNA fragments. To appreciate the performance gains from special algorithmic care, recall that WGS-based NIPT is a sampling problem: the sequenced reads in each bin are simply a sampling of the fragments present in the maternal plasma, which are themselves only samples from apoptotic cells in the placenta and other parts of the body. As in any sampling problem (e.g., polling prospective voters before an election), in WGS-based NIPT there is a margin of error by which the observed average reads per bin differs from its underlying and typically unobservable true value. If the error in the calculated average reads per bin is too large, it causes false ploidy calls, thereby undercutting the goal of NIPT. The error can be represented as the standard error, σ/sqrt(N), where σ is the standard deviation of reads per bin, and N is the number of bins. As N is an immutable property of a chromosome and the selected bin size, it is the standard deviation of reads per bin (σ) that drives the error of the observed bin-count average. The aim, then, of a well-crafted WGS-based NIPT algorithm is to remove sources of bias that inflate σ such that it can descend to its theoretical minimum (which, by Poisson statistics, is σ ~ sqrt(reads per bin)).
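The following short calculation illustrates the standard-error argument numerically; the values (50 reads per bin, ~930 bins for a chr21-sized chromosome at 50 kb, a hypothetical twofold inflation of σ from uncorrected bias, and a 4% FF trisomy) are illustrative assumptions:

# Numeric illustration of the standard-error argument. A 4% FF trisomy shifts the
# average bin count by 0.5 * FF = 2%; whether that shift is resolvable depends on
# how close sigma is to its Poisson floor.

import math

reads_per_bin = 50
n_bins = 930
trisomy_shift = 0.02 * reads_per_bin            # 1 read per bin on average

for label, sigma in [("Poisson floor", math.sqrt(reads_per_bin)),
                     ("uncorrected bias", 2 * math.sqrt(reads_per_bin))]:
    std_err = sigma / math.sqrt(n_bins)
    print(f"{label}: SE = {std_err:.2f} reads/bin, shift/SE = {trisomy_shift / std_err:.1f}")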


There are multiple sources of bias in WGS data, but methods exist to remove them or mitigate their effects. As described later, some biases (e.g., GC bias, nonunique sequence, etc.) have known molecular underpinnings and can be corrected with approaches tailored to their origin, while other biases may not have a known source but can be diminished through general robust analysis methods.


Sample-to-sample differences in the propensity with which fragments of particular GC content get sequenced can inflate σ above its theoretical minimum. This so-called GC bias arises due to the different thermodynamics of G:C base pairs, which have three hydrogen bonds, and A:T base pairs, which have two. GC bias can be introduced at any step of the NIPT workflow, including DNA extraction, NGS library preparation, and sequencing itself. To appreciate the manifestation of GC bias in WGS data, recall that the expectation is for WGS to yield a uniform sampling of reads across the genome. However, the empirical distribution of reads is not strictly uniform and instead correlates in part with GC content: cfDNA fragments that have high (~ 80%) or low (~ 20%) GC content often generate fewer reads than expected relative to fragments with average GC content (40%–60%). To correct for GC bias, the algorithm should first assess the extent of bias on a sample-specific basis by calculating the ratio between the observed number of reads with a given GC content and the number of such fragments in the whole genome (Fig. 3). Next, observed fragments can be scaled according to their particular GC content and the measured GC bias: for example, if fragments with 20% GC content were observed to be sequenced with only 50% relative efficiency, then each read observed from a fragment with 20% GC content should be scaled by 1/0.5 = 2.
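A minimal sketch of such a correction, assuming per-read GC fractions and a genome-wide GC composition histogram are available as inputs (both hypothetical here), might look like the following:

# A sketch of sample-specific GC-bias correction: estimate the observed-vs-expected
# sequencing efficiency per GC-content stratum, then weight each read by the
# reciprocal of its stratum's efficiency. `fragment_gc` (per-read GC fraction) and
# `genome_gc_hist` (genome-wide fragment GC composition, one entry per stratum)
# are assumed inputs.

import numpy as np

def gc_correction_weights(fragment_gc, genome_gc_hist, n_strata=101):
    """Return one weight per read that undoes the sample's GC bias."""
    strata = np.clip((np.asarray(fragment_gc) * (n_strata - 1)).round().astype(int),
                     0, n_strata - 1)
    observed = np.bincount(strata, minlength=n_strata).astype(float)
    observed /= observed.sum()
    expected = np.asarray(genome_gc_hist, dtype=float)
    expected /= expected.sum()
    with np.errstate(divide="ignore", invalid="ignore"):
        efficiency = np.where(expected > 0, observed / expected, 1.0)
    efficiency[efficiency <= 0] = 1.0   # guard against empty strata
    return 1.0 / efficiency[strata]     # e.g., a read sequenced at 50% efficiency counts as 2

Bin counts are then computed as sums of these per-read weights rather than as raw read tallies.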




Fig. 3


Correcting for GC and mappability biases. (A) Despite an expectation of uniform coverage, sequenced cfDNA fragments map with nonuniform density (bottom), driven in part by GC bias (top). (B) Correction for GC bias involves scaling the observed reads by a correction factor derived from the aggregate GC-bias plot (shown in A, top); e.g., in red-shaded regions with low GC content and normalized abundance near 50%, each observed read is scaled by 2 (i.e., 1/0.5). (C) Some residual coverage gaps after GC-bias correction stem from mappability, where “mappable” positions are those where fragments align uniquely. Scaling a bin’s reads by the reciprocal of its mappability increases the uniformity of coverage.


Redundant regions of the genome can also cause nonuniformity in WGS coverage and reduce NIPT accuracy if unaddressed. In redundant regions, it is impossible for the NGS read alignment software to infer the true genomic origin of a fragment. As such, even if a region occurs thousands of times in the genome (e.g., Alu repeats), the aligner may pile reads derived from all of those regions into just a single region; this skewed parsing of reads could create a single bin with a tremendously high number of reads, and many other bins with depressed read values. One way to address degenerate read mappability is to flag redundant regions in advance, mask them altogether to prohibit mapped reads from counting toward the bin total, and then scale the observed reads in a region by the reciprocal of its share of mappable bases (e.g., a bin with 60 reads that is 75% mappable would be scaled to have 80 reads) (Fig. 3).
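A sketch of this mappability correction is shown below; the per-bin mappable fraction is assumed to be precomputed from the reference, and the 50% masking threshold is an illustrative choice:

# A sketch of mappability correction: mask bins that are too repetitive and rescale
# the rest by the reciprocal of their mappable fraction. `bin_counts` and
# `mappable_fraction` (share of uniquely alignable bases per bin) are assumed inputs.

import numpy as np

def mappability_correct(bin_counts, mappable_fraction, min_mappable=0.5):
    counts = np.asarray(bin_counts, dtype=float)
    frac = np.asarray(mappable_fraction, dtype=float)
    corrected = np.where(frac >= min_mappable, counts / np.maximum(frac, 1e-9), np.nan)
    return corrected  # NaN marks bins too repetitive to trust; drop them downstream

print(mappability_correct([60], [0.75]))  # -> [80.], matching the example in the text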


Maternal copy-number variants (“CNVs”) are another source of potential bias in WGS NIPT data, and their clear biological origin enables specific handling during analysis. In particular, maternal CNVs produce a deflection in reads per bin that is very large but also predictable: 50% above or below the disomic baseline. If such a deflection were of fetal origin, it would imply 100% FF, which is clearly impossible. Maternal CNVs’ predictable signal makes it possible to computationally scan across the genome and identify contiguous spans of bins where the bin counts are increased or decreased by 50%. These bins can then be omitted from the calculation of the region average. Some of the early WGS-based NIPT algorithms did not attempt to identify maternal CNVs, meaning their constituent bins were not omitted from subsequent analysis. These strongly skewed bins increased the mean bin value on CNV-harboring chromosomes to the point that false-positive aneuploidy calls resulted in several cases. These false-positive results revealed not only a shortcoming in maternal CNV handling, but also a suboptimal, nonrobust method of calculating the average (i.e., the mean), described further below.
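One simple way to implement such a scan is sketched below; the tolerance around the 0.5× and 1.5× levels and the minimum run length are illustrative assumptions rather than published parameters:

# A sketch of maternal-CNV flagging: scan for runs of consecutive bins whose
# normalized depth sits near 1.5x or 0.5x of the disomic baseline.

import numpy as np

def flag_maternal_cnv_bins(norm_depth, tolerance=0.15, min_run=5):
    """norm_depth: per-bin depth normalized so that the disomic baseline is 1.0."""
    depth = np.asarray(norm_depth, dtype=float)
    cnv_like = (np.abs(depth - 1.5) < tolerance) | (np.abs(depth - 0.5) < tolerance)
    flagged = np.zeros(depth.size, dtype=bool)
    run_start = None
    for i, hit in enumerate(list(cnv_like) + [False]):   # sentinel closes a final run
        if hit and run_start is None:
            run_start = i
        elif not hit and run_start is not None:
            if i - run_start >= min_run:                 # only long runs look maternal
                flagged[run_start:i] = True
            run_start = None
    return flagged   # True bins are omitted from the region average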


Even if a source of bias has unknown molecular origin, its impact can be mitigated if the bias is systematic. For instance, if a bin has elevated reads relative to other bins reproducibly across all samples and with no obvious explanation, it can nevertheless be included in the analysis after a normalization step: for example, by removing signal from the first several principal components or, more simply, by calculating the median number of reads for the bin across a large sample cohort (where the median sample can be assumed to be disomic) and then subtracting this median from the bin’s value in all samples. Applying this normalization procedure across all bins—whether or not they are outliers—reduces the systematic variance in the data and ultimately serves to reduce σ.
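A minimal sketch of the simpler, median-based version of this normalization (assuming a samples-by-bins matrix of normalized depths from a predominantly disomic cohort):

# Subtract each bin's cohort-wide median from that bin in every sample, removing
# reproducible bin-specific offsets. `depth_matrix` is an assumed samples-x-bins
# array of normalized depths.

import numpy as np

def remove_systematic_bias(depth_matrix):
    depth = np.asarray(depth_matrix, dtype=float)
    bin_median = np.median(depth, axis=0)   # typical value of each bin across the cohort
    return depth - bin_median               # residuals; systematic structure removed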


Finally, because any outlying bins that were not filtered out by the previous strategies could yield false results, it is important to reject such bins and/or use robust measures for both the average and the dispersion of bin density (Fig. 4). In particular, for the average, it is best to use the median rather than the mean, as the median will not be skewed by a strong outlier caused by biological phenomena (e.g., a maternal CNV) or analytical artifacts (e.g., alignment mistakes). For the dispersion, it is best to avoid the standard deviation, as it is susceptible to outliers; the interquartile range (IQR) is a more robust estimator of dispersion. To see the power of using the median and IQR relative to the mean and standard deviation, consider an analysis pipeline that was unaware of the existence of maternal CNVs (mCNVs). With the mean and standard deviation, an mCNV spanning ~ 2% of a chromosome would be enough to strongly risk a false-positive result, whereas using the median and IQR does not risk a false positive until an mCNV spans ~ 30% of the chromosome.




Fig. 4


Outlier-insensitive measures are important for NIPT analysis robustness. (A) Distributions of bin depths (see Figs. 2 and 3 for depiction of bin depth) are shown for three simulated scenarios, each centered at a copy number of two, which assumes the maternal background is disomic. Fetal trisomy shifts the distribution rightward in proportion to the sample’s FF. A disomic sample harboring an mCNV ( blue ) has a minority of bins at highly elevated depth corresponding to CN ~ 3. (B) Troublingly, the disomic + mCNV case has the same mean as the trisomic case, and there is a large increase in the standard deviation relative to the disomic and trisomic cases. By contrast, the median appropriately has a large deflection for the trisomic sample but not the disomic + mCNV sample. Additionally, the relative increase in IQR is far less than the relative change in standard deviation.
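The effect summarized in Fig. 4 can be reproduced with a short simulation; the parameters below (1000 bins, a 4% FF trisomy, an mCNV spanning 2% of bins) are illustrative assumptions:

# A maternal duplication covering 2% of bins mimics the mean shift of a genuine
# trisomy and inflates the standard deviation, but barely moves the median or IQR.

import numpy as np

rng = np.random.default_rng(1)
n_bins = 1000
disomic = rng.normal(2.0, 0.05, n_bins)        # per-bin copy number around 2
trisomic = disomic + 0.5 * 0.04                # a 4% FF trisomy shifts every bin by 0.02
with_mcnv = disomic.copy()
with_mcnv[:20] += 1.0                          # 2% of bins carry a maternal duplication (CN ~ 3)

for label, x in [("disomic", disomic), ("trisomic", trisomic), ("disomic+mCNV", with_mcnv)]:
    q75, q25 = np.percentile(x, [75, 25])
    print(f"{label:>13}: mean={x.mean():.3f}  median={np.median(x):.3f}  "
          f"sd={x.std():.3f}  IQR={q75 - q25:.3f}")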


Taken together, the processes of aligning reads to bins, correcting for GC bias and mappability, filtering outlier bins, and averaging in a robust manner yield an observed read density for a region of interest (μ_obs), but deducing an aneuploidy call from μ_obs requires a couple of other pieces of information. In particular, to say whether μ_obs is statistically higher than expected, it is critical to know both the expected density under a disomic hypothesis (μ_exp) and the average deviation between μ_obs and μ_exp, such that assessment of a statistically significant difference is possible.


Expected read density


There are multiple ways to calculate μ_exp, with some methods being sample specific, others being region specific, and still others leveraging data across both samples and regions. In a sample-specific approach from the early days of WGS-based NIPT, a so-called reference chromosome was assigned to each chromosome of interest, where the pair of chromosomes had similar features, for example, GC content. On the assumption that the reference chromosome k was disomic, the expected read density for the region of interest i in sample j (μ_exp,i,j) was assumed to be equal to the reference chromosome density (μ_obs,k,j). However, this approach has the shortcoming of putting all the eggs in one basket, so to speak: an uncorrected anomaly on the reference chromosome could corrupt aneuploidy calling for the chromosome of interest. A further downside of the reference chromosome approach is that it leverages only a small subset of the data gathered from sequencing the whole genome, as it considers only a minority of the 22 autosomes (exempting the chromosome of interest itself). This limitation reveals yet another: with a sample-specific approach, there is only a limited number of potential reference chromosomes, and there may not be a suitable reference for a particular chromosome of interest.


A pan-sample, single-region approach obviates the concern about comparing to a reference region disparate from the chromosome of interest, because all comparisons are relative to the lone chromosome of interest. Specifically, μ_exp,i,j for region i in sample j is the average of μ_obs,i,k for region i in all samples k ≠ j. This approach must be exercised with caution for two main reasons. First, recall that the chromosome of interest is itself rendered interesting because it has an elevated incidence of aneuploidy; thus there may be aneuploid samples in the background set (i.e., the samples k ≠ j) that could skew a calculation of the average of μ_obs,i,k. Ideally, the algorithm should identify and exclude such samples, but at the very least, using a robust estimator like the median to calculate the average of μ_obs,i,k is important. Second, this approach does not account for sample-specific deviations from the background cohort. For instance, even with seamless adherence to laboratory best practices, there are sometimes rare samples whose distribution of sequenced fragments across the genome differs from that of the majority of samples. If the background majority of samples has relatively homogeneous read density in the region of interest (i.e., uniform μ_obs,i), but sample j systematically deviates from the background, sample j could spuriously appear aneuploid. Therefore, like the sample-specific approach described earlier, the single-region method has shortcomings.


In principle, the most powerful and robust way to calculate μ_exp,i,j is to leverage all regions of interest in all samples. Such an approach would harness the virtue of region-specific calculations (e.g., using chr21 read density in background samples to infer chr21 in the sample of interest) while also accounting for sample-specific effects (e.g., if the sample of interest deviates from the background cohort). A machine-learning model (e.g., linear regression) can serve this dual purpose. The training set for such a model is a matrix of μ_obs,i,j for many regions i and samples j. For a given region of interest i, the model determines how best to weight all other regions k ≠ i in order to best predict region i across background samples. In a way, the regression model can be considered an expansion of the reference chromosome method. The latter effectively used a weight of 1.0 for the reference chromosome, whereas the former might use weights of 0.12 for chr1, − 0.05 for chr2, 0.2 for chr3, and so on, where weights are derived to yield the best prediction for the chromosome of interest across samples. Once the model learns this optimal weighting, it can predict the expected read density for region i in the sample of interest j by calculating a weighted sum based on the other regions in sample j (e.g., 0.12 × μ_obs,chr1,j + (− 0.05) × μ_obs,chr2,j + 0.2 × μ_obs,chr3,j + ⋯).
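A sketch of this regression approach is shown below; it uses scikit-learn's LinearRegression purely for illustration (a regularized model could be substituted), and the background matrix is assumed to come from predominantly disomic samples:

# Learn, from a background cohort, how to predict a region of interest from all
# other regions, then apply those weights to a new sample to obtain mu_exp.
# `background` is an assumed samples-x-regions matrix of observed densities.

import numpy as np
from sklearn.linear_model import LinearRegression

def expected_density(background, sample, region_idx):
    """Return mu_exp for `region_idx` in `sample` (a 1-D array of observed densities)."""
    other = np.delete(np.arange(background.shape[1]), region_idx)
    model = LinearRegression().fit(background[:, other], background[:, region_idx])
    return float(model.predict(sample[other].reshape(1, -1))[0])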


Calculating and interpreting the z-score


Beyond determining the observed bin density and the expected disomic density as described previously, one final and relatively straightforward calculation is needed to assess aneuploidy: the average difference between the expected and observed densities across a large cohort of samples. Like most other calculations in NIPT, this expected deviation (σ_O−E) should also be robust to outliers—for example, by using the IQR or median absolute deviation rather than the standard deviation—because high-FF aneuploid samples will manifest as outliers, where observed density far exceeds disomic expectation. Once μ_obs, μ_exp, and σ_O−E have been determined, it is possible to calculate the z-score:


z = (μ_obs − μ_exp) / σ_O−E


z-Scores near zero suggest that the sample does not deviate from the disomic hypothesis. By contrast, samples with high-magnitude z-scores are consistent with aneuploidy: trisomy if the z-score is highly positive and monosomy if the z-score is strongly negative.
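In code, the z-score itself is a one-line computation once the preceding steps have produced μ_obs, μ_exp, and σ_O−E; the numbers in the example below are illustrative:

def z_score(mu_obs, mu_exp, sigma_oe):
    return (mu_obs - mu_exp) / sigma_oe

# e.g., observed chr21 density 1% above a disomic expectation of 1.00, with
# sigma_O-E equal to 0.25% of the expected density:
print(z_score(1.01, 1.00, 0.0025))  # -> 4.0, suggestive of trisomy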


The chosen threshold, or cutoff, between euploidy and aneuploidy has a large impact on the sensitivity and specificity of the test. In general, low thresholds sacrifice specificity to achieve high sensitivity, whereas high thresholds prioritize specificity over sensitivity.


Assuming that z-scores of truly disomic samples are normally distributed, a z-score threshold of three equates to a minimum expected specificity of ~ 99.9% (i.e., roughly one false positive per 1000 samples). It is possible to achieve even higher levels of specificity by making the euploid-to-aneuploid cutoff aware of sample-specific FF. To see this, recall that μ_obs in an aneuploid sample scales with FF (e.g., a high-FF trisomic sample has a very large deviation between μ_obs and μ_exp and thus should yield a very high z-score). Thus, if the sample has a known FF value that should give z ~ 20 when trisomic, then it is no longer prudent to have a z-score cutoff of three. The threshold could be raised to, say, five, which (1) boosts specificity by almost eliminating the possibility of a disomic sample being a false positive by chance, and (2) has virtually no downward impact on sensitivity, since the z-score of an aneuploid sample still vastly exceeds the cutoff.


Sex chromosome aneuploidies


Whereas detecting autosomal aneuploidies is a one-dimensional problem that primarily focuses on the change in depth of a single chromosome, identifying sex-chromosome aneuploidies requires a two-dimensional analysis with simultaneous consideration of chrX and chrY. For instance, the difference between an MX female and a euploid XY male cannot be deciphered from chrX alone, as both samples are effectively monosomic for chrX. But they are distinguished by chrY: the MX sample has effectively zero depth from chrY, whereas the XY sample has an increase in chrY signal comparable to the deficit in chrX signal. Determining ploidy status in females—that is, MX, XX, or XXX—is effectively an autosome-like, one-dimensional analysis once it is established that chrY is absent.


For males, however, who may be XXY, XYY, or XY, ploidy-status resolution is more complicated, as samples’ positions in the chrX-vs-chrY 2D space become relevant. Assigning genotypes depends on how closely a given sample’s position in the plot corresponds to each respective genotype’s hypothetical region. XXY samples have no depletion in chrX relative to the maternal background, though they have a significant presence of chrY. As such, they are expected to occupy territory along a vertical line emanating from the origin. To determine where XYY samples are expected to fall in the chrX-vs-chrY plot, the scaling relationship between chrX and chrY signal (i.e., the slope of the blue region in Fig. 5) must be determined for XY males, which are abundant. Next, the expected chrX-vs-chrY relationship for XYY can be calculated, for example, by doubling the slope for XY males. Finally, each sample must be assigned to a ploidy status based on its position relative to each hypothesis line. Distinguishing the three male genotypes is challenging at very low fetal fraction (i.e., near the origin in Fig. 5) because the hypotheses converge. As XY samples far outnumber XXY and XYY in the general population, it is expected that samples near the origin are XY, and they are therefore typically reported as such. Though this approach ensures high specificity for sex calling, it necessarily reduces the sensitivity for XXY and XYY at low fetal fraction, a limitation that applies to all forms of NIPT.




Fig. 5


Sex-chromosome ploidy status determined by position in a 2D plot of chrX vs chrY. The normalized bin depths calculated from chrX and chrY for each sample determine the sample’s position in the 2D plot, where schematized regions indicate fetal genotypes. Being near the origin means that the fetal genotype matches the maternal background, i.e., XX. The chrY values in all males and the chrX values in aneuploid females (i.e., MX or XXX) emanate away from the origin in proportion to FF. Each NIPT algorithm must determine how to assign genotypes in both the overlapping regions near the origin and the white space between regions.
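A simplified sketch of how genotype assignment in this 2D space might be implemented is shown below. It assumes the fetal fraction is known, expresses chrX and chrY signals as deflections from the XX maternal background, and assigns each sample to the nearest hypothesis point; the hypothesis coordinates, thresholds, and nearest-point rule are illustrative assumptions, not a published decision boundary:

# dx: fractional chrX depth change relative to the XX maternal background
# dy: calibrated fractional chrY signal (zero for female fetuses)

import numpy as np

def classify_sex_chromosomes(dx, dy, ff, slope_xy=1.0, min_ff=0.02):
    if ff < min_ff:
        return "XY or no call"   # hypotheses converge near the origin; report conservatively
    # Expected (dx, dy) for each fetal genotype at this fetal fraction
    hyp = {
        "MX":  (-0.5 * ff, 0.0),
        "XX":  (0.0, 0.0),
        "XXX": (+0.5 * ff, 0.0),
        "XY":  (-0.5 * ff, 0.5 * slope_xy * ff),   # one fetal X, one fetal Y
        "XXY": (0.0,       0.5 * slope_xy * ff),   # vertical line from the origin
        "XYY": (-0.5 * ff, 1.0 * slope_xy * ff),   # twice the XY slope
    }
    return min(hyp, key=lambda g: np.hypot(dx - hyp[g][0], dy - hyp[g][1]))

print(classify_sex_chromosomes(dx=-0.03, dy=0.031, ff=0.06))  # -> "XY"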


Single-Nucleotide Polymorphism-Based NIPT


Trisomy causes an overrepresentation of a particular genomic region, and this excess DNA creates distinct signals in NGS data that can be measured during NIPT. The most obvious is an increase in the number of cfDNA fragments deriving from the aneuploid region; indeed, this NGS depth signal is the one that underlies WGS-based NIPT. But the excess fetal cfDNA fragments can also alter the allele balance at particular sites. For instance, whereas a G/T SNP (single-nucleotide polymorphism) in the maternal genome has a ~ 50% allele balance, the contribution of the fetal genome in a pregnant mother may skew the allele balance away from 50%, for example, if the fetus is G/G at the same site. Such deflections in allele balance can be analyzed to infer ploidy via the SNP-based method of NIPT (Fig. 6).




Fig. 6


The SNP method detects aneuploidy via deflections in allele balances. The left panel shows a disomic fetus, and the right panel shows a trisomic fetus. In a 20% fetal-fraction pregnancy, the majority of cfDNA fragments (lines at top) are maternally derived (black), with the minority coming from the placenta (blue). Only a subset of cfDNA fragments interrogate SNP sites, i.e., those positions depicted as having A and B alleles. Via multiplex PCR, the SNP method amplifies only the cfDNA fragments containing SNPs and sequences them to yield an allele-balance measurement, e.g., the fraction of A bases at a site. Each spot in the bottom panels represents a different SNP, with shading corresponding to the maternal genotype and distributions of allele balances shown at the right of each plot. In the case of a paternally inherited trisomy (right side of figure; note the extra B allele at a site where the mother is homozygous for A), the allele-balance distributions shift significantly relative to the disomic case (left side of figure). The SNP method quantifies the magnitude of such signal shifts to detect aneuploidy.

Adapted from Artieri CG, Haverty C, Evans EA, Goldberg JD, Haque IS, Yaron Y, et al. Noninvasive prenatal screening at low fetal fraction: comparing whole-genome sequencing and single-nucleotide polymorphism methods. Prenat Diagn 2017;37(5):482–90.


SNP-based NIPT requires allele balance measurements across thousands of sites that tile regions of interest. These requirements pose two important challenges: (1) how to measure allele balance from cfDNA and (2) which sites should be interrogated.


The need to measure allele balance means that a fundamentally different technique of NGS library preparation is required for SNP-based NIPT relative to WGS-based NIPT. With WGS library preparation and at the level of sequencing required for WGS-based NIPT, only ~ 20% of genomic sites are covered by even a single read, making it virtually meaningless to assess the fraction of alleles at a given site: no fraction can be measured at the 80% of sites with no coverage, and the allele balance is either 0% or 100% at sites with only a single NGS read. Allele balance could be coarsely measured if WGS depth were increased 100-fold (i.e., giving ~ 20 × coverage everywhere), but the cost of the test would also increase ~ 100-fold, to a prohibitive level. The solution for affordably getting high depth at particular sites is to perform a multiplex PCR enrichment of targeted sites. A single multiplex PCR reaction—containing many primer pairs that flank particular bases of interest—can amplify thousands of unique genomic locations, yielding an NGS library from which the depth at each site can affordably be in the hundreds. With hundreds of reads at a given site, measuring the allele balance is both meaningful and easy.


The targeted nature of SNP-based NIPT requires judicious choice of which sites to amplify. As described in more detail later, the most informative sites are highly polymorphic in the population, as the allele balance deflections associated with aneuploidy can only be measured where the mother and/or the fetus is heterozygous. To see why only these highly diverse sites are useful, consider the example in Fig. 6 . For the indicated site, the mother is homozygous for the reference allele (by convention, reference alleles at all sites are simply called “A” and nonreference are called “B”), and the disomic fetus is heterozygous (“A/B”). In a 20% FF sample, the allele balance (fraction of A bases) of maternal and fetal cfDNA together is 90% (where the 10% drop in A is due to the single B allele in the fetus). If the fetus has a paternally inherited trisomy where the father contributes two copies of the B-harboring chromosome, however, then the allele balance shifts to 81%. Measuring such shifts across many sites can assess the likelihood of euploidy vs aneuploidy, discussed further later. This example highlights why polymorphism is important for the SNP method: if the mother and fetus were both homozygous A/A, there is no difference in the allele balance between fetal disomy and trisomy, as both would be 100% A.
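The arithmetic behind the 90% and 81% figures can be written out explicitly (a sketch; the helper function below is hypothetical and simply counts fragments per chromosome copy):

# Worked arithmetic for the example above: mother AA; fetus A/B under disomy or
# A/B/B under a paternal trisomy; 20% FF.

def allele_balance_A(ff, maternal_A_copies, fetal_A_copies, fetal_total_copies):
    maternal = (1 - ff) * (maternal_A_copies / 2)      # mother always contributes 2 copies
    fetal = ff * (fetal_A_copies / 2)                  # fetal copies scaled so disomy -> ff
    total = (1 - ff) + ff * (fetal_total_copies / 2)   # trisomy adds extra fetal fragments
    return (maternal + fetal) / total

print(allele_balance_A(0.20, 2, 1, 2))  # disomic A/B fetus    -> 0.90
print(allele_balance_A(0.20, 2, 1, 3))  # trisomic A/B/B fetus -> ~0.818 (the "81%")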


The need for heterozygosity has several important implications for SNP-based NIPT. First, unlike WGS-based NIPT, where the number of informative observations (i.e., the bins discussed earlier) is a constant function of chromosome size, the number of informative SNPs will differ with each pregnancy based on the genotype of mother and father. In particular, useful SNP data becomes sparse as relatedness between the parents increases; at the extreme, SNP-based NIPT is poorly suited to consanguineous couples, since fewer SNP positions are divergent between the mother and fetus than in nonconsanguineous couples. Second, the ethnicity of the mother and father can influence the number of informative SNPs, as a highly polymorphic site in patients of European descent may not be polymorphic in people of Asian descent. In sum, due to the variable number of SNPs per pregnancy, the sensitivity of SNP-based NIPT is intimately tied not only to fetal fraction (as is the case for WGS-based NIPT), but also to the ethnicity and relatedness of the parents. In light of these factors, careful quality control is required to ensure confident results.


Finally, the deflections in allele balance at the core of SNP-based NIPT analysis are highly dependent on both the parental and meiotic origin of an aneuploidy . Fig. 7 depicts SNP-based NIPT data for five pregnancies, each with 10% FF. The pattern of SNP allele balances in paternally inherited trisomies—which account for ~ 10% of all T21 cases—is conspicuously different from the disomic case (e.g., note downward shift of red SNPs, upward shift of blue SNPs, and dilation of the green region), and such trisomies are handily detected with very high confidence. The far more common maternally inherited trisomies, however, deviate less from the disomic case (especially trisomies originating in the M1 phase of meiosis, which account for 70% of T21 cases), and these trisomies are more challenging for SNP-based NIPT to detect, especially at low FF .




Fig. 7


SNP-method analysis depends on the parental and meiotic origins of trisomy. All five panels depict a modeled pregnancy with 10% fetal fraction. For the four aneuploidy scenarios at right, the parent of origin (maternal or paternal) and meiotic stage of origin (M1 or M2) are accompanied by the frequency of such events in parentheses. The allele-balance distributions in the paternal M2 trisomy differ conspicuously from the disomic case, consistent with very high sensitivity for such trisomies. The more common maternal M1 trisomies, however, are more challenging to detect by eye and, consequently, by the algorithm as well.

Adapted from Artieri CG, Haverty C, Evans EA, Goldberg JD, Haque IS, Yaron Y, et al. Noninvasive prenatal screening at low fetal fraction: comparing whole-genome sequencing and single-nucleotide polymorphism methods. Prenat Diagn 2017;37(5):482–90.


Quantifying aneuploidy likelihood in SNP-based NIPT


SNP-based NIPT detects aneuploidy by enumerating various ploidy hypotheses and evaluating which is the most likely given the observed data. The root hypotheses include disomy, maternally inherited M1 trisomy (one copy of each maternal homolog inherited), maternally inherited M2 trisomy (two copies of a single maternal homolog inherited), paternally inherited M1 trisomy, and paternally inherited M2 trisomy. Due to the possibility of recombination during meiosis I, however, the algorithm must entertain a combinatorial exploration of hypotheses: for example, a paternally inherited trisomic chromosome could switch between M1-like spans (both paternal alleles present) and M2-like spans (two copies of a single paternal allele present). The likelihood of each hypothesis is evaluated across the many SNPs in the dataset, either at the single-SNP level or as small groups of frequently coinherited SNPs called haplotypes. What follows is a description of how the single-SNP math works. The exact method for analyzing haplotypes has not been described in the literature and is therefore omitted here, but the key principle is highly similar between analyzing single SNPs and haplotype blocks. The key difference is that haplotypes allow particular hypotheses under evaluation to be better defined, which could boost signal if the parental genomes indeed carry particular haplotypes and could, conversely, have minimal impact on sensitivity if the haplotype is absent or interrupted.


Consider a single site at which a 94% allele balance (fraction of A alleles) is observed among 50 NGS reads (47 A alleles and 3 B alleles) using SNP-based NIPT in a pregnancy with 10% FF. Further assume that the population frequency of each of the A and B alleles is 50%. The goal is to evaluate which ploidy hypothesis best applies. Mathematically, this can be expressed as:


p(94%, gt_M, gt_F, 10% FF, ploidy_hypothesis, 50×)

where gt_M and gt_F are the maternal and fetal genotypes, respectively. Because the maternal genome is predominant in cfDNA, the observation of an allele balance near 100% strongly suggests that gt_M is AA. As the fetal genotype is unknown, the probability calculation must take the weighted sum of probabilities across potential fetal genotypes, altering the earlier equation to yield:

∑_{gt_F ∈ {AA, AB, BB}} p(94%, AA_M, gt_F, 10% FF, ploidy_hypothesis, 50×) × p(gt_F, AA_M, gt_P, ploidy_hypothesis)
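A simplified single-SNP version of this calculation is sketched below: for each ploidy hypothesis, binomial likelihoods of the observed A-read count are summed over the possible fetal genotypes, weighted by their transmission probabilities. The sequencing-error floor and the transmission tables (which assume the mother is AA, a 50% population allele frequency, and no recombination) are illustrative assumptions rather than the published model:

from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def expected_A_fraction(ff, fetal_A, fetal_total, err=0.01):
    """Expected fraction of A reads when the mother is AA (cf. the worked example above)."""
    total = (1 - ff) + ff * fetal_total / 2
    p = ((1 - ff) + ff * fetal_A / 2) / total
    return min(max(p, err), 1 - err)     # error floor keeps every hypothesis nonzero

def snp_likelihood(a_reads, depth, ff, fetal_genotypes):
    """fetal_genotypes: {(A copies, total copies): transmission probability}."""
    return sum(prob * binom_pmf(a_reads, depth, expected_A_fraction(ff, a, n))
               for (a, n), prob in fetal_genotypes.items())

# 47 A reads out of 50 at 10% FF, mother AA, 50% population allele frequency:
disomy = {(2, 2): 0.5, (1, 2): 0.5}               # fetus AA or AB
paternal_m2_trisomy = {(3, 3): 0.5, (1, 3): 0.5}  # fetus AAA or ABB (duplicated paternal allele)
print(snp_likelihood(47, 50, 0.10, disomy))
print(snp_likelihood(47, 50, 0.10, paternal_m2_trisomy))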
