Introduction
There has been tremendous progress in our attempt to discern the molecular basis of infectious diseases, yet several gaps remain both in the understanding of disease processes and in the development of optimal strategies that would allow early diagnosis and targeted treatment. In addition, despite major advances in the development and implementation of vaccines and antimicrobial agents, infectious diseases continue to represent a major cause of morbidity and mortality worldwide. Recent examples such as the 2009 H1N1 influenza pandemic, the MERS coronavirus and enterovirus-D68 outbreaks, the West Africa Ebola and Zika virus epidemics, and the increased frequency of hospital-acquired infections caused by multidrug-resistant gram-negative bacilli and highly virulent strains of Clostridium difficile highlight the challenges we encounter when managing patients with infectious diseases. In this context of outbreaks of emergent and reemergent pathogens linked to increased antimicrobial resistance, there is a clear need for improved diagnostic tools for optimal patient classification and management.
Host Responses for Improving the Diagnosis of Infectious Diseases
One of the most frequent challenges that physicians face in the clinical setting is the difficulty of establishing an appropriate etiologic diagnosis, or even of distinguishing between bacterial and viral infections, in patients presenting with an acute febrile illness. These obstacles can delay initiation of appropriate therapy, which can result in unnecessary morbidity and even mortality. On the other hand, the need to promptly start appropriate antimicrobial therapy to control the infection has to be balanced with a rational use of antibiotics. Within this context there is an obvious need for improved diagnostic tools to help with patient classification, which in turn should allow appropriate use of targeted therapies.
Microbial pathogens are detected in clinically relevant specimens using a variety of assays including cultures, rapid antigen detection tests, and polymerase chain reaction (PCR) assays. To date, growing the specific pathogen (bacterium, virus, or fungus) remains the gold standard for establishing causality. However, this is a flawed approach, particularly if the organism is not present in the blood or in other easily accessible sites. In addition, many pathogens grow slowly or require complex media, and a significant number of clinically important microbes remain unrecognized because they are resistant to cultivation in the laboratory, thus limiting clinical decision-making. The introduction of more sensitive molecular diagnostic assays has significantly improved the diagnosis of viral infections. Unfortunately this is not the case for bacterial pathogens. Moreover, in the clinical setting, it is not uncommon to encounter situations in which the sole identification of a pathogen is not sufficient to establish causality (e.g., the detection of respiratory viruses in patients with no respiratory symptoms or in patients with pneumonia who often also have a co-detected bacterial pathogen).
In view of these limitations, for almost a century, there has been a large quest to identify host-derived biomarkers indicative of infection, such as the erythrocyte sedimentation rate (ESR) or C-reactive protein (CRP). These tests, which are useful in certain clinical scenarios, have proved to be nonspecific and are unable to differentiate between pathogen types (i.e., viral vs. bacterial) or even between infectious and noninfectious diseases. More recently, procalcitonin (PCT), a 116-amino acid protein produced in the thyroid and lungs, has shown improved sensitivity and specificity for the diagnosis of bacterial infections. Nevertheless, there are limitations and uncertainty about its utility because serum concentrations of PCT also increase after surgery, trauma, cancer, or severe burns, thus raising the concern of false positives. Other candidate biomarkers have been used for the diagnosis of neonates and older children with sepsis and have produced inconsistent results because data have yet to be validated in independent cohorts.
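The notions of sensitivity and specificity used above can be made concrete with their standard definitions. The following is a minimal sketch in Python; the counts are hypothetical and are not taken from any study of PCT or other biomarkers. It shows how false positives, such as elevated values after surgery or trauma, lower specificity without affecting sensitivity.

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: patients with bacterial infection who test positive/negative.
    tn/fp: patients without bacterial infection who test negative/positive.
    """
    sensitivity = tp / (tp + fn)  # fraction of true infections detected
    specificity = tn / (tn + fp)  # fraction of non-infected patients testing negative
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=45, fp=20, tn=80, fn=5)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With these assumed counts, 20 false positives among 100 uninfected patients reduce specificity to 0.80, even though 45 of 50 infections are detected.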
There is a need for an alternative strategy that has sufficient sensitivity to differentiate infectious from noninfectious conditions and sufficient specificity to distinguish among the different types of pathogens, is useful for monitoring response to therapy, and, ideally, is able to predict clinical outcomes. An alternative approach to the pathogen-detection strategy is based on a comprehensive analysis of the host response to the infection caused by different pathogens ( Fig. 3.1 ). A wide range of molecular and cellular profiling assays are currently available for the study of the human immune system. Genomics provides information about structural DNA changes and thus the probability of developing a condition; epigenetics describes the chromatin modifications that are caused by external or environmental factors and stably alter gene expression without changing the DNA sequence; transcriptomics studies the overexpression or underexpression of genes (mRNA expression profiles) in a qualitative and quantitative manner in response to the infection; and proteomics and metabolomics analyze the structure, function, and interaction of the proteins and posttranslational metabolites produced by a particular gene ( Fig. 3.2 ). Thus the information provided by the “-omics” technologies is complementary, and their use for diagnostic, pathogenetic, or prognostic purposes is mainly limited by the available technology and complexity of the analyses ( Fig. 3.3 ).
Independent of the -omics approach used, four tenets must be considered when using these tools for biomarker discovery to assure that the profiles are representative of the disease process and not of a confounding event: (a) selection and definition of the cases, which should be homogeneous in terms of the disease process and with limited confounders to allow interpretation of the multidimensional data; (b) need for controls, which should be also homogeneous, free of confounders, and similar in terms of the basic characteristics (i.e., demographic parameters) with the cases; (c) type of sample, which should reflect and change because of the biologic process and should be easy to obtain, ideally in a noninvasive manner; and (d) need for validation, to confirm that the profiles identified perform well in an independent cohort of patients that is different from the one used for the discovery phase. Of all these technologies, genomics and transcriptomics are moving into the clinical laboratory and are poised to become part of routine diagnostics in the next few years. In this chapter, we will review the application of analysis of host response through genomics, epigenetics, transcriptomics, proteomics, and metabolomics for diagnosis, understanding disease pathogenesis, patient classification and management, and possibly prognosis of pediatric patients with infectious diseases.
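Tenet (d), validation in an independent cohort, can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions: the expression values, the two-gene profiles, and the choice of a nearest-centroid classifier are all hypothetical and do not represent the method of any study discussed in this chapter. It shows why performance measured in the discovery cohort can overestimate performance in new patients.

```python
from statistics import mean


def centroids(samples):
    """Average the profiles of each class: [(label, values)] -> {label: centroid}."""
    by_label = {}
    for label, values in samples:
        by_label.setdefault(label, []).append(values)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in by_label.items()}


def predict(model, values):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def dist2(centroid):
        return sum((v - c) ** 2 for v, c in zip(values, centroid))
    return min(model, key=lambda label: dist2(model[label]))


def accuracy(model, samples):
    """Fraction of samples whose predicted class matches the known class."""
    hits = sum(predict(model, values) == label for label, values in samples)
    return hits / len(samples)


# Hypothetical two-gene expression profiles (arbitrary units).
discovery = [("viral", [5.0, 1.0]), ("viral", [6.0, 2.0]),
             ("bacterial", [1.0, 5.0]), ("bacterial", [2.0, 6.0])]
validation = [("viral", [5.5, 1.5]), ("bacterial", [1.5, 4.5]),
              ("viral", [4.0, 2.5]), ("bacterial", [3.5, 3.0])]

model = centroids(discovery)
print("discovery accuracy:", accuracy(model, discovery))
print("validation accuracy:", accuracy(model, validation))
```

In this assumed data set the profile classifies every discovery sample correctly but misclassifies one of the four validation samples, which is the kind of discrepancy that an independent cohort is meant to reveal.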
Genomics
The human genome, which is relatively static, is organized into 46 chromosomes consisting of 22 pairs of autosomal chromosomes shared by males and females and the sex-determining chromosomes, X and Y. One set of autosomal chromosomes is derived from each parent. Human genes are formed by exons , which are the coding regions, and introns , the noncoding regions. During transcription, the entire gene is copied into pre-mRNA, which includes exons and introns. Through the process of RNA splicing, introns are removed and exons joined to form a contiguous coding sequence. Single genes are able to generate 4 to 6 different mRNAs; thus many of the complex biological functions that characterize humans are generated by combined interactions among genes rather than a specific gene being responsible for a specific function. The genome and transcriptome consist entirely of deoxyribonucleic (DNA) and ribonucleic acids (RNA). Their uniform chemical properties have enabled efficient, low-cost, and high-throughput methods for amplification, synthesis, sequencing, and highly multiplexed analysis.
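The splicing step described above can be sketched in a few lines of Python. The sequences and the segment layout are hypothetical. The `skip` parameter models exon skipping, which is used here only as one illustrative way in which a single gene could yield more than one mRNA; it is an assumption of this sketch and not a mechanism detailed in the text.

```python
# Hypothetical gene: exons in uppercase, introns in lowercase.
segments = [
    ("exon", "AUGGCC"),
    ("intron", "guaagucag"),
    ("exon", "GAUUAC"),
    ("intron", "gucuag"),
    ("exon", "CCAUGA"),
]

# Transcription copies the entire gene, exons and introns, into pre-mRNA.
pre_mrna = "".join(seq for _, seq in segments)


def splice(segments, skip=()):
    """Remove introns and join exons; optionally skip exons by index."""
    exons = [seq for kind, seq in segments if kind == "exon"]
    return "".join(seq for i, seq in enumerate(exons) if i not in skip)


mature = splice(segments)             # all three exons joined
variant = splice(segments, skip={1})  # second exon skipped
print(pre_mrna)
print(mature)
print(variant)
```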
The genome represents a rich source of information about our pathophysiology. Genome analyses provide information about structural DNA changes and thus the probability of developing a condition. These analyses do not, however, provide information about whether and when a condition will manifest. Many diseases are caused by genetic mutations, and many more manifest as a genetic predisposition. More than 3000 gene mutations ( www.omim.org/ ) have now been identified that are associated with more than 5000 human phenotypes that cause or predispose to diseases. These numbers suggest that many diseases are caused by mutations in single genes and that many more have an inheritable genetic component. The Human Genome Project (HGP) was an international research effort originated in 1990 that culminated in the identification and public release of the completed sequence of the human genome in April 2003. In the HGP, the genome was cloned first and then larger clones were divided into shorter pieces and sequenced. The HGP has revealed approximately 20,500 human genes. They are contained in more than 2.85 billion nucleotides covering more than 99% of the euchromatin (i.e., gene-containing DNA). Initial genome approaches were applied to the diagnosis of congenital birth defects and tumors. However, the introduction of next-generation DNA sequencing (NGS) has revolutionized biomedical research and promises to be of great value for the diagnosis of infectious diseases.
Basics of the Genomics Approach
We now have a broad arsenal of techniques for genome analysis at our disposal, which allow the detection of gross abnormalities down to single nucleotide changes. These tools are increasingly being used for clinical diagnostics. Within a few years the study of the human genome has dramatically changed and greatly improved, moving from the identification of abnormalities based on the morphology and number of chromosomes (karyotype) to the newly developed sequencing instruments that are able to generate millions of short sequences per run (NGS).
Karyotyping was the first method used for the identification of chromosomal abnormalities. Developed in the early 1960s, it is based on the identification of the banding pattern characteristic for each chromosome visible through the light microscope. Although it only reveals crude information, such as number, shapes, and gross alterations of general chromosomal architecture, it remains a mainstay of clinical genetic analysis. Trisomy 21 and chronic myelogenous leukemia were originally identified using this technique.
Comparative genome hybridization is a cytogenetic method focused on copy number variations relative to the number of chromosomes (ploidy) in the DNA. In comparative genome hybridization, the genomes of interest, which are usually a disease genome set against a normal control genome, are labeled with different fluorescent dyes and compared. Using different colored fluorescent labels, several genes can be stained simultaneously. When the technology was developed the fluorescently labeled DNAs were hybridized to a spread of normal chromosomes and evaluated by quantitative image analysis, which was able to detect chromosome regional gains or losses with greater accuracy than conventional karyotyping. Further improvements in resolution have been achieved using microarray-based comparative genome hybridization methods in which the probe DNA can be amplified by PCR, thus only minute amounts of starting material are required. The labeled DNA is then hybridized to an array that can contain millions of oligonucleotides included on chips the size of a microscope slide, achieving very high resolution. Comparative genome hybridization techniques are used in prenatal screening for the detection of chromosomal defects. However, they do not provide information about balanced changes, such as inversions or balanced translocations, because they do not change the copy number and hybridization intensity. To circumvent this issue, if the gene of interest is known, the respective recombinant DNA can be labeled and used as a probe on chromosome spreads. This method, called fluorescence in situ hybridization (FISH), can detect gene amplifications, deletions, and chromosomal translocations.
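The comparison of test and reference hybridization signals can be sketched as a log2 ratio per region. This is a minimal illustration: the region names, the intensities, and the threshold of 0.3 are hypothetical assumptions and are not values prescribed by any platform. It also shows why balanced rearrangements are invisible to the method, because they leave the ratio unchanged.

```python
import math


def call_copy_number(test, reference, threshold=0.3):
    """Classify each region from the log2 ratio of test to reference signal."""
    calls = {}
    for region, test_signal in test.items():
        ratio = math.log2(test_signal / reference[region])
        if ratio > threshold:
            calls[region] = "gain"
        elif ratio < -threshold:
            calls[region] = "loss"
        else:
            calls[region] = "no change"
    return calls


# Hypothetical fluorescence intensities (arbitrary units).
test = {"regionA": 1500.0, "regionB": 500.0, "regionC": 1000.0}
reference = {"regionA": 1000.0, "regionB": 1000.0, "regionC": 1000.0}

print(call_copy_number(test, reference))
```

If regionC carried an inversion or a balanced translocation, its copy number and therefore its signal ratio would be the same as in the reference, and it would still be reported as "no change".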
DNA sequencing is the determination of the precise order of the four nucleotides (adenine, guanine, cytosine, and thymine) within a molecule of DNA, which allows individual genes to be identified. The rapid speed of sequencing attained with modern DNA sequencing technology has been instrumental in the sequencing of the complete human genome, which culminated in 2003. Over the years different techniques have been used for DNA sequencing. Initially the Maxam-Gilbert sequencing method used chemicals to cleave specific bases. This methodology quickly fell out of favor because of its complexity and the use of radioactive labeling. In parallel, Sanger used small concentrations of radio- or fluorescently labeled dideoxynucleotide triphosphate (ddNTP) molecules and developed a relatively reliable and less cumbersome method called chain termination. Sanger’s method was soon automated and was the method used in the first generation of DNA sequencers. The rapid development of novel technologies, such as NGS, has revolutionized this field. There are several NGS methods using different approaches to read DNA sequences. All these novel methods share the principle of conducting millions to billions of parallel sequencing reactions in microscopic compartments on arrays or nanobeads. Among other applications, DNA sequencing is used for genome-wide association studies (GWAS) using single nucleotide polymorphisms (SNPs) as high-resolution markers. SNPs are variations among individuals at a single position in a DNA sequence. If more than 1% of a population does not carry the same nucleotide at a specific position in the DNA sequence, then this variation can be classified as an SNP. SNPs are associated with both genes and noncoding regions of DNA and represent the most common type of genetic variation among individuals. In 2012, more than 180 million SNPs were known. SNPs are useful to identify and assess disease risk, but it is not uncommon that they are found to have no impact on the phenotype (silent mutations). Silent mutations, along with the need for extremely large cohorts of patients and for reproducibility of the studies, are among the main limitations of GWAS.
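The 1% criterion for calling a variation an SNP can be expressed as a short calculation. The sketch below is illustrative only: the allele counts are hypothetical, and the function name and the representation of a population as a list of observed nucleotides are assumptions made for the example.

```python
from collections import Counter


def is_snp(alleles, threshold=0.01):
    """Return True if more than `threshold` of the observed alleles differ
    from the most common nucleotide at this position."""
    if not alleles:
        raise ValueError("at least one allele is required")
    counts = Counter(alleles)
    major_count = counts.most_common(1)[0][1]
    variant_fraction = (len(alleles) - major_count) / len(alleles)
    return variant_fraction > threshold


# Hypothetical observations at one genomic position in 1000 chromosomes.
common_variant = ["A"] * 985 + ["G"] * 15  # 1.5% carry G
rare_variant = ["A"] * 995 + ["G"] * 5     # 0.5% carry G

print(is_snp(common_variant))
print(is_snp(rare_variant))
```

Under the definition given above, the first position qualifies as an SNP and the second does not.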
Genomics in Infectious Diseases
By 1950, using malaria as a prototype disease, the concept that genetic diversity within the host may influence the outcome of infection became apparent. In the clinical setting, the majority of infectious diseases are characterized by variation in both the disease pattern and severity, even during epidemics, thus highlighting the important role of host response on clinical manifestations and disease outcomes. Different inherited conditions, such as chronic granulomatous disease or interferon-γ receptor immunodeficiency, predispose to infectious diseases, and those conditions will be reviewed in another chapter. A number of SNPs in human leukocyte antigen (HLA) and non-major histocompatibility complex (non-MHC) genes have been identified in relation to a variety of bacterial and viral infections and have been associated with disease susceptibility, progression, or response to treatment. Table 3.1 illustrates examples of SNPs associated with a variety of uncommon and common viral or bacterial infections.
The main limitation of GWAS is that the same SNP can be protective or influence disease progression depending on the population and on the environment, making reproducibility of studies challenging.
Pathogen | Disease | SNPs/Mutations | Populations |
---|---|---|---|
Mycobacterium leprae | Leprosy | HLA-DR-DQ, TLR-1, NOD2, TNFSF15, 308 bp TNF, RIPK2, IL-23R, RAB32, LRRK2 | Adults |
Mycobacterium tuberculosis | Tuberculosis | Mal/TIRAP, TLR-1, -2, -4, -6, -9, TNF-α, IFN-γ, IL-12RB1 | Adults |
Streptococcus pneumoniae | Pneumococcal disease | MBL2, PTPN22, Mal/TIRAP | Children |
Staphylococcus aureus | S. aureus infection | HLA-DRA, -DRB1 | Adults |
Neisseria meningitidis | Meningococcal disease | CFH-CFHR3, TLR-4 | Children and adults |
Helicobacter pylori | Gastric cancer | IL-1, EPHX1 | Adults |
HIV | AIDS | CCR5, CCR2, RANTES, CXCL12 | Children and adults |
Norovirus, rotavirus | Gastroenteritis | FUT2 | Children |
HCV | Hepatitis | IL-28B (IFN-λ), IL-10R, IP-10 | Adults |
Dengue virus | Dengue shock syndrome | MICB, PLCE1 | Children and adults |
HSV | Encephalitis | TLR-3, UNC-93B | Children |
RSV | Bronchiolitis, severe disease | SP-A, SP-D, TLR-4, IL-8, IL-4, IL-13, IL-10, IL-1RL1, VDR | Children |
Influenza | Infection/severe disease | TNF, IL-6, IL-8, LTA, IL-1B, IL-1A, IL-10 | Adults |
Rhinovirus | Severe bronchiolitis | IL-10, IL-6, IFN-γ | Children |
Epigenetics
In 1957, Waddington developed the idea that some heritable traits are not reflected by changes in the DNA, and this process is now known as epigenetics. Epigenetics describes a number of chromatin modifications (phenotypic trait variations) that are caused by external or environmental factors and stably alter gene expression without changing the sequence of the DNA. Thus epigenetics is able to alter the phenotype of a cell without changing the genotype and supports the idea that changes in gene expression derived from long-term exposure to a certain insult are imprinted, become independent of the activating stimulus, and persist even in its absence. Epigenetics includes the study of DNA methylation as well as a variety of more transient histone modifications (such as acetylation, methylation, or phosphorylation) along with the influence of SUMOylation (the addition of small ubiquitin-like modifiers [SUMOs]), ubiquitination, adenosine diphosphate (ADP) ribosylation, and microRNA. Although the epigenome is more variable than the genome, it may hold greater information on an individual basis, which will be useful for the application of personalized medicine.
Basics of the Epigenetics Approach
The methodology used to identify changes in DNA methylation is similar to that applied in genomics and includes DNA sequencing of the treated versus untreated DNA, hybridization techniques, or array-based methods. These techniques may miss some incomplete modifications; however, new NGS methods that are able to detect DNA methylation directly look promising and will accelerate the field. The main epigenetic changes include histone modifications (such as acetylation and methylation that affect chromatin structure) and DNA methylation. It is important to note that although DNA methylation silences gene expression, histone modifications can enhance or suppress gene transcription. DNA methylation patterns have been associated with diseases and can be heritable by a poorly understood process called genomic imprinting. In addition, epigenetics includes the understanding of noncoding RNAs (ncRNAs), which are transcribed molecules that do not translate into proteins. In regard to epigenetic modifications, among the more recently discovered ncRNAs are the small ncRNAs known as microRNAs (miRNAs). miRNAs are highly conserved, small noncoding RNAs that target mRNA molecules and inhibit their translation. miRNAs exist intra- and extracellularly, including in blood or serum, and are resistant to boiling or repeated freezing-thawing, thus promising to be useful biomarkers in the clinical setting.
Epigenetics in Infectious Diseases
There is growing evidence that histone modifications and chromatin remodeling regulate gene expression, including host immune responses, and thus represent key targets for pathogen manipulation during infection. A variety of viral and bacterial effectors have been identified that enable a pathogen’s survival by either mimicking or inhibiting host cellular machinery. Mitogen-activated protein kinase (MAPK), interferon (IFN), or transcription factor NF-κB signaling pathways, among others, are common targets of pathogen-induced posttranslational modifications on histones and chromatin-associated proteins.
In Vitro Studies
The majority of the initial epigenetics work associated with miRNA in relation to infections was performed in vitro and mostly included viral-induced diseases. The goal of these studies was to gain a better understanding of the mechanisms of diseases or to identify markers associated with organ-specific syndromes and/or severity associated with specific pathogens. Using a laryngeal epithelial cell model of enterovirus 71, Cui and colleagues identified 64 miRNAs that target a number of genes associated with neurologic and immune responses relevant to the pathogenesis of the disease. Similarly, using primary human alveolar and bronchial epithelial cells infected with influenza A virus, Buggele and colleagues identified six miRNAs targeting a number of mRNAs, including interleukin-1 receptor-associated kinase 1 (IRAK1), MAPK3, and other components of the innate immune response to infection. Another provocative study showed that influenza H3N2 uses one of its nonstructural (NS) proteins to circumvent immune responses. Investigators showed that the influenza NS1 protein had a histone-like sequence (ARTK-histone mimic) that inhibited the host transcription elongation factor (hPAF1), selectively suppressing the host cell’s production of antiviral proteins. Within alveolar macrophages, Pennini and colleagues showed that Mycobacterium tuberculosis (TB) inhibited the expression of several interferon-γ–induced genes through histone acetylation, which appears to be a ubiquitous mechanism used by intracellular pathogens. This mechanism may help explain the protracted course and persistence of TB in some patients. These studies, although descriptive, are examples of how pathogens can directly induce epigenetic changes in the host. As a diagnostic tool, a study conducted in mice infected with either Escherichia coli, a gram-negative bacterium, or S. aureus, a gram-positive bacterium, identified three circulating miRNAs predictive of gram-positive bacterial infections.
In Vivo Studies
miRNA expression is a novel addition to the -omics arsenal to evaluate host responses to viral and bacterial infections. A small study of blood samples from patients with dengue identified two panels of miRNAs, one composed of 12 miRNAs specific to dengue and another formed of 14 miRNAs common to dengue and influenza infection. Another study was conducted with serum samples from children with hand-foot-and-mouth disease caused by EV-71 or coxsackievirus A16 and included healthy controls and children with other infections, including TB, pertussis, varicella-zoster virus (VZV), or mumps. Investigators found six miRNAs that discriminated children with EV-71 infection from healthy controls with greater than 90% accuracy. However, only 2 of the 6 miRNAs identified were also found in their in vitro model. This emphasizes the need to perform studies in target populations because several factors, such as disease severity, age, or other parameters, may provide discordant results. A small study conducted in infants with respiratory syncytial virus (RSV) identified a distinct profile of immune-associated miRNA in respiratory samples from patients with mild or severe disease. Although promising, these results will need to be validated in larger patient populations. Different studies have measured miRNA profiles in whole blood, peripheral blood mononuclear cells (PBMCs), or serum from patients infected with various strains of influenza A, including H1N1/H3N2, H7N9, or 2009 H1N1. However, results could not be validated in other studies, suggesting that there may be methodologic differences that must be addressed. In regard to bacterial pathogens, the majority of epigenetic studies have focused on TB or sepsis. A study that analyzed the sputum of patients with active TB versus controls found 95 miRNAs that were differentially expressed and were subsequently validated by quantitative reverse transcription PCR (RT-qPCR). Other studies conducted in adult and pediatric patients, with and without HIV, found a number of miRNAs in whole blood, CD4+ T cells, or sera that were differentially expressed in active versus latent TB or controls and had a specific role in T-cell immunity against TB. In regard to sepsis, serum miRNAs have been used mostly as a prognostic rather than diagnostic marker in critically ill patients, with negligible or no overlapping findings among studies, which may be attributed to the different levels of disease severity and/or the patient populations included. Studies using miRNAs as a diagnostic tool for more common bacteria are ongoing. Nevertheless, the value of miRNAs for the diagnosis of bacterial and viral infections, especially in patients with pneumonia, will be evident only when studies are conducted and validated in the main target populations.
Transcriptomics
Major technological breakthroughs have also occurred in the field of transcriptomics, thus creating a unique opportunity for the study of humans in health and disease where inherent heterogeneity dictates that large collections of samples need to be analyzed. Among the high-throughput molecular profiling technologies available today, transcriptomic approaches are the most scalable, have the most breadth and robustness, and therefore appear to be best suited for the study of human populations. The transcriptome represents the complement of RNAs, including messenger RNA (mRNA), that are transcribed from the genome. Genes encode the information to make proteins, and RNA is the messenger that transports that information (hence the name mRNA). The transcription of a gene yields an average of 4 to 6 mRNA variants, which are translated into different proteins. The translation of mRNAs into proteins is highly regulated. Protein-coding genes only constitute 1% to 2% of the human genome sequence; however, more than 80% of the genome can be transcribed. Thus the largest part of the transcriptome consists of noncoding RNAs that fulfill important structural and regulatory functions, including gene transcription, mRNA processing and stability, and protein translation. One type of noncoding RNA is the so-called small interfering (si)RNA; the discovery of RNA interference, the mechanism through which siRNAs act, was recognized with the Nobel Prize in 2006. siRNAs are part of an enzyme complex that targets and cleaves mRNAs with high specificity. These types of RNAs have become a powerful tool to downregulate (or silence) the expression of selected mRNAs with high specificity and efficiency.
Different classes of pathogens trigger specific pattern-recognition receptors (PRRs) differentially expressed on peripheral blood leukocytes. Blood represents both a reservoir and a migration compartment for these immune cells that become educated and implement their function by circulating between central and peripheral lymphoid organs and migrating to and from the site of infection via blood. Therefore blood leukocytes constitute an accessible source of clinically relevant information, and a comprehensive molecular phenotype of these cells can be obtained using gene expression microarrays. Because they provide a comprehensive assessment of all immune-related cells and pathways, genomic studies are well suited to study the host-pathogen interaction ( Fig. 3.4 ). In fact, studies in children and adults with acute infections have shown that different classes of pathogens induce distinct gene expression profiles that can be identified by analyses of blood leukocytes ( Fig. 3.5 ).
Basics of the Transcriptomics Approach
Microarray Analyses
Microarray methods for studying global mRNA expression profiles are now well established. Microarray gene chips contain several million DNA spots arranged on a small slide in a predefined order. They allow a relative quantitation of changes in transcript abundance among different conditions. Modern arrays use unique sequences of synthetic oligonucleotides to avoid ambiguity in identifying specific RNA transcripts, and several oligonucleotides are used per gene to improve accuracy. The newer high-density arrays are able to scan the transcription of all human genes, map exon content, and detect splice variants of mRNAs, including noncoding RNAs (such as siRNAs and miRNAs). Briefly, RNA derived from the cells or tissue analyzed is copied first to complementary DNA (cDNA) using a reverse transcriptase that can synthesize DNA from RNA templates. cDNA is transcribed back into cRNA, which is labeled with fluorescent tags, such as biotin, to improve detection. cRNA is preferred to cDNA because it hybridizes more strongly to the array oligonucleotides. After hybridization, the microarray chip is scanned and the hybridization intensities are compared using different statistical software. Thanks to a common convention for reporting microarray experiments called Minimum Information About a Microarray Experiment (MIAME), gene array databases that have been published are publicly available free of charge and represent a valuable source for further analysis in which microarray results from different experiments can be compared. Gene array analysis is already being used for clinical applications.
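The relative quantitation that microarrays provide is commonly summarized as a log2 fold change between conditions. The following sketch assumes hypothetical gene names and hybridization intensities, and it omits the normalization and statistical testing that real analyses require; it is meant only to show what a relative comparison looks like.

```python
import math
from statistics import mean


def log2_fold_change(infected, control):
    """Per gene, log2 of mean intensity in infected over mean intensity in controls."""
    return {
        gene: math.log2(mean(infected[gene]) / mean(control[gene]))
        for gene in infected
    }


# Hypothetical hybridization intensities (arbitrary units), three samples per group.
infected = {
    "GENE_A": [800.0, 820.0, 780.0],
    "GENE_B": [100.0, 110.0, 90.0],
    "GENE_C": [300.0, 310.0, 290.0],
}
control = {
    "GENE_A": [200.0, 210.0, 190.0],
    "GENE_B": [400.0, 420.0, 380.0],
    "GENE_C": [300.0, 290.0, 310.0],
}

for gene, lfc in log2_fold_change(infected, control).items():
    print(f"{gene}: {lfc:+.2f}")
```

A value of +2 corresponds to a fourfold overexpression relative to controls and a value of -2 to a fourfold underexpression; the result says nothing about the absolute number of transcripts.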
RNA Seq
Transcriptome analysis can also be performed by direct sequencing once RNAs have been converted to cDNAs. The advances in rapid and cheap DNA sequencing methods permit every transcript to be sequenced multiple times. These “deep sequencing” methods not only unambiguously identify the transcripts and splice forms but also allow the direct counting of transcripts over the whole dynamic range of RNA expression, resulting in absolute transcript numbers rather than relative comparisons. Thus the sequencing methods, called RNA seq, are quickly becoming attractive alternatives to array-based transcriptomic methods. Studies using RNA seq to characterize different infections are ongoing. As examples, two studies in mice, one in a model of Staphylococcus aureus infection and the second one in a model of H5N8 avian influenza virus, found transcripts associated with proinflammatory and antiinflammatory mediators, chemotaxis, cell signaling, keratins, and TH1/TH17 cytokines.
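The direct counting of transcripts described above can be sketched as a tally of sequenced reads per transcript. In this illustration the transcript identifiers and read assignments are hypothetical, and the scaling to counts per million is a common convention added here as an assumption to make samples of different sequencing depth comparable; it is not a step described in the text.

```python
from collections import Counter


def count_transcripts(read_assignments):
    """Tally how many reads were assigned to each transcript."""
    return Counter(read_assignments)


def counts_per_million(counts):
    """Scale raw counts by the total number of reads in the sample."""
    total = sum(counts.values())
    return {transcript: n * 1_000_000 / total for transcript, n in counts.items()}


# Hypothetical output of an alignment step: one transcript ID per read.
reads = ["tx1"] * 6 + ["tx2"] * 3 + ["tx3"] * 1

counts = count_transcripts(reads)
print(dict(counts))
print(counts_per_million(counts))
```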
Use of Transcriptomics in Infectious Diseases
Of all the -omics technologies, transcriptomics is probably the most popular, affordable, and easiest to implement approach because it allows measuring transcript abundance in a sample on a genome-wide scale using a single assay. Several studies have been conducted over the years, initially in vitro and subsequently in samples (usually PBMCs or whole blood) from patients with a variety of infectious diseases.
In Vitro Studies
The initial studies supporting the hypothesis that pathogen-specific gene expression profiles can be measured in immune cells were derived from in vitro studies. The comparative analysis of a compendium of host-pathogen microarray datasets identified both a common host transcriptional response to infection and a pathogen-specific signature. Upon activation, Toll-like receptors (TLRs) trigger signaling pathways that share common components while retaining unique characteristics, accounting in part for the specificity of transcriptional responses. In fact, in vitro microarray studies have shown the ability of herpes simplex virus (HSV), West Nile virus, pseudorabies virus, hepatitis C viruses (HCV), VZV, and rhinovirus to limit the ability of the host to develop effective antiviral responses by a variety of mechanisms. However, the vast body of in vitro experimental data accumulated over the years suggests that hosts can mount pathogen-specific transcriptional responses to infections.
In Vivo Human Studies
Initial studies tested the hypothesis that leukocytes isolated from peripheral blood of patients with acute infections carry unique transcriptional signatures, which would in turn permit pathogen discrimination and classification. In those initial studies, gene expression patterns derived from PBMCs of pediatric patients hospitalized with acute infections showed that there are pathogen-specific signatures that can be measured in the blood, and these distinguished children with influenza A from those with S. aureus, Streptococcus pneumoniae, and E. coli acute infections with greater than 95% accuracy. Analysis of PBMC samples requires processing in real time, which limits practical clinical application when there are large numbers of patients. In addition, PBMC samples do not include neutrophils, which are a relevant cell population for the pathogenesis of bacterial and viral infections. For these reasons, in recent years, there has been a shift toward whole blood samples to study transcriptional profiles in the clinical setting. Indeed, whole blood signatures have also been described in subjects with several other infections, including malaria, dengue, salmonella, melioidosis, TB, RSV, influenza virus (including the pandemic H1N1/09), rhinovirus, adenovirus, human T-cell lymphotropic virus (HTLV-1), HIV, and neonatal sepsis ( Table 3.2 ).
Country, Year | Pathogens | Population | Sample Type | Cohorts/Validation |
---|---|---|---|---|
US, 2007 | Virus vs. bacterial a | Children <18 yr (n = 131); Ctrl, age matched (n = 7) | PBMCs | 3 patient cohorts and RT-PCR |
Vietnam, 2009 | Salmonella typhi | Adults (n = 29); Ctrl (OD, n = 10; HC, n = 16) | Whole blood | — |
Thailand, 2009 | Burkholderia pseudomallei | Adults (n = 32); Ctrl (OD, n = 31; HC, n = 29) | Whole blood | 3 patient cohorts |
Cambodia, 2010 | Dengue | Children <15 yr (n = 48): DSS (n = 19), DF (n = 16), DHF (n = 13) | Whole blood | RT-PCR validation |
UK, South Africa, 2010 | Mycobacterium tuberculosis (TB) | Adults, TB (n = 123); Ctrl (OD, n = 96; HC, n = 24) | Whole blood | 3 patient cohorts |
Switzerland, 2010 | HIV | Adults (n = 137); Ctrl (n = 19) | CD4+ T cells | Genomewide SNP |
West Africa, 2012 | Plasmodium falciparum | Children <10 yr (n = 94); Ctrl, age matched (n = 61) | Whole blood | Mouse model |
US, 2012 | S. aureus (invasive) | Children (n = 99); Ctrl (n = 44) | Whole blood | 2 patient cohorts |
UK, 2012 | H1N1 influenza A | Adults (n = 11); Ctrl (OD, n = 28; HC, n = 18) | Whole blood | 2 patient cohorts; publicly available data sets |
US, 2012 | RSV, influenza | Children <2 yr (n = 79); Ctrl, age matched (n = 22) | PBMCs | Mouse model, primary human epithelial cells |
US, Finland, 2013 | RSV, influenza, HRV | Children <2 yr (n = 181); Ctrl, age matched (n = 39) | Whole blood | 4 patient cohorts |
UK, 2013 | H1N1/09 influenza, RSV, bacteria | Children <8 yr (n = 77); Ctrl children (n = 33) | Whole blood | Publicly available data set |
US, 2013 | Virus vs. bacterial b | Febrile children <3 yr (n = 30); afebrile children (n = 22) | Whole blood | Publicly available data set |
South Africa, Malawi, Kenya, UK, 2014 | TB ± HIV | Children, TB ± HIV (n = 193); OD (n = 239); LTBI (n = 71) | Whole blood | 3 patient cohorts |
US, Australia, 2013 | Virus vs. bacterial c | Experimental infection (n = 41); febrile adults (n = 102); Ctrl (n = 35) | Whole blood | 3 patient cohorts |
UK, Australia, 2014 | Neonatal sepsis | Infants, confirmed infection (n = 43); infants, suspected infection (n = 30); controls (n = 45) | Whole blood | 3 patient cohorts, two platforms |
Scotland, Ireland, 2014 | Neonatal sepsis | Neonates, infected (n = 46); neonates, controls (n = 69) | Whole blood | 3 patient cohorts, two platforms |
US, Finland, Spain, 2016 | Rhinovirus | Children, RV+ (n = 114); controls (n = 37) | Whole blood | 3 patient cohorts and RT-PCR |
a Virus, influenza A; bacteria, S. pneumoniae, S. aureus, E. coli.
b Virus, human herpesvirus-6, adenovirus, and enterovirus; bacteria, S. aureus, E. coli, Salmonella.
c Virus, influenza A, human rotavirus; bacteria, S. aureus, S. pneumoniae, E. coli.
A study performed in adult volunteers experimentally infected with RSV, rhinovirus, or influenza A identified an “acute respiratory viral signature” that was independently validated using a previously published dataset of pediatric patients with pneumonia. Despite the technical challenges involved in such an analysis and the differences in the patient populations analyzed (children naturally infected vs. adults with experimental infection), the identified “viral signature” distinguished pediatric patients with influenza from age-matched healthy controls with 100% accuracy. This is a critical observation that confirms the reproducibility and potential value of blood transcriptome analysis for studying host immune responses to respiratory viruses in the clinical setting. Additional studies will be necessary to evaluate this approach in other relevant clinical situations where the application of this methodology has the potential to transform the standard of care. In this regard, two studies have already shown the utility of host gene expression profiles as a diagnostic tool when effective treatment depends on rapid identification of the infectious agent or even of the need for treatment. In the first study, also using adult volunteers experimentally infected with influenza A H1N1 or H3N2, the authors found a blood RNA signature that was detectable more than 24 hours before the peak of clinical symptoms. Subsequently, the same group of investigators used the transcriptome profiles derived from the experimental influenza signature to develop a targeted host-based RT-PCR low-density array assay.
This assay was applied to adult patients presenting with fever to the emergency department (ED) and differentiated viral from bacterial infections with 89% sensitivity and 94% specificity, demonstrating that gene expression profiles identified by microarray analyses can be successfully applied to custom-made platforms with the potential for fast, point-of-care patient diagnosis and classification. It is remarkable that, although in the majority of studies patient samples were collected at different time points after pathogen exposure and disease onset, robust and pathogen-specific biosignatures have been derived and validated in independent cohorts of patients in completely different settings.
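To make performance figures such as these concrete, the following minimal Python sketch shows how sensitivity and specificity are computed from a two-by-two table of assay calls against the reference diagnosis. The counts are hypothetical and chosen only for illustration; they are not taken from the study described above, and the names `ConfusionCounts`, `sensitivity`, and `specificity` are illustrative rather than part of any published method.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Assay calls versus the reference diagnosis ("positive" = viral)."""
    true_pos: int   # viral infections the assay called viral
    false_neg: int  # viral infections the assay called bacterial
    true_neg: int   # bacterial infections the assay called bacterial
    false_pos: int  # bacterial infections the assay called viral


def sensitivity(c: ConfusionCounts) -> float:
    # Fraction of true viral infections that the assay detects.
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    # Fraction of non-viral (bacterial) infections correctly called non-viral.
    return c.true_neg / (c.true_neg + c.false_pos)


# Hypothetical counts, for illustration only (not study data).
counts = ConfusionCounts(true_pos=40, false_neg=10, true_neg=45, false_pos=5)

print(f"sensitivity = {sensitivity(counts):.2f}")  # 40 / (40 + 10)
print(f"specificity = {specificity(counts):.2f}")  # 45 / (45 + 5)
```

In this framing, a highly specific assay rarely labels a bacterial infection as viral, which is the error that matters most when the result is used to withhold antibiotics.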
Areas for Improved Diagnosis in Pediatrics
Lower Respiratory Tract Infections (LRTI)/Pneumonia
Acute LRTI/community-acquired pneumonia (CAP) represents the leading cause of hospitalization in the United States and is the main cause of death worldwide in children less than 5 years of age. In industrialized countries, CAP has an annual incidence of 36 to 40 per 1000 children below the age of 5 years and 11 to 16 per 1000 children 5 to 14 years of age. In the United States it is second only to injuries as the most common reason for hospitalization in children less than 18 years of age. In current clinical practice, establishing the precise etiologic diagnosis of pneumonia, or even simply discriminating viral from bacterial respiratory infections, remains challenging. Unfortunately, the pressure to achieve a rapid resolution of symptoms has commonly led medical practitioners to take an overcautious approach and treat many patients unnecessarily with antibiotics. Recent studies have provided an initial proof of concept that blood gene expression profiles could represent an alternative approach for the diagnosis of viral and bacterial LRTI. Two landmark studies published in 2013 have shown the potential of transcriptome analysis for diagnosis and patient classification in two completely different clinical situations in which traditional tools have proven insufficient: TB and RSV infections.
TB remains a major diagnostic challenge, especially in the developing world, where the prevalence of HIV is high. A study conducted in Africa showed the value of whole blood transcriptome analysis for the diagnosis of TB in a large cohort of HIV-infected and uninfected patients. From the expression data, the authors developed a disease risk score that discriminated, with high sensitivity and specificity (>90%), patients with active TB from those who presented initially with suspected TB but received an alternative diagnosis, and even from patients with latent TB infection. In young children, respiratory viral infections, and specifically RSV, represent the most common cause of LRTI leading to hospitalization worldwide. In the clinical setting it is impossible to predict, based on the physical examination and available diagnostic tools, which patients with RSV infection will progress to severe disease requiring hospitalization and which can be discharged home safely. Hence there is a clear need to better understand the immune response to RSV and how it relates to disease pathogenesis, progression, and severity. A recent study conducted in the United States analyzed a cohort of 220 children younger than 2 years of age who were hospitalized with acute RSV, rhinovirus, and influenza A LRTI and showed that blood RNA profiles differentiated these three viral infections with 95% accuracy. In addition, and as previously suggested, RSV infection induced overexpression of interferon and neutrophil genes and suppression of B- and T-cell genes, which persisted beyond the acute disease and was greatly impaired in infants younger than 6 months of age. These results may explain in part the lack of protective antibody responses observed after acute RSV infection. Moreover, the authors identified a genomic score that significantly correlated with outcomes of care (Fig. 3.6).
Altogether, these studies demonstrate that large amounts of microarray data can be translated into a biologically meaningful context that can be correlated with disease severity and applied in the relevant clinical setting to accurately classify patients. It is remarkable that blood signatures can achieve such accuracy for the diagnosis of respiratory pathogens that are thought to be mostly confined to the respiratory tract. From a practical perspective, in the majority of clinical situations, obtaining a blood sample is more feasible than obtaining infected tissue.
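The text does not specify how the disease risk score or the genomic score is calculated. As a rough illustration of how a multigene signature can be collapsed into a single number per patient, the sketch below assumes one simple construction: the summed expression of signature genes that are overexpressed in disease minus the summed expression of genes that are underexpressed, compared against a cutoff. The gene names, expression values, threshold, and the functions `risk_score` and `classify` are all hypothetical and are not drawn from the studies discussed above.

```python
from typing import Dict, Iterable


def risk_score(expression: Dict[str, float],
               up_genes: Iterable[str],
               down_genes: Iterable[str]) -> float:
    """Assumed scoring rule: sum of up-regulated minus sum of down-regulated genes.

    `expression` maps gene name to a normalized (e.g., log-scale) intensity.
    """
    up_total = sum(expression[g] for g in up_genes)
    down_total = sum(expression[g] for g in down_genes)
    return up_total - down_total


def classify(score: float, threshold: float) -> str:
    # A sample at or above the cutoff is called signature positive.
    return "signature positive" if score >= threshold else "signature negative"


# Hypothetical signature and sample, for illustration only.
UP_GENES = ["GENE_UP_1", "GENE_UP_2"]
DOWN_GENES = ["GENE_DOWN_1"]
THRESHOLD = 9.0  # in practice a cutoff would be chosen on a training cohort

sample = {"GENE_UP_1": 8.0, "GENE_UP_2": 7.5, "GENE_DOWN_1": 6.0}

score = risk_score(sample, UP_GENES, DOWN_GENES)
print(score, classify(score, THRESHOLD))
```

A score of this kind is attractive for clinical translation because it depends on a small, fixed gene list and a single cutoff, both of which would need to be validated in independent cohorts before use.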