Abstract
Gametogenesis, embryo development, implantation and in-vitro culture involve numerous complex pathways and interactions at the cellular and molecular level; a true understanding of their significance requires fundamental knowledge of the underlying principles. This chapter therefore provides a condensed overview and review of basic terminology and definitions, with particular emphasis on aspects relevant to reproductive biology and in-vitro fertilization.
Mammalian Cell Biology
In 1839, two German scientists, Matthias Jakob Schleiden and Theodor Schwann, introduced the ‘cell theory,’ the proposal that all higher organisms are made up of a single fundamental unit as a building block. In 1855, Rudolf Virchow extended this cell theory with a suggestion that was highly controversial at the time: ‘Omnis cellula e cellula’ (all living cells arise from pre-existing cells). This statement has become known as the ‘biogenic law.’ The cell theory is now accepted to include a number of principles:
1. All known living things are made up of cells.
2. The cell is the structural and functional unit of all living things.
3. All cells come from pre-existing cells by division (spontaneous generation does not occur).
4. Cells contain hereditary information that is transmitted from cell to cell during cell division.
5. The chemical composition of all cells is basically the same.
6. The energy flow (metabolism and biochemistry) of life occurs within cells.
Although these features are common to all cells, the expression and repression of genes dictate individual variation, resulting in a large number of different types of variegated but highly organized cells, with convoluted intracellular structures and interconnected elements. The average size of a somatic cell is around 20 µm; the oocyte is the largest cell in the body, with a diameter of approximately 120 µm in its final stages of growth (Figure 1.1). The basic elements and organelles in an individual cell vary in distribution and number according to the cell type. Bacterial cells differ from mammalian cells in that they have no distinct nucleus, mitochondria or endoplasmic reticulum. Their cell membrane has numerous attachments, and their ribosomes are scattered throughout the cytoplasm.
Figure 1.1 Schematic diagram of oocyte ultrastructure showing the zona pellucida (ZP) and the perivitelline space (PVS), first polar body (PB1), microvilli (MV), rough endoplasmic reticulum (rER), chromosomes (Ch) on the spindle (SP), Golgi complex (G), cortical granules (CG), two follicle cells (FC) attached to the oocyte and to each other via gap junctions (GJ). TZP = transzonal process, MT = microtubules, M = mitochondria, N = nucleus.
Cell membranes are made up of a bimolecular layer of polar lipids, coated on both sides with protein films. Some proteins are buried in the matrix, others float independently of each other in or on the membrane surface, forming a fluid mosaic of different functional units that are highly selective and specialized in different cells. Cells contain many different types of membrane, and each one encloses a space that defines an organelle, or a part of an organelle. The function of each organelle is determined largely by the types of protein in the membranes and the contents of the enclosed space. Membranes are important in the control of selective permeability, active and passive transport of ions and nutrients, contractile properties of the cell, and recognition of/association with other cells.
Cellular membranes always arise from pre-existing membranes, and the process of assembling new membranes is carried out by the endoplasmic reticulum (ER, see below). The synthesis and metabolism of fatty acids and cholesterol is important in membrane composition, and fatty acid oxidation (e.g., by the action of reactive oxygen species, ROS) can cause the membranes to lose their fluidity, as well as have an effect on transport mechanisms.
Microvilli are extensions of the plasma membrane that increase the cell surface area; they are abundant in cells with a highly absorptive capacity, such as the brush border of the intestinal lumen. Microvilli are present on the surface of oocytes, zygotes and early cleavage stage embryos in many species, and in some species (but not humans) their distribution is thought to be important in determining the site of sperm entry.
Cell cytoplasm is a fluid space, containing water, ions, enzymes, nutrients and macromolecules; the cytoplasm is permeated by the cell’s architectural support, the cytoskeleton.
Microtubules are hollow polymer tubes made up of alpha–beta dimers of the protein tubulin. They are part of the cytoskeletal structure and are involved in intracellular transport, for example, the movement of mitochondria. Specialized structures such as centrioles, basal bodies, cilia and flagella are made up of microtubules. During prophase of mitosis or meiosis, microtubules form the spindle for chromosome attachment and movement.
Microfilaments are threads of actin protein, usually found in bundles just beneath the cell surface; they play a role in cell motility, ionic regulation and in endo- and exocytosis.
Centrioles are a pair of hollow tubes at right angles to each other, just outside the nucleus. These structures organize the nuclear spindle in preparation for the separation of chromatids during nuclear division. When the cell is about to divide by mitosis, one of the centrioles migrates to the antipode of the nucleus so that one lies at each end. The microtubule fibers in the spindle are contractile, and they pull the chromosomes apart during cell division.
The nucleus of each cell is surrounded by a layered membrane, with a thickness of 7.5 nm. The outer layer of this membrane is connected to the ER, and the outer and inner layers are connected by ‘press studs,’ creating pores in the nuclear membrane that allow the passage of ions, RNA and other macromolecules between the nucleus and the cell cytoplasm. These pores have an active role in the regulation of DNA synthesis, since they control the passage of DNA precursors and thus allow only a single duplication of the pre-existing DNA during each cell cycle. The inner surface of the membrane has nuclear lamina, a regular network of three proteins that separate the membrane from peripheral chromatin. DNA is distributed throughout the nucleoplasm wound around spherical clusters of histones to form nucleosomes, which are strung along the DNA like beads. These are then further aggregated into the chromatin fibers of approximately 30 nm diameter. The nucleosomes are supercoiled within the fibers in a cylindrical or solenoidal structure to form chromatin, and the nuclear lamina provide anchoring points for chromosomes during interphase (Figure 1.2):
Active chromatin = euchromatin – less condensed
Inactive (turned off) = heterochromatin – more condensed
Before and during cell division, chromatin becomes organized into chromosomes.
Three types of cell lose their nuclei as part of normal differentiation, and their nuclear contents are broken down and recycled:
Red blood cells (RBCs)
Squamous epithelial cells
Platelets.
Other cells may be multinuclear: syncytia in muscle and giant cells (macrophages), syncytiotrophoblast.
Nuclear RNA is concentrated in nucleoli, which form dense, spherical particles within the nucleoplasm (Figure 1.3); these are the sites where ribosome subunits, ribosomal RNA and transfer RNA are manufactured. RNA polymerase I rapidly transcribes the genes for ribosomal RNA from large loops of DNA, and the product is packed in situ with ribosomal proteins to generate new ribosomes (RNP: ribonucleoprotein particles).
Mitochondria are the site of aerobic respiration. Each cell contains 40–1000 mitochondria, and they are most abundant in cells that are physically and metabolically active. They are elliptical, 0.5–1 µm in size, with a smooth outer membrane, an intermembranous space, and a highly organized inner membrane that forms cristae (crests) with elementary particles attached to them, ‘F1-F0 lollipops,’ which act as molecular dynamos. The cristae are packed with proteins, some in large complexes: the more active the tissue, the more cristae in the mitochondria. Cristae are the site of intracellular energy production and transduction, via the Krebs (TCA) cycle, as well as processes of oxidation, dehydrogenation, fatty acid oxidation, peroxidation, electron transport chains and oxidative phosphorylation.
They also act as a Ca2+ store and are important in calcium regulation. Mitochondria contain their own double-stranded DNA that can replicate independently of the cell, but the information for their assembly is coded for by nuclear genes that direct the synthesis of mitochondrial constituents in the cytoplasm. These are transported into the mitochondria for integration into its structures.
A number of rare diseases are caused by mutations in mitochondrial DNA, and the tissues primarily affected are those that most rely on respiration, i.e., the brain and nervous system, muscles, kidneys and the liver. All the mitochondria in the developing human embryo come from the oocyte, and therefore all mitochondrial diseases are maternally inherited, transmitted exclusively from mother to child. In the sperm, mitochondria are located in the midpiece, providing the metabolic energy required for motility; there are no mitochondria in the sperm head.
Oocytes contain 100 000–1 000 000 mitochondria.
Sperm contain 70–100 mitochondria, in the midpiece of each sperm. These are incorporated into the oocyte cytoplasm, but do not contribute to the zygote mitochondrial population – they are eliminated at the four- to eight-cell stage.
All of the mitochondria of an individual are descendants of the mitochondria of the zygote, which contains mainly oocyte mitochondria, i.e. mitochondria are maternal in origin.
Paternal mtDNA has been identified in a few exceptional cases, transmitted with an unusual autosomal dominant-like inheritance (Luo et al., 2018).
The sequence of human mitochondrial DNA was published by Fred Sanger in 1981, who shared the 1980 Nobel Prize in Chemistry with Paul Berg and Walter Gilbert, ‘for their contributions concerning the determination of base sequences in nucleic acids.’ The mitochondrial genome has:
Small double-stranded circular DNA molecule (mtDNA) 16 569 bp in length.
37 genes that code for:
2 ribosomal RNAs
22–23 tRNAs
10–13 proteins associated with the inner mitochondrial membrane, involved in energy production.
Other mitochondrial proteins are encoded by nuclear DNA and specifically transported to the mitochondria.
Mitochondrial DNA is much less tightly packed and protected than nuclear DNA and is therefore more susceptible to ROS damage that can cause mutations.
As it is inherited only through the maternal line, mutations can be clearly followed through generations and are used as ‘markers’ in forensic science and archaeology, as well as in tracking different human populations and ethnic groups.
Mitochondria can be seen in different distributions during early development (Figure 1.4); they do not begin to replicate until the blastocyst stage, and therefore an adequate store of active mitochondria in the mature oocyte is a prerequisite for early development.
Germinal vesicle oocyte: homogeneous clusters associated with ER
Metaphase I oocyte: polarized toward the spindle
Metaphase II oocyte: perinuclear ring and polar body
Embryos at 1c, 2c, 4c stages: perinuclear ring
Cytoplasmic fragments in cleavage stage embryos contain large amounts of active mitochondria
The endoplasmic reticulum (ER) is an interconnected lipoprotein membrane network of tubules, vesicles and flattened sacs that extends from the nuclear membrane outwards to the plasma membrane, held together by the cytoskeleton. The ER itself is a membrane-enclosed organelle that carries out complex biosynthetic processes, producing proteins, lipids and polysaccharides. As new lipids and proteins are made, they are inserted into the existing ER membrane and the space enclosed by it. Smooth ER (sER) is involved in metabolic processes, including synthesis and metabolism of lipids, steroids and carbohydrates, as well as regulation of calcium levels. The surface of rough ER (rER) is studded with ribosomes, the units of protein synthesis machinery. Membrane-bound vesicles shuttle proteins between the rER and the Golgi apparatus, another part of the membrane system. The Golgi apparatus is important in modifying, sorting and packaging macromolecules for secretion from the cell; it is also involved in transporting lipids around the cell, and in making lysosomes.
Figure 1.4 Mitochondrial aggregation patterns in a germinal vesicle (GV) oocyte (top), an MI oocyte (center) and an MII oocyte (bottom). Frames to the left are in fluorescence using the potential sensitive dye JC-1 to show the mitochondria; frames on the right are transmitted light images. The two mitochondrial patterns, A (granular-clumped) and B (smooth), are shown.
Rough Endoplasmic Reticulum (rER)
Has attached 80S ribonucleoprotein particles, the ribosomes (bacterial ribosomes are 70S), which are made in the nucleus and then travel out to the cytoplasm through nuclear pores.
Ribosomes are composed of two subunits: 40S and 60S (bacteria: 30S and 50S); the association between the subunits is controlled by Mg2+ concentration.
Polysomes are several ribosomes which move along a single strand of mRNA creating several copies of the same protein.
Smooth Endoplasmic Reticulum (sER)
A series of flattened sacs and sheets, site of lipid and steroid synthesis.
Cells that make large amounts of steroids have extensive sER.
The Golgi apparatus was first observed by Camillo Golgi in 1898, using a novel silver staining technique to observe cellular structures under the light microscope; he was awarded the 1906 Nobel Prize in Physiology or Medicine for his studies on the structure of the nervous system. The Golgi apparatus consists of a fine, compact network of tubules near the cell nucleus, a collection of closely associated compartments with stacked arrays of smooth sacs and variable numbers of cisternae, vesicles or vacuoles. It is connected to rER, linked to vacuoles that can develop into secretory granules, which contain and store the proteins produced by the rER. All of the proteins exported from the ER are funneled through the Golgi apparatus, and every protein passes in a strict sequence through each of the compartments (cis, medial, trans). This process consists of three stages:
1. ‘Misdirected mail’ – sends back misdirected proteins (cis).
2. ‘Addressing’ – (medial) stacks of cisternae that modify lipid and sugar moieties, giving them ‘tags’ for subsequent sorting.
3. ‘Sorting and delivering’ (trans): proteins and lipids are identified, sorted and sent to their proper destination.
Transport occurs via vesicles, which bud from one compartment and fuse with the next. The Golgi apparatus will move to different parts of the cell according to the ongoing metabolic processes at the time; it is very well developed in secretory cells (e.g., in the pancreas).
The Golgi apparatus also makes lysosomes, which contain hydrolytic enzymes that digest worn-out organelles and foreign particles, acting as ‘rubbish bins’ and providing a recycling apparatus for intracellular digestion; they contain at least 50 different enzymes, and ‘leaky’ lysosomes can damage and kill cells. Macromolecules inside the cell are transported to lysosomes, whereas those from outside the cell reach them by pinocytosis or phagocytosis; phagocytosis occurs only in specialized cells (e.g., white blood cells).
Peroxisomes are microbody vesicles that contain oxidative enzymes such as catalase; they dispose of toxic hydrogen peroxide and are important in cell aging.
Fundamental Principles of Molecular Biology
Nucleic Acids
The nucleic acids, DNA (deoxyribonucleic acid; Figure 1.5) and RNA (ribonucleic acid), are polymers of nucleotides, each of which is made up of:
1. Nitrogenous base: a purine or a pyrimidine
2. Sugar: deoxyribose in DNA, ribose in RNA
3. Phosphate group.
Nucleotides are classed as purines or pyrimidines according to the structure of the nitrogenous base.
Base type | DNA | RNA |
---|---|---|
Purines | Adenine (A) | Adenine (A) |
(double ring) | Guanine (G) | Guanine (G) |
Pyrimidines | Cytosine (C) | Cytosine (C) |
(single ring) | Thymine (T = methylated U) | Uracil (U) |
Methylation of cytosine is important in gene silencing and imprinting processes.
Nucleotides also function as important cofactors in cell signaling and metabolism: coenzyme A (CoA), flavin adenine dinucleotide (FAD), flavin mononucleotide, adenosine triphosphate (ATP), nicotinamide adenine dinucleotide phosphate (NADP).
DNA
DNA Replication
DNA copies itself by semi-conservative replication: each strand acts as a template for synthesis of a complementary strand.
1. Free nucleotides are made in the cytoplasm and are present in the nucleoplasm before replication begins.
2. The double helix unwinds and hydrogen bonds, holding the two DNA strands together, break. This leaves unpaired bases exposed on each strand.
3. The sequence of unpaired bases serves as a template on which to arrange the free nucleotides from the nucleoplasm.
4. DNA polymerase moves along the unwound parts of the DNA, pairing complementary nucleotides from the nucleoplasm with each exposed base.
5. The same enzyme connects the nucleotides together to form a new strand of DNA, hydrogen bonded to the old strand:
DNA polymerase forms new hydrogen bonds on the 5′→3′ strand.
DNA ligase acts on the 3′→5′ strand.
Several replication points appear along the strand, which eventually join.
6. DNA is then mounted on ‘scaffolding proteins,’ histones – and this is then wrapped around non-histones to form chromatin. Histones are basic proteins that bind to nuclear DNA and package it into nucleosomes; the regulation of gene expression involves histone acetylation and deacetylation. There are two ATP-dependent remodeling complexes and acetyltransferases that preferentially bind activated states and fix chromatin configurations:
Histone acetyltransferase coactivator complex
Histone deacetylase corepressor complex.
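The template-directed pairing described in steps 2 to 5 of the replication list can be illustrated with a minimal Python sketch. It assumes ideal Watson–Crick pairing (A–T, G–C) and ignores strand polarity, enzymes and replication forks; the 8-base sequence and the function names `complement` and `replicate` are hypothetical and chosen only for illustration.

```python
# Watson-Crick pairing rules for DNA (A-T, G-C).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base pairing partner of a DNA strand.

    Simplification: the partner is written in pairing order, so the
    antiparallel 5'/3' orientation of real strands is not modeled.
    """
    return "".join(PAIRS[base] for base in strand)


def replicate(duplex: tuple[str, str]) -> list[tuple[str, str]]:
    """Semi-conservative replication of one duplex into two.

    Each old strand serves as the template for one newly built strand,
    so every daughter duplex holds one old and one new strand.
    """
    old_a, old_b = duplex
    return [
        (old_a, complement(old_a)),   # old strand A + new partner
        (complement(old_b), old_b),   # new partner + old strand B
    ]


# Hypothetical parent duplex, used only as an example.
parent = ("ATGCCGTA", complement("ATGCCGTA"))
daughters = replicate(parent)

for old_or_new_a, old_or_new_b in daughters:
    print(old_or_new_a, old_or_new_b)
```

Both daughter duplexes carry the same sequence as the parent, which is the point of semi-conservative copying: the information is preserved while each duplex is half old and half new.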
Methylation of protamines and histones is a crucial component of imprinting processes: an association has been found between Beckwith–Wiedemann syndrome and epigenetic alterations of LIT1 and H19 during in-vitro fertilization (DeBaun et al., 2003).
Each mammalian cell contains around 1.8 m of DNA, of which only 10% is converted into specific proteins; the noncoding part of the DNA still carries genetic information and probably functions in regulatory control mechanisms.
In 1968, the Nobel Prize in Physiology or Medicine was awarded to Robert Holley, Har Gobind Khorana and Marshall Nirenberg ‘for their interpretation of the genetic code and its function in protein synthesis’:
DNA transfers information to mRNA in the form of a code defined by a sequence of nucleotide bases.
The code is triplet, unpunctuated and nonoverlapping.
Three bases are required to specify each amino acid, there are no gaps between codons and codons do not overlap.
Since RNA is made up of four types of nucleotides (A, C, G, U), the number of triplet sequences (codons) that are possible is 4 × 4 × 4 = 64; three of these are ‘stop codons’ that signal the termination of a polypeptide chain.
The remaining 61 codons can specify 20 different amino acids, and more than one codon can specify the same amino acid. (Only methionine [Met] and tryptophan [Trp] are specified by a single codon).
Since the genetic code thus has more information than it needs, it is said to be ‘degenerate.’
A mutation in a single base can alter the coding for an amino acid, resulting in an error in protein synthesis: translated RNA will incorporate a different amino acid into the protein, which may then be defective in function. (Sickle cell anemia and phenylketonuria are examples of single-gene defects.)
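The codon arithmetic above (64 triplets, 3 stop codons, 61 sense codons for 20 amino acids) can be checked with a short Python sketch. It assumes the standard genetic code, written here as a compact one-letter string in U, C, A, G order; the variable names are illustrative. The final lines show how a single-base change alters the encoded amino acid, using the well-known GAG to GUG (glutamate to valine) substitution associated with sickle cell anemia.

```python
from collections import Counter
from itertools import product

# Standard genetic code, one-letter amino acids, '*' marks a stop codon.
# Codons are enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Stop codons and the number of codons that specify each amino acid.
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
single_codon = sorted(aa for aa, n in codons_per_aa.items() if n == 1)

print("total codons:", len(CODON_TABLE))
print("stop codons:", stop_codons)
print("sense codons:", sum(codons_per_aa.values()))
print("amino acids:", len(codons_per_aa))
print("single-codon amino acids:", single_codon)  # M = Met, W = Trp

# A single-base substitution can change the amino acid specified.
normal, mutant = "GAG", "GUG"
print(normal, "->", CODON_TABLE[normal], "|", mutant, "->", CODON_TABLE[mutant])
```

Because most amino acids have several codons, many single-base changes (especially in the third position) leave the protein unchanged; the GAG to GUG example is one that does not.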
A homeobox is a DNA sequence that codes for a 60-amino-acid protein domain known as the homeodomain.
A homeodomain acts as a switch that controls gene transcription.
Homeobox genes, first discovered in 1983, are a highly conserved family of transcription factors that switch on cascades of other genes:
Are involved in the regulation of embryonic development of virtually all multicellular animals, playing a crucial role from the earliest steps in embryogenesis to the latest stages of differentiation
Are arranged in clusters in the genome.
POU factors are a class of transcriptional regulators, required for high-affinity DNA binding, that are important in tissue-specific gene regulation; they are named after three proteins in the group: Pit-1 (also known as GHF-1), Oct-1 and Unc-86.
RNA
Paired bases are G–C and A–U.
Pentose sugar = ribose.
Basic structure in mammalian cells is single-stranded, but most biologically active forms contain self-complementary sequences that allow parts of the RNA to fold and pair with itself to form double helices, creating a specific tertiary structure.
RNA molecules have a negative charge, and metal ions such as Mg2+ and Zn2+ are needed to stabilize many secondary and tertiary structures.
Hydroxyl groups on the ribose ring make RNA less stable than DNA because it is more prone to hydrolysis.
There are many different types of RNA, each with a different function:
Transcription, translation/protein synthesis: mRNA, rRNA, tRNA
Post-transcriptional modification or DNA replication: small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), guide RNA (gRNA), ribonuclease P, ribonuclease MRP, etc.
Gene regulation: microRNA (miRNA), small interfering RNA (siRNA), etc.
microRNAs are increasingly recognized as ‘master regulators’ of gene expression, regulating large networks of genes by chopping up or inhibiting the expression of protein-coding transcripts.
Ribosomal RNA
Makes up 80% of total RNA.
Made in the nucleolus, then moved out into the nucleoplasm and then to the cytoplasm to be incorporated into ribosomes.
Transfer RNA (4S RNA)
Makes up 10–15% of total RNA.
Single strand, 75–90 nucleotides wound into a clover leaf shape; each tRNA molecule transfers an amino acid to a growing polypeptide chain during translation.
A three-base anticodon sequence on the ‘tail’ is complementary to a codon on mRNA; an amino acid is attached at the 3′ terminal site of the molecule, via a covalent link that is catalyzed by an aminoacyl tRNA synthetase. Each type of tRNA molecule can be attached to only one type of amino acid; however, multiple codons in DNA can specify the same amino acid, and therefore the same amino acid can be carried by tRNA molecules that have different three-base anticodons.
Methionyl tRNA has a critical function, required for the initiation of protein synthesis.
Messenger RNA
This makes up 3–5% of total cellular RNA (exception: sperm cells contain approximately 40% mRNA and very little rRNA)
The mRNA molecules are single-stranded, complementary to one strand of DNA (the template strand) and identical in sequence to the other (the coding strand), except that uracil replaces thymine. DNA is transcribed into mRNA molecules, which carry coding information to the ribosomes for translation into proteins.
Transcription
DNA information is transcribed to mRNA in the nucleus, starting from a promoter sequence on the DNA at the 5′ end, and finishing at the 3′ end.
All of the exons and introns in the DNA are transcribed; stop and start sequences are encoded in the gene.
The product, nuclear mRNA precursor (hnRNA, heterogeneous nuclear RNA), is processed into mature cytoplasmic mRNA by splicing at defined base pairs to remove the introns and join the exons together.
A cap of 7-methylG is added at the 5′ end.
A string of polyA is added at 3′ end = polyA tail, 50–300 residues.
The polyA tail adds stability to mRNA molecules, making them less susceptible to degradation, and also has a role in transporting mRNA from the nucleus to the cytoplasm.
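The processing steps listed above (transcription of exons and introns, splicing, capping and polyadenylation) can be sketched in Python. This is a toy model under stated assumptions: the template sequence and exon coordinates are hypothetical, real splice-site signals are not modeled, the 5′ cap is represented only by the text marker `m7G-`, and the default tail length of 50 is simply the lower end of the 50–300 range given in the text.

```python
# Pairing of DNA template bases with the RNA bases built against them.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Transcribe a DNA template strand (written 3'->5') into RNA (5'->3')."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exons, given as (start, end) slices, and discard the introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def process(pre_mrna: str, exons: list[tuple[int, int]], tail_length: int = 50) -> str:
    """Return a mature mRNA: cap marker + spliced exons + polyA tail."""
    return "m7G-" + splice(pre_mrna, exons) + "A" * tail_length


# Hypothetical gene: exon 1 = positions 0-5, intron = 6-11, exon 2 = 12-14.
template = "TACAAAGGGCCCATT"
exons = [(0, 6), (12, 15)]

pre_mrna = transcribe(template)
mature = process(pre_mrna, exons)

print("pre-mRNA:", pre_mrna)
print("spliced :", splice(pre_mrna, exons))
print("mature  :", mature)
```

Keeping splicing as a separate function makes it easy to see that the intron is transcribed first and removed afterwards, as the text describes.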
A promoter is a specific DNA sequence that signals the site for RNA polymerase to initiate transcription; this process needs an orchestrated interaction between proteins binding to specific DNA sequences, as well as protein–protein interactions. DNA methylation is involved in the regulation of transcription. Gene sequences that lie 5′ to the promoter sequence bind specific proteins that influence the rate of transcription from a promoter:
A TATA box aligns RNA polymerase II with DNA by interacting with transcription initiation factors (TFs).
Proteins that bind to a CAAT box determine the rate at which transcription is initiated, bringing RNA polymerase II into the area of the start site in order to assemble the transcriptional machinery. The tertiary structure of the DNA (bends and folds) is important in making sure that all components are correctly aligned.
Enhancers, silencers and hormone response elements (steroid receptors) are important in determining the tissue-specific expression or physiological regulation of a gene; these factors respond to signals such as cAMP levels.
Transcription takes place during oocyte growth and stops before ovulation.
The mRNA turnover begins before ovulation: mRNA molecules must be protected from premature translation.
Oocytes contain mechanisms that remove histone H1 from condensed chromatin.
Differential acetylation profiles of core histone H4 and H3 for parental genomes during the first G1 phase may be important in establishing early zygotic ‘memory.’
The timing of transcriptional events during the first zygotic cell cycle will have an effect on further developmental potential.
Translation (Protein Synthesis)
During protein synthesis, ribosomes move along the mRNA molecule and ‘read’ its sequence three nucleotides at a time, from the 5′ end to the 3′ end. Each amino acid is specified by a codon on the mRNA, which pairs with a sequence of three complementary nucleotides (the anticodon) carried by a particular tRNA. The translation of mRNA into polypeptide chains involves three phases: initiation, elongation and termination. Messenger RNA binds to the small (40S) subunit of the ribosome on rER in the cytoplasm, and six bases at a time are exposed to the large (60S) subunit. The endpoint is specified by a ‘stop’ codon: UAA, UAG or UGA.
1. Initiation: The first three bases (codon) are always AUG, and the initiation complex locates this codon at the 5′ end of the mRNA molecule. A methionyl-tRNA molecule with UAC on its coding site forms hydrogen bonds with AUG, and the complex associates with the small ribosome subunit (methionine is often removed after translation, so that not every protein has methionine as its first amino acid). Some mRNAs contain a supernumerary AUG and associated short coding region upstream and independent of the main AUG coding region; these upstream open reading frames (uORFs) can regulate the translation of the downstream gene.
The large ribosome subunit has a P site which binds to the growing peptide chain, and an A site which binds to the incoming aa-tRNA.
2. Elongation: the unbound tRNA may now leave the P site, and the ribosome moves along the mRNA by one codon.
A peptide bond is formed, and the aa-tRNA bond is hydrolyzed to release the free tRNA.
A second tRNA molecule, bringing another amino acid, bonds with the next three exposed bases. The two amino acids are held closely together, and peptidyl transferase in the large ribosomal subunit forms a peptide bond between them.
The ribosome moves along the mRNA, exposing the next three bases on the ribosome, and a third tRNA molecule brings a third amino acid, which joins to the second one.
3. Termination: The polypeptide chain continues to grow until a stop codon (UAA, UAG or UGA) is exposed on the ribosome. The stop codon is recognized by a release factor instead of another aa-tRNA; the completed peptide is released, and components of the translation complex are disassembled.
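The three phases of translation can be mimicked with a minimal Python sketch that scans for the first AUG, reads one codon at a time, and stops at UAA, UAG or UGA. It assumes the standard genetic code and a single reading frame; ribosome subunits, tRNA charging, wobble pairing and uORFs are not modeled, and the example mRNA is hypothetical. The helper `anticodon` returns the pairing partner in pairing order, matching the UAC/AUG example in the text.

```python
from itertools import product

# Standard genetic code, one-letter amino acids, '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon: str) -> str:
    """Return the bases that pair with a codon, written in pairing order."""
    return "".join(RNA_PAIRS[base] for base in codon)


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")           # initiation: locate the start codon
    if start == -1:
        return ""                      # no start codon, nothing is made
    peptide = []
    for i in range(start, len(mrna) - 2, 3):   # elongation: one codon per step
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # termination: stop codon reached
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Hypothetical mRNA: 5' leader, AUG GCU UGG, stop codon UAA, 3' trailer.
print(translate("GGAUGGCUUGGUAAGG"))   # Met-Ala-Trp in one-letter code
print(anticodon("AUG"))
```

If the sequence contains no in-frame stop codon, this sketch simply translates to the end of the string, which is a simplification rather than a description of what happens in the cell.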
DNA Methylation
DNA methylation plays an important role in regulating gene expression, generally preventing (‘silencing’) gene expression. The addition and removal of methyl groups on DNA and histone proteins controls three-dimensional chromatin structure to allow or prevent binding of transcriptional promoters; interaction between DNA methylation and other proteins that modify nucleosomes results in a mechanism that regulates gene expression so that the correct genes are expressed at the appropriate time.
DNA is methylated by adding a methyl group (–CH3) covalently to the base cytosine (C), usually within the dinucleotide 5′-CpG-3′; methylated cytosine residues are sometimes referred to as the ‘fifth nucleotide.’ The CpG sequence in DNA represents a cytosine base (C) linked to a guanine base (G) by a phosphate bond. The majority (70–80%) of CpG dinucleotides in the human genome are methylated. Those that are unmethylated are usually clustered together in ‘CpG islands,’ sequences of at least 200 base pairs that contain a higher number of CpG sites than expected. They are usually found upstream of many mammalian genes, in regions that facilitate transcription, the ‘promoter’ sites. More than 98% of DNA methylation in somatic cells is found within CpG dinucleotides, usually concentrated within central cores of nucleosomes rather than internucleosomal regions. However, in embryonic stem (ES) cells around 25% of methylation appears in non-CpG regions. The promoter regions of important genes that are expressed in most cells (‘housekeeping genes’) are mainly protected from methylation, allowing continued transcription of genes that maintain basic cellular homeostasis. Methylation of CpG islands blocks transcription, silencing expression of the related gene(s) (Figure 1.6).
Figure 1.6 Effect of oxidation on epigenetic regulation of gene expression. Simplified scheme outlining the effect of DNA methylation on transcription. The transcription factor (TF) binds to CpG promoter islands via a hydrophilic interaction, allowing downstream gene expression. The addition of a methyl group to CpG converts the molecule to one that is hydrophobic, silencing gene expression by preventing TF binding. Oxidation of guanine (G) at methylated CpG sites modifies the interaction to one that is again hydrophilic, restoring affinity between TF and DNA to allow gene expression.
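The idea of a CpG island as a stretch of at least 200 base pairs with ‘more CpG sites than expected’ can be made concrete with an observed/expected ratio. The Python sketch below is illustrative only: the 200 bp minimum comes from the text, whereas the GC content above 50% and the observed/expected ratio above 0.6 are commonly used thresholds that are assumed here and are not taken from this chapter. The function names and test sequences are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (CpG count x length) / (C count x G count)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0                      # no CpG possible without both bases
    return seq.count("CG") * len(seq) / (c_count * g_count)


def looks_like_cpg_island(
    seq: str,
    min_length: int = 200,   # from the text
    min_gc: float = 0.5,     # assumed threshold, not from the text
    min_ratio: float = 0.6,  # assumed threshold, not from the text
) -> bool:
    """Apply the three criteria to a candidate sequence."""
    return (
        len(seq) >= min_length
        and gc_content(seq) > min_gc
        and cpg_obs_exp(seq) > min_ratio
    )


# Artificial test sequences, not real genomic DNA.
cpg_rich = "CG" * 100
at_rich = "AT" * 100
print(cpg_obs_exp(cpg_rich), looks_like_cpg_island(cpg_rich))
print(cpg_obs_exp(at_rich), looks_like_cpg_island(at_rich))
```

A ratio near 1 means CpG occurs about as often as the base composition predicts; most of the genome falls well below this, which is why islands stand out.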
The addition of methyl groups is controlled at several different levels by a family of DNA methyltransferase enzymes (DNMTs); DNMT1, DNMT3a and DNMT3b establish and maintain DNA methylation patterns. Variants in genes encoding DNMTs have been identified as risk factors for disease (Tollefsbol, 2017).
DNMT1 appears to maintain established patterns.
DNMT3a and DNMT3b mediate new or de novo patterns.
DNMT2 and DNMT3L may also have more specialized roles.
Demethylation, i.e., removal of methyl groups, is essential for epigenetic reprogramming of genes and is also directly involved in disease mechanisms that cause cell transformation to malignant states. The process can be passive, active or a combination of both.
Passive: usually takes place when DNMT1 maintenance methylation is absent or inhibited during rounds of replication, so that newly synthesized DNA strands remain unmethylated, with a downstream ‘dilution’ effect during subsequent replication rounds.
Active: can be carried out by two different mechanisms:
1. Cytosine deaminases convert 5mC to thymine, followed by T–G mismatch repair that specifically replaces thymine with cytosine.
2. A family of ten-eleven translocation (TET) hydroxylases bind to CpG-rich regions to prevent DNMT activity, and convert 5mC to oxidized bases through hydroxylase activity: 5mC to 5-hydroxymethylcytosine (5hmC), 5hmC to 5-formylcytosine (5fC), and 5fC to 5-carboxylcytosine (5caC). These oxidized 5mC bases can be actively removed by base excision repair (BER) to regenerate cytosine (Figure 1.7).
The TET proteins have been shown to participate in activating and repressing transcription (TET1), tumor suppression (TET2), and DNA methylation reprogramming (TET3). Oxidation by TET appears to be important in preventing the accumulation of 5mC at CpG islands and other promoter sites. TET deficiency affects methylation, with downregulation of the related genes.
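The ‘dilution’ effect of passive demethylation can be illustrated with a short calculation. The sketch below is an idealized model, not a description of any measured system: it assumes semi-conservative replication and introduces a hypothetical parameter, `maintenance`, for the probability that a methyl mark on the template strand is copied onto the new strand. With no maintenance methylation at all, the fraction of strands carrying the parental mark halves at every round.

```python
def methylated_fraction(rounds, maintenance=0.0):
    """Fraction of DNA strands still methylated after a number of replications.

    Idealized model (an assumption for illustration): each round pairs every
    old strand with one new strand, and the new strand is methylated with
    probability `maintenance` if its template was methylated.
    """
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    if not 0.0 <= maintenance <= 1.0:
        raise ValueError("maintenance must lie between 0 and 1")
    # Per round the methylated fraction f becomes (f + maintenance * f) / 2.
    return ((1.0 + maintenance) / 2.0) ** rounds


if __name__ == "__main__":
    for n in range(5):
        none = methylated_fraction(n, maintenance=0.0)
        full = methylated_fraction(n, maintenance=1.0)
        print(f"round {n}: no maintenance {none:.4f}, full maintenance {full:.4f}")
```

In this model, fully efficient maintenance keeps the pattern unchanged, whereas complete loss of maintenance leaves one-eighth of strands methylated after three rounds.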
Figure 1.7 DNA methylation in mammalian cells. The nucleotide cytosine (C) is methylated at the 5th carbon either by de novo DNA methyltransferases DNMT3A or DNMT3B, or by the DNA maintenance methyltransferase 1 (DNMT1) during DNA replication; active demethylation of 5-methylcytosine (5mC) takes place through repeated oxidation by ten-eleven translocation proteins (TET1/2), producing 5-hydroxymethylcytosine (5hmC). This is further oxidized to 5-formylcytosine (5fC) and lastly 5-carboxylcytosine (5caC). By an alternative route, not only 5caC, but also 5hmC, are deaminated to thymine and excised by thymine DNA glycosylase (TDG). The mismatched bases are then repaired by the base excision and/or nucleotide excision repair machinery (BER/NER).
Demethylation is a crucial process during development, and dysregulation of methylation processes contributes to numerous disease states, including cancer; it also occurs due to adverse environmental influences and as a function of aging. Methylation capacity/activity usually decreases with age (Richardson, 2003).
The distribution of DNA methylation marks on the genome encodes important biological information that is crucial for early development. DNA analysis technology has progressed significantly during the last decade, and this has revealed that the dynamics of DNA methylation/demethylation during early preimplantation development are highly complex and intricate with respect to removal and re-establishment of imprinting marks (Okamoto et al., 2015). The establishment of methylation patterns in the zygote is a highly dynamic process, involving both active and passive demethylation, in tandem with de novo and maintenance methylation: methylation and demethylation processes are counterbalanced. By the time of implantation, imprinting marks from the parent gametes have been removed, and the entire genome undergoes methylation at specific sites, while CpG islands are protected. This results in global repression, but with continued expression of cellular housekeeping genes, which have a unique CpG island promoter structure that remains unmethylated in every cell. Further stage- and tissue-specific changes in methylation then mold epigenetic patterns for each individual cell type during postimplantation development, and these are maintained through cell division. Many factors regulate and contribute to determining the precise methylation/demethylation patterns, and perturbation of one process is likely to affect other processes in the chain, with downstream effects on cell fate conversion. Figure 1.6 describes the effects of oxidative stress on gene expression (Ménézo et al., 2016).
Metabolism in the Mammalian Cell
Four basic factors influence the metabolic activity of a cell:
1. Spatial: compartmentation, permeability, transport, interactions.
2. Temporal: products become substrates, positive and negative feedback.
3. Intensity/concentrations: precise amounts of reactants/substrates/products.
4. Determinants that specify the structure of enzymes and direct their formation/activation.
Molecules that are important in the biology/metabolism of the cell include carbohydrates, fats, lipids and proteins.
Carbohydrates
Carbohydrates are made up of carbon (C), hydrogen (H) and oxygen (O), with the molecular ratio Cx(H2O)y.
Monosaccharides: pentose – 5 C’s (ribose, deoxyribose); hexose – 6 C’s (glucose, fructose).
Disaccharides: two monosaccharides (sucrose, maltose, lactose).
Oligosaccharides: combine with proteins and lipids to form glycoproteins and glycolipids, important in cell–cell recognition and the immune response.
Polysaccharides: polymers, insoluble, normally contain 12 to 10 000 monosaccharides (starch, cellulose, glycogen)
– Also form complexes with lipids and phosphate.
Fats and Lipids
Fatty acids (FAs) have a long hydrocarbon chain ending in a carboxyl group:
Saturated FAs have single bonds between carbon atoms.
Unsaturated FAs have some double bonds between carbon atoms.
Lipids are made up of FAs plus glycerol:
Phospholipids are important in membranes.
Glycolipids are important in receptors.
Proteins
The primary structure of a protein is a sequence of amino acids with peptide bonds:
–CONH–
Amino acids have at least one amino and one carboxyl group; they are amphoteric and form dipolar zwitterions in solution. Proteins have secondary structures; they can be folded into a helix or form beta sheets that are held together by hydrogen bonds:
Alpha helix – tends to be soluble (most enzymes).
Beta sheets – insoluble – fibrous tissue.
Proteins also have a three-dimensional tertiary structure, which is formed by folding of the secondary structure, held in place by different types of bond to form a more rigid structure: disulfide bonds, ionic bonds, intermolecular bonds (van der Waals – non-polar side chains attracted to each other).
High temperatures and extremes of pH denature proteins, destroying their tertiary structure and their functional activity.
Some proteins have a quaternary structure, with several tertiary structures fitted together; e.g., collagen consists of a triple-stranded helix.
Enzymes are proteins that catalyze a large number of biologically important actions, including anabolic and catabolic processes, and transfer of groups (e.g., methylase, kinase, hydroxylase, dehydrogenase). Some enzymes are isolated in organelles, others are free in the cytoplasm; there are more than 5000 enzymes in a typical mammalian cell.
Kinases: add a phosphate group, key enzymes in many activation pathways.
Methylases: add a methyl group. DNA methylation is important in modifications that are involved in imprinting, lipid methylation is important for membrane stability, and proteins are also stabilized by methylation.
Most enzymes are conjugated proteins, with an active site that has a definite shape; a substrate fits into the active site or may induce a change of shape so that it can fit.
The rate of an enzymatic reaction is affected by temperature, pH, substrate concentration, enzyme concentration.
Enzymes can be activated by removal of a blocking peptide, maintaining the S–H groups, or by the presence of a cofactor.
The active site of an enzyme is often linked to the presence of an amino acid hydroxyl (–OH) group (serine, threonine). Mutations at this level render the enzymes inactive.
Enzyme activity can be blocked by inhibitors, which may be:
Competitive: structurally similar to the substrate, so that they compete with it for the active site.
Noncompetitive: no structural similarity to the substrate; they form an enzyme/inhibitor complex that changes the shape of the protein so that the active site is distorted.
Irreversible: heavy metal ions combine with –SH groups, causing the protein to precipitate. Lead (Pb2+) and cadmium (Cd2+) are the most hazardous; these cations can also replace zinc (Zn2+), which is usually a stabilizer of tertiary structures.
Allosteric enzymes are regulated by compounds that are not their substrate, but which bind to the enzyme away from the active site in order to modify activity. The compounds can be activators or inhibitors, increasing or decreasing the affinity of the enzyme for the substrate. These interactions help to regulate metabolism by end-product inhibition/feedback mechanisms.
For example, low levels of ATP activate the enzyme phosphofructokinase (PFK), and high levels of ATP then inhibit the reaction.
Km is the substrate concentration that sustains half the maximum rate of reaction. Two or more enzymes may catalyze the same substrate, but in different reactions; if the reserves of substrate are low, then the enzyme with the lowest Km will claim more of the substrate.
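The meaning of Km can be made concrete with the standard Michaelis–Menten rate law, v = Vmax[S]/(Km + [S]), which is assumed here as the kinetic model; the text itself does not name it. The sketch below uses two hypothetical enzymes with equal Vmax and arbitrary Km values to show that the rate is half-maximal when [S] equals Km, and that the enzyme with the lower Km works faster when substrate is scarce.

```python
def reaction_rate(substrate, vmax, km):
    """Michaelis-Menten rate for one enzyme (assumed kinetic model)."""
    if substrate < 0 or vmax < 0 or km <= 0:
        raise ValueError("substrate and vmax must be >= 0, and km must be > 0")
    return vmax * substrate / (km + substrate)


if __name__ == "__main__":
    # Hypothetical enzymes sharing one substrate; values are illustrative only.
    enzymes = {"enzyme A (low Km)": 0.05, "enzyme B (high Km)": 1.0}  # Km in mM
    vmax = 1.0  # same arbitrary Vmax for both enzymes

    for substrate in (0.01, 0.05, 1.0, 10.0):  # substrate concentration in mM
        rates = ", ".join(
            f"{name}: {reaction_rate(substrate, vmax, km):.3f}"
            for name, km in enzymes.items()
        )
        print(f"[S] = {substrate} mM -> {rates}")
```

Each enzyme is treated independently here; in a real cell the share of substrate that each enzyme claims also depends on its Vmax and concentration.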
Cytokines are proteins, peptides or peptidoglycan molecules that are involved in signaling pathways. They represent a large and diverse family of regulatory molecules that are produced by many different types of cell, and are used extensively in cellular communication:
Colony stimulating factors
Growth and differentiation factors
Immunoregulatory and proinflammatory cytokines function in the immune system (interferon, interleukins, tumor necrosis factors).
Each cytokine has a unique cell surface receptor that conducts a cascade of intracellular signaling that may include upregulation and/or downregulation of genes and their transcription factors.
They can amplify or inhibit their own expression via feedback mechanisms:
Type 1 cytokines enhance cellular immune responses:
Interleukin-2 (IL-2), gamma interferon (IFN-γ), TGF-β, TNF-β, etc.
Type 2 favor antibody responses:
IL-4, IL-5, IL-6, IL-10, IL-13, etc.
Type 1 and type 2 cytokines can regulate each other.
The majority of our knowledge about fundamental principles of cell and molecular biology has been gained from model systems, particularly in yeast and bacteria, as well as human cell lines maintained in tissue culture. The first human cell line to be propagated and grown continuously in culture as a permanent cell line is the HeLa cell, an immortal epithelial line: knowledge of almost every process that takes place in human cells has been obtained through the use of HeLa cells, and the many other cell lines that have since been isolated.
The cells were cultured from biopsy of a cervical cancer taken from Henrietta Lacks, a 31-year-old African American woman from Baltimore, in 1951. George Gey, the head of the cell culture laboratory at Johns Hopkins Hospital, cultivated and propagated the cells; Henrietta died from her cancer 8 months later. Gey and his wife Margaret continued to propagate the cells, and sent them to colleagues in other laboratories. In 1954, Jonas Salk used HeLa cells to develop the first vaccine for polio, and they have been used continually since then for research into cancer, AIDS, gene mapping, toxicity testing and numerous other research areas; they even went up in the first space missions to see what would happen to cells in zero gravity.
HeLa cells attained ‘immortality’ because they have an active version of the enzyme telomerase, which prevents telomere shortening that is associated with aging and eventual cell death. They adapt readily to different growth conditions in culture, and can be difficult to control: their growth is so aggressive that slight contamination by these cells can take over and overwhelm other cell cultures. Many other in-vitro cell lines used in research (estimates range from 1 to 10% of established cell lines) have been shown to have HeLa cell contamination. Twenty-five years after Henrietta’s death, many cell cultures thought to be from other tissue types, including breast and prostate cells, were discovered to be in fact HeLa cells, a finding that unleashed a huge controversy and led to questions about published research findings. Further investigation revealed that HeLa cells could float on dust particles in the air and travel on unwashed hands to contaminate other cultures.
The cells were established in culture without the knowledge of her family, who discovered their ‘fame’ accidentally 24 years after her death; they were contacted for DNA samples that could be used to map Henrietta’s genes in order to resolve the contamination problem (Skloot, 2010).
Ion Regulation in Cells
All cells maintain a different cytoplasmic ionic constitution with respect to their environment, regulated by the hydrophobic lipid membrane bilayer and by ion channels and transporters associated with the membrane. A major proportion of cellular energy is dedicated to ionic homeostasis, and loss of membrane-controlled ionic imbalance is one of the first manifestations of cell death. Although the cell cytoplasm is electrically neutral, i.e., it contains equal quantities of positive and negative charges, the differential distribution of ions across the cell membrane forms an electrochemical gradient that creates potential energy. Approximately 15–30% of all membrane proteins are involved in ion transport, via two mechanisms:
1. Ion channels form a narrow hydrophilic pore that allows passive movement of small inorganic ions.
2. Transporters actively transport specific molecules across membranes; these may be coupled to an energy source.
Ion Transport
Ion Channels
Ions can be transferred thousands of times faster through an ion channel than via transporters: 10⁸ ions can pass through an open channel in one second. This ‘passive’ transport through channels is not directly linked to energy sources and is often specific for a particular type of ion. Ion channels are not continuously open; they are opened or ‘gated,’ in response to a change in voltage, mechanical stress or the binding of a ligand. The activity of many ion channels is also regulated by protein phosphorylation and dephosphorylation.
The resting potential of a cell membrane can be calculated for a given ion from the ratio of external to internal ion concentrations, using the Nernst equation:
$$E_X = \frac{RT}{zF}\,\ln\frac{[X]_{\text{out}}}{[X]_{\text{in}}}$$
E_X = equilibrium potential for ion X, in volts (V).
R = universal gas constant, 8.314 J K−1 mol−1 (joules per kelvin per mole).
T = temperature in kelvin (K = °C + 273.15).
z = valence of the ion, e.g., z = +1 for Na+ and K+, +2 for Ca2+, −1 for Cl−, etc.
F = Faraday’s constant, 96485 C mol−1 (coulombs per mole).
[X]out = extracellular ion (X) concentration in mM.
[X]in = intracellular ion (X) concentration in mM.
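As a worked example, the Nernst relationship can be evaluated numerically. The sketch below uses the constants defined above in the standard form E = (RT/zF) ln([X]out/[X]in); the function name and the ion concentrations are illustrative assumptions (a 20-fold K+ gradient and a typical textbook Na+ gradient), not measured values.

```python
import math

R = 8.314    # universal gas constant, J K^-1 mol^-1
F = 96485.0  # Faraday's constant, C mol^-1


def nernst_potential_mv(z, conc_out_mm, conc_in_mm, temp_celsius=37.0):
    """Equilibrium (Nernst) potential for one ion species, in millivolts."""
    if z == 0:
        raise ValueError("valence z must be non-zero")
    if conc_out_mm <= 0 or conc_in_mm <= 0:
        raise ValueError("concentrations must be positive")
    temp_kelvin = temp_celsius + 273.15  # convert degrees Celsius to kelvin
    volts = (R * temp_kelvin) / (z * F) * math.log(conc_out_mm / conc_in_mm)
    return volts * 1000.0  # volts to millivolts


if __name__ == "__main__":
    # Assumed concentrations in mM: K+ 20-fold higher inside, Na+ higher outside.
    print(f"E_K  = {nernst_potential_mv(+1, 5.0, 100.0):.1f} mV")
    print(f"E_Na = {nernst_potential_mv(+1, 145.0, 12.0):.1f} mV")
```

With the assumed 20-fold K+ gradient, the K+ equilibrium potential comes out at about −80 mV at 37 °C. An ion that is more concentrated outside the cell, such as Na+, gives a positive value.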
The resting potential depends mainly on the K+ gradient across the membrane, as well as the characteristics of its K+ ion channels. The plasma membrane of many cells also contains voltage-gated cation channels, which are responsible for depolarizing the plasma membrane, creating a less negative value inside the cell. Voltage-gated Na+ channels allow a small amount of Na+ to enter the cell down its electrochemical gradient; this then depolarizes the membrane further, opening more Na+ channels that may continue to open by auto-amplification. Voltage-gated sodium channels are primarily responsible for propagating action potentials in neurons. Ion channels are specific for particular ions:
1. The sodium channel family consists of nine members, each with two subunits (α and β) as well as several membrane-spanning regions.
2. Potassium channels have a tetrameric structure consisting of four subunits.
3. Voltage-gated calcium channels are complex, with α1, α2δ, β1–4 and γ subunits, and these form five common types: L-type, N-type, P/Q-type, R-type and T-type.
4. Voltage-gated chloride channels also exist, and these play a role in resetting the action potential caused by the opening of other voltage-gated channels.
Another gene family embraces Cl− channels and a class of ligand-gated channels activated by ATP. Ligand-gated ion channels are relatively insensitive to the membrane potential and therefore cannot by themselves produce a self-amplifying depolarization. The acetylcholine receptor is the best example of a ligand-gated ion channel; this was the first channel to be sequenced. The acetylcholine receptor of skeletal muscle is composed of five transmembrane polypeptides encoded by four separate genes, and is non-specific for ion selectivity. Na+, K+ and Ca2+ may pass through the acetylcholine-gated channel.
Transporters or Pumps
Transporters are long polypeptide chains that cross the lipid bilayer several times and transfer bound solutes across the membrane either passively or actively. Transporters are often called pumps since they are able to ‘pump’ certain solutes across the membrane against their electrochemical gradients. This active transport is tightly coupled to a source of metabolic energy such as an ion gradient or ATP hydrolysis. The solute binding sites are exposed alternately on one side of the membrane and then on the other. There are three categories of transporters:
Antiporters, pH and Ca2+ Regulation
Enzymes in mammalian and marine cells require a pH of around 7.2 in order to function correctly, and the correct cellular pH is maintained via one or more Na+-driven antiporters in the plasma membrane; these use energy stored in the Na+ gradient to pump excess H+ across the membrane:
1. The Na+/H+ exchanger couples an influx of Na+ to an efflux of H+.
2. The Na+-driven Cl−/HCO3− exchanger couples an influx of Na+ and HCO3− to an efflux of Cl− and H+.
Free cytosolic calcium must be maintained at very low levels in all cells, and Ca2+ is actively pumped out of the cell by Ca2+ ATPase and a Na+/Ca2+ exchanger.
The Na+/K+ Pump
The concentration of K+ is typically 20 times higher inside cells than outside, whereas the reverse is true for Na+. The Na+/K+ pump maintains these concentration differences by actively pumping Na+ out of the cell against its steep electrochemical gradient and pumping K+ inside. This pump is vital for survival: it has been estimated that 30% of cellular energy is devoted to maintaining its activity. The pump is an enzyme, and it can work in reverse to produce ATP. Electrochemical gradients for Na+ and K+ together with relative concentrations of ATP, ADP and phosphate in the cell determine whether ATP is synthesized or Na+ is pumped out of the cell. This pump is also involved in regulating osmolarity.
Calcium Regulation
Extracellular fluids contain millimolar quantities of calcium, whereas the cytoplasm contains nanomolar levels; therefore, calcium ions tend to enter cells by diffusion. Several pumps operate to maintain low cytoplasmic Ca2+ levels:
1. A 110-kDa Ca2+-transport ATPase on the endoplasmic reticulum (ER) membrane lowers cytoplasmic calcium.
2. A Na+/Ca2+ exchanger on the cell plasma membrane pumps calcium ions out of the cell.
3. Calcium is also sequestered into the mitochondrial matrix.
Calcium is stored in the ER, bound to several proteins, including calsequestrins and calreticulin. In order to use intracellular calcium gradients as messengers for signaling, cells must employ two further mechanisms:
1. A mechanism that enables a short burst of calcium to be released into the cytoplasm in response to other signals.
2. A mechanism that can ‘read’ these signals and translate them into cellular signals: receptor-operated calcium channels on the calcium stores and proteins that respond to calcium signals cause a cascade of phosphorylation/dephosphorylation reactions that translate into specific activities.
Four groups of calcium channels have been recognized thus far:
1. Voltage-gated calcium channels were discovered in cardiac and neuronal cells; major groups in this category are L- and T-type voltage-dependent calcium channels. Voltage-dependent calcium channels are found in the oocyte plasma membrane.
2. Receptor-operated calcium channels include the N-methyl-D-aspartate (NMDA) receptor found in neuronal tissue and the ATP receptor found in smooth muscle.
3. Second messenger-operated calcium channels include the inositol trisphosphate (IP3) and calcium-induced calcium release (CICR)-activated group of channels.
4. Calcium channels have been found that are sensitive to physical forces and stretching; these may regulate cell size and response to injury.
The IP3 and CICR receptors are of major interest in fertilization and early development.
A receptor-operated calcium channel known as the ryanodine-sensitive calcium release channel (due to its sensitivity to the plant alkaloid ryanodine) was discovered in skeletal muscle. This receptor consists of a tetrameric unit with 450-kDa protein monomers and a ryanodine-binding site, a Ca2+-release channel and a membrane-spanning domain. The activity of the channel is enhanced by caffeine, adenine nucleotides and calcium itself; ruthenium red and procaine act as inhibitors. The ryanodine-sensitive calcium channel is thought to be responsible for IP3-insensitive calcium-induced calcium release in many systems, suggesting a further sensitivity to calcium itself.
The inositol trisphosphate (IP3)-sensitive calcium release channel is found in nonmuscle cells, discovered initially as a channel gated by hormone–ligand interactions on the cell surface. The IP3 receptor is a tetramer of 260-kDa subunits, and IP3 receptors are found in the oocytes of many species, as well as in neurons. The IP3 receptor releases calcium in response to IP3 and related molecules. Heparin is a known inhibitor of IP3-induced calcium release. Interestingly, both the IP3 receptor and the ryanodine channel show a bell-type sensitivity to calcium: the channel is first sensitized, and then desensitized in the presence of increasing amounts of Ca2+. These data suggest that a small amount of calcium release has a positive feedback effect on further calcium release, which eventually stops through both emptying of stores and channel desensitization. This property of calcium channels helps to explain the ‘calcium spike’ phenomenon observed in many cell types.
Oocytes, in common with other cells, have three fundamental calcium release mechanisms:
1. Calcium influx from the external milieu can be regulated through voltage-dependent calcium channels in the plasma membrane.
2. Inositol trisphosphate produced within the cell binds to a receptor-operated calcium channel on the ER, causing calcium to be released from internal stores.
3. The CICR mechanism, in which calcium itself causes a further release of calcium, either by sensitizing the IP3 receptor to IP3-induced calcium release (IICR) or through the action of a third channel, the ryanodine receptor.
The potent calcium-releasing activity of cyclic ADP-ribose (cADPr) was initially discovered in sea urchin oocytes, and in 1989 cADPr was found to be the natural ligand for the ryanodine receptor in nonmuscle cells. cADPr is a metabolite of NAD+ and has now been shown to be an active calcium-releasing metabolite in many species. However, other calcium-releasing metabolites such as NAADP+, derived from NADP+, have also been shown to possess calcium-releasing properties, suggesting that metabolites of nicotinamide form a family of calcium-releasing second messengers. cADPr is produced through the activity of adenosine diphosphate-ribosyl cyclases, which are in turn regulated by levels of cyclic guanosine monophosphate (cGMP), and may be produced in response to hormonal stimulation in some cell types such as pancreatic β-cells.
Calcium transients or ‘spikes’ in the cell cytoplasm were first measured in the 1970s using the calcium-sensitive photoprotein aequorin, which emits light in the presence of calcium, together with calcium-sensitive fluorescent dyes such as Fura-2 and Fluo-3. These probes clearly demonstrated transient increases in intracellular calcium but gave little information on the properties of these spikes. The introduction of two-dimensional cell imaging such as photo-imaging detectors and confocal microscopy allowed the properties of calcium peaks to be observed in many cell types. Calcium spikes are now known to either remain localized to distinct regions of the cell cytoplasm or cross the cytoplasm in the form of a wave. Waves of calcium release are common to many cell types and are especially common in oocytes of many species during fertilization. Interestingly, calcium increases simultaneously throughout the whole cytoplasm in some cells, and different species have distinct mechanisms for the formation of calcium waves or the simultaneous increase in calcium. However, one common feature underlies the calcium transients: they are produced by IP3-induced mechanisms, CICR-induced mechanisms, or both.
Calcium spikes regulate cell activities through the action of two major proteins: calmodulin and calcium/calmodulin-dependent protein kinase II (CaMKII). Calmodulin is a 16-kDa ubiquitous calcium-binding protein that mediates a host of cell processes in many cell types in response to a calcium signal. Most cellular proteins are unable to bind to calcium itself, and therefore calmodulin acts as the primary messenger of calcium signals. Calmodulin has four sites for calcium binding and undergoes a conformational change when calcium is bound. Its function does not depend on all four sites being occupied, suggesting that different levels of cytoplasmic calcium can activate diverse processes. The relevance of calmodulin during the cell cycle is inferred from its spatial localization: during interphase it is localized throughout the cytoplasm but migrates to the mitotic apparatus during M-phase.
The calcium/calmodulin-dependent protein kinases (CaM kinases) are a series of serine/threonine protein kinases activated by calmodulin. These kinases are oligomeric proteins with diverse subunits of approximately 50–60 kDa that form a complex of between 300 and 600 kDa, depending on cell type. The CaM kinase is characterized by a ‘memory’ effect, i.e., the activation of the kinase outlasts the presence of Ca2+/calmodulin. Two types of CaM kinase exist:
1. Specialized CaM kinases, e.g., myosin light chain kinase (MLCK), involved in muscle contraction.
2. Multifunctional CaM kinases, e.g., CaM kinase II, involved in a variety of cellular processes; it is relatively nonspecific for substrates, leading to the question of how it can manage to organize specific cell processes.
A calcium spike can trigger specific activities in a cell. The calcium spike at fertilization is probably designed to be a large, nonspecific signal that causes several major effects including degrading cell cycle blocks, release of cortical granules, upregulation of metabolism, decondensation of the sperm head and activating a developmental program. However, calcium spikes can also trigger specific mechanisms, often achieved through spatial localization of calcium signals, e.g., in sea urchin oocytes after activation. Localized calcium spikes appear in many classes of cells such as neurons, pancreatic cells and embryos.