genetics II final notes

The document provides an overview of key genetic concepts including chromosomes, genes, alleles, and DNA structure. It discusses various experiments that established DNA as the genetic material, methods of DNA manipulation like cloning and sequencing, and the processes of mutation and recombination. Additionally, it covers gene expression, RNA processing, and techniques for studying DNA-protein interactions.

Uploaded by

keweyik825
Copyright
© © All Rights Reserved

Chapter 1

Chromosome: a distinct unit of the genome carrying many
genes, composed of a very long duplex DNA molecule only
visible during cell division.
Structural gene: a gene that codes for an RNA or any
polypeptide other than a regulator.
Allele: one of several alternative forms of a gene occupying a
given locus on a chromosome.
Locus: The position on a chromosome where the gene for a
particular trait resides; it may be occupied by any one of the
alleles for the gene.
Genetic recombination: a process by which separate DNA
molecules are joined into a single molecule due to such
processes as crossing over or transposition.

Transformation experiments (Griffith):
● Provided the first evidence that a transferable substance,
later identified as DNA, is the genetic material of bacteria.
● Mice were injected with live S, live R, heat-killed S, or
live R + heat-killed S bacteria.
○ S bacteria are lethal, while R bacteria are not.
● Live S = dead mice, live R = living mice, live R
+ heat-killed S = dead mice, heat-killed S = living mice.
● This showed that there must be a “transforming principle”
that converts live R bacteria into S bacteria when
heat-killed S bacteria are present.
○ DNA was later found to be the transforming agent.

Phage studies:
● Phages labeled with radioactive 35S (protein label) or 32P
(DNA label) infected bacteria.
● The infected bacteria had mostly the 32P label, while the
discarded coats had the 35S label.
● The progeny phages were found to have mostly the 32P label
but not the 35S label.

Nucleoside: purine or pyrimidine linked to 1’ carbon of a
pentose (ribose or deoxyribose).
● Ribose: 2’ -OH
● Deoxyribose: 2’ -H

Nucleotide: nitrogenous base + pentose + phosphate group
linked to either the 3’ or 5’ carbon of the pentose of the
nucleoside
Closed DNA molecules:
● Supercoiling: the coiling of a closed duplex DNA around its
own axis in space.
● Closed DNA is either circular, or linear with its ends
anchored so that they cannot rotate freely.
● Linking number (L) = twist (T) + writhe (W)
○ Linking number = the number of times one strand of
DNA passes over the other.
○ Twist = number of turns of the double helix, usually
stable. Determined by number of base pairs per turn.
○ Writhe = coils and supercoils of the DNA
○ Linking number can only be changed by breaking and
reforming bonds in the DNA backbone.
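
The relation L = T + W can be checked with a little arithmetic. The sketch below is only an illustration: the plasmid size and linking number are hypothetical values, and it assumes the twist stays at its relaxed value of 10.4 base pairs per turn so that any deficit in L shows up as writhe.

```python
# Sketch: bookkeeping for L = T + W in a closed DNA molecule.
BP_PER_TURN = 10.4  # relaxed B-DNA in solution, as stated in the notes


def relaxed_twist(n_bp: int) -> float:
    """Number of helical turns in a relaxed duplex of n_bp base pairs."""
    return n_bp / BP_PER_TURN


def writhe(linking_number: float, twist: float) -> float:
    """Writhe follows from L = T + W once L and T are known."""
    return linking_number - twist


n_bp = 5200                   # hypothetical plasmid size
twist = relaxed_twist(n_bp)   # about 500 turns
linking = 475                 # hypothetical linking number, 25 below relaxed
print(f"T = {twist:.0f}, L = {linking}, W = {writhe(linking, twist):.0f}")
```

A negative writhe here corresponds to negative supercoiling; changing `linking` would require breaking and rejoining the backbone, as noted above.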

DNA as a double helix
● Diameter of the double helix is 20 Å.
● There is a complete turn every 34 Å, with 10 base pairs per
turn (about 10.4 base pairs per turn in solution).
● The double helix has a major and minor groove.
● The helix is overwound if B-DNA has more than 10.4 base
pairs per turn, and underwound if it has fewer than 10.4.

Semiconservative replication: DNA replication takes place by
the parental DNA molecule getting unzipped and each parental
strand being copied using complementary base pairing to form
two new DNA molecules, each with one parental and one new
strand.
● Meselson-Stahl experiment showed that this is the case:
○ The initial DNA was labeled with the heavy isotope 15N;
as the DNA replicated in light (14N) medium, its
density was measured by density-gradient
centrifugation, which pointed to a semiconservative
replication model.

Endonuclease: cleaves a bond within a nucleic acid chain; it may
attack one or both strands of a DNA duplex
Exonuclease: removes nucleotides one at a time by cleaving the
last bond at the end of a polynucleotide chain.

Hybridization: the ability of single-stranded DNAs to hybridize
is a measure of their complementarity.
● FISH: labeling parts of a chromosome with fluorescently
labeled nucleic acid probes.
● Can be intermolecular or intramolecular

Re-annealing kinetics
● Shear, denature, and renature DNA; measure the OD at 260 nm
to follow the rate of reassociation.
○ The OD at 260 nm reflects the ratio of ssDNA to dsDNA
(ssDNA absorbs more strongly).
● Depends on DNA concentration and sequence complexity
○ Higher DNA concentration = higher rate
○ Smaller sequences reassociate earlier.
○ Fragments containing repeats will reassociate earlier
than unique sequences.
● On a graph, renaturation is complete when the relative
absorption reaches 1.00.
○ Smaller and repeat sequences will reach 1.00 earlier
compared to others.
● $C/C_0 = 1/(1 + k C_0 t)$ and $C_0t_{1/2} = 1/k$
○ C = concentration of ssDNA at time t; C0 = initial
concentration of ssDNA; t = time in seconds
○ k = reassociation rate constant
● C0t1/2 values are lower for highly repetitive and shorter
sequences.
○ Usually non-coding DNA is highly repetitive and has
low C0t values.
○ 50% of DNA is unique (highest C0t values, takes
longest to reanneal)
○ 30% has moderate repeats
○ 20% is highly repetitive
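
The kinetic equation above can be sketched numerically. The rate constants below are hypothetical values chosen only to separate the three classes; the code simply evaluates C/C0 = 1/(1 + k·C0·t) and confirms that half of the DNA is still single-stranded at C0t1/2 = 1/k.

```python
# Sketch of second-order reassociation kinetics: C/C0 = 1 / (1 + k*C0*t).


def fraction_single_stranded(k: float, c0: float, t: float) -> float:
    """Fraction of DNA still single-stranded after time t."""
    return 1.0 / (1.0 + k * c0 * t)


def cot_half(k: float) -> float:
    """C0t value at which half of the DNA has reassociated."""
    return 1.0 / k


# Hypothetical rate constants: a larger k means faster reassociation.
components = {"highly repetitive": 1e3, "moderately repetitive": 1.0, "unique": 1e-3}
c0 = 1.0  # hypothetical initial ssDNA concentration

for name, k in components.items():
    half = cot_half(k)
    t_half = half / c0  # time needed to reach C0t1/2 at this concentration
    frac = fraction_single_stranded(k, c0, t_half)
    print(f"{name:>22}: C0t1/2 = {half:g}, fraction ssDNA = {frac:.2f}")
```

Under these assumed constants the highly repetitive component has the lowest C0t1/2 and the unique component the highest, matching the ordering described in the list.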

Corrected C0t value of a genome = C0t1 * fraction of DNA with
C0t1 + C0t2 * fraction of DNA with C0t2 + …

Mutations: changes in DNA sequences
● Can occur spontaneously or be induced by a mutagen
● Can be beneficial, neutral, or harmful; accumulation of
mutations in certain genes can lead to cancer.
Point mutation: change of a single base pair.
Polymorphism: change in a DNA sequence that is prevalent in
the population
● Some polymorphic genes have several abundant
alleles so that there is no single WT allele, such as the
galactosyltransferase gene determining the A, B, O blood
group system.
Transition: a purine replaced by another purine, or a pyrimidine
by another pyrimidine (for example, A-T to G-C)
Transversion: a purine replaced by a pyrimidine or vice versa
(for example, A-T to T-A)

Forward mutation: alters the function of the gene
Back mutations (revertants): restore the original function of
the gene
● Insertions can be reverted by deletions, but deletions cannot
be reverted because that DNA is lost.
● True reversion: restoring the original sequence of DNA.
● Second-site reversion: a second mutation suppresses the
effect of the first mutation in a gene.
● Eukaryotic genome is redundant, thus there is a greater
chance of reversion.
Suppression: when a mutation in a second gene bypasses the
effect of mutation in the first gene.

Hotspot: portions of the genome in which the frequency of
mutations is at least 10-fold higher than in the rest of the
genome.
● Most common cause: deamination of 5-methylcytosine,
which is converted directly to T. (Deamination of
unmethylated C gives U instead, which is recognized and
removed by repair.)
● Hotspots also occur at short tandem repeats.

Heteromultimer: a molecular complex composed of different
subunits
Homomultimer: a molecular complex composed of identical
subunits.

One gene : one enzyme hypothesis: the old belief that each
gene codes for a single enzyme.
● Not true due to:
○ Not all genes code for enzymes
○ Alternative splicing allows the same gene to code for
different proteins.
One gene : one polypeptide: the updated version of the above
hypothesis
● Not all genes encode polypeptides, some encode only
RNAs like tRNAs or structural/regulatory RNAs.

Mutations tend to increase with age:
● Embryos carry very few mutations. As people age, mutations
may accumulate and lead to cancer.
● In diploid organisms, one copy of the gene may be mutated
and the other one can remain intact. In this case, usually,
the mutation is recessive to the WT gene and the
functionality remains.

Complementation test: a test to determine whether two
mutations are in the same gene.
● Cross two strains carrying different recessive mutations
that give the same phenotype.
○ If the mutations are in different genes: progeny have
the WT phenotype because each parent supplies a WT
copy of the gene that is mutated in the other.
○ If the mutations are in the same gene: failed
complementation, progeny have the mutant phenotype.
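
The logic of the test can be written down as a small sketch. It assumes each parent is homozygous for recessive loss-of-function mutations, and the strain and gene names are hypothetical: the progeny is wild type only if no gene is mutated in both parents.

```python
# Sketch of complementation-test logic for recessive mutations.


def complements(mutated_in_a: set, mutated_in_b: set) -> bool:
    """True if the progeny of the cross is expected to be wild type.

    Each argument is the set of genes mutated (homozygous, recessive)
    in one parent. The progeny keeps a WT copy of a gene unless that
    gene is mutated in both parents.
    """
    return mutated_in_a.isdisjoint(mutated_in_b)


# Hypothetical mutant strains and the gene each one affects.
strains = {"m1": {"geneA"}, "m2": {"geneA"}, "m3": {"geneB"}}

for x, y in [("m1", "m2"), ("m1", "m3")]:
    if complements(strains[x], strains[y]):
        outcome = "WT progeny: mutations are in different genes"
    else:
        outcome = "mutant progeny: mutations are in the same gene"
    print(f"{x} x {y}: {outcome}")
```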

Loss of function mutations: usually recessive mutations
Gain of function mutations: usually dominant mutations
Null mutation: a mutation that completely eliminates the gene’s
function, required to test the gene’s function. Knockouts.
Silent mutations: no phenotypic effect
● Either the change in polypeptide sequence has no effect, or
the change in bases does not lead to a change in the amino
acid sequence due to redundancy of the genetic code.
Neutral substitutions: Substitutions in a protein that cause
changes in amino acids that do not affect activity.

Recombination
Due to crossing over that occurs at chiasmata in meiosis,
chromosomes can exchange portions, which leads to further
variation in offspring.
Recombination is also a DNA repair mechanism in which DNA
breaks can be repaired if there is readily available duplex DNA
that is complementary to the region.
Recombination frequency depends on the distance between the
two genes:
● If genes are very close to one another, recombination
between them is rare and they are described as linked
genes.
● If genes are very far away from each other, their distance is
no longer proportional to recombination frequency because
multiple crossovers can occur between them on the same
chromosome; the measured frequency cannot exceed 50%.

Genetic code: the genetic code is read in triplet nucleotides
named codons. Codons are non-overlapping and have a fixed
starting point.
● Acridines: mutagens acting on DNA that cause the insertion
or deletion of a single base pair, causing a frameshift.

There are three possible reading frames on each strand, but
usually only one of them is translated. The other reading
frames are blocked by frequent termination codons.
● An open reading frame is a reading frame that can be
translated. It starts with an initiation codon and ends with
a termination codon.
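
Scanning the three forward reading frames for an ORF can be sketched as follows. This is a minimal illustration on a made-up sequence: it looks only at the given strand, uses ATG as the only initiation codon, and the `min_codons` cutoff is an arbitrary assumption.

```python
# Sketch: find ATG...stop open reading frames in one forward frame.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orfs_in_frame(seq: str, frame: int, min_codons: int = 2) -> list:
    """Return (start, end) index pairs of ORFs in the given frame (0, 1 or 2).

    `end` is the index just past the termination codon.
    """
    orfs = []
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i                      # first initiation codon opens the ORF
        elif start is not None and codon in STOP_CODONS:
            if (i - start) // 3 >= min_codons:
                orfs.append((start, i + 3))
            start = None                   # the frame is closed again
    return orfs


seq = "CCATGAAATTTGGGTAGCC"  # hypothetical sequence with one ORF
for frame in range(3):
    print(f"frame {frame}: {orfs_in_frame(seq, frame)}")
```

Only one of the three frames of this example contains an ORF; the other two contain no initiation codon followed by an in-frame stop.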

Each mRNA consists of:
5’ UTR (leader sequence) – coding region (ORF) – 3’ UTR
(trailer sequence)

Chapter 2
Restriction endonuclease: an enzyme that recognizes short,
specific sequences of DNA and cleaves the duplex either at the
recognition site or at a defined site nearby (type IIS)
● Found initially in bacteria as a defense mechanism against
phage genomes.
● Used to cleave DNA into defined fragments; restriction maps
can be generated from the overlapping fragments produced by
cutting with different restriction enzymes.
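
How a restriction digest yields defined fragments can be sketched in a few lines. The example uses EcoRI (recognition site GAATTC, cut after the first G) on a made-up linear sequence; it tracks only the top strand and ignores the sticky-end overhangs.

```python
# Sketch: fragment lengths from digesting a linear sequence.


def digest(seq: str, site: str, cut_offset: int) -> list:
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the position of the cut within the recognition site
    on the top strand (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


ECORI_SITE, ECORI_OFFSET = "GAATTC", 1
seq = "AAAAGAATTCTTTTTTGAATTCCC"  # hypothetical sequence with two sites
print(digest(seq, ECORI_SITE, ECORI_OFFSET))
```

Comparing the fragment lists from single and double digests with different enzymes is the basis of the restriction maps mentioned above.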

Cloning vector: DNA derived either from a plasmid or a
bacteriophage genome that can be used to propagate an inserted
DNA sequence in a host cell
● Vectors contain selectable markers that help the researchers
detect whether it was taken in by a cell.

Nucleases: enzymes that hydrolyze an ester bond within a
phosphodiester bond.
● Endonucleases cleave phosphoester bonds within DNA.
● Exonucleases cleave one phosphoester bond at a time from
the end of the DNA (3’ or 5’ specific).
Phosphatases: enzymes that hydrolyze the ester bond in a
phosphomonoester bond.

Recombinant DNA: a DNA molecule that is composed of DNA
from at least two different sources.
Subclone: The process of breaking a cloned fragment into
smaller fragments for further cloning.
Multiple cloning site (MCS): a part of a vector that contains
a series of closely spaced recognition sites for
different restriction enzymes. The insert can be inserted here.
Transformation: intake of nonviral, exogenous DNA from the
environment, occurs mostly in bacteria and yeast that are
competent.
Blue-white selection: a method to determine whether bacterial
colonies carry the vector with the desired insert.
● The vector has the lacZ gene; in the presence of X-gal, the
β-galactosidase enzyme produced by lacZ cleaves X-gal
and gives a blue color. The MCS usually lies within the
lacZ gene, so if an insert is in the vector, lacZ is
disrupted and colonies appear white.
● White colonies are the positive colonies.

Cloning vectors may be bacterial plasmids, phages, cosmids, or
yeast artificial chromosomes.

Reporter genes: genes with easily detected products (such as
fluorescent proteins) that can be used to measure
tissue-specific expression or promoter activity

Methods of DNA delivery into cells:
● Infection with viruses
● Liposomes
● Microinjection
● Nanospheres can be shot into the cell with a gene gun.

Probe: a labeled (for example, radioactive) nucleic acid used
to identify a complementary fragment
● Autoradiography: a method of capturing an image of
radioactive materials on film.
● In situ hybridization: hybridization of a probe in intact
tissue with its complementary fragment.

DNA separation techniques
● Gel electrophoresis: by size
● Density-gradient centrifugation: by density
DNA sequencing
● Sanger sequencing and capillary electrophoresis: the DNA
template, a primer, and DNA polymerase are supplemented with
dNTPs and a small concentration of fluorescently tagged
ddNTPs. When a ddNTP is incorporated into the growing
strand, synthesis terminates because the ddNTP lacks a
3’-OH, so fragments of different sizes form. These are
separated by size with capillary electrophoresis, and the
terminal ddNTP of each fragment is identified from its
fluorescence.
● Next generation sequencing: a method that increases
automation and decreases time spent. Important for
sequencing whole genomes.

PCR: exponential amplification of a desired DNA sequence
determined by primers that anneal to the sequence of interest.
● Uses repeated cycles of denaturation, primer annealing, and
extension to amplify DNA.
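
The exponential nature of PCR can be made concrete with a short calculation. This is an idealized sketch: the `efficiency` parameter (fraction of templates copied per cycle) is an assumption added for illustration, and real reactions plateau as reagents run out.

```python
# Sketch: idealized PCR yield after a number of cycles.


def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Copies after `cycles`, assuming each cycle multiplies by (1 + efficiency)."""
    return initial_copies * (1.0 + efficiency) ** cycles


for cycles in (10, 20, 30):
    print(f"{cycles} cycles: {pcr_copies(1, cycles):.3g} copies from one template")
```

With perfect doubling, 30 cycles turn a single template into roughly a billion copies.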
RT-PCR: use of reverse transcriptase to generate DNA (cDNA)
from RNA to use in PCR.
● When combined with quantitative (real-time) PCR, this
technique is used to measure the amount of RNA (as cDNA)
present in a sample and is informative of the expression
level of a gene of interest.

Southern blotting
● DNA is run in a gel and transferred to a membrane.
● The membrane-bound DNA is hybridized with a
radioactive-labeled probe.
● On the membrane, the DNA fragments bound by the labeled
probe are visualized by autoradiography.

Northern blotting: RNA instead of DNA, similar to Southern
blot
Western blotting: separation of proteins in an SDS gel and
transfer of proteins to a membrane, which are later targeted and
visualized by antibodies.
Epitope tag: A short peptide sequence that encodes a
recognition site (“epitope”) for an antibody, typically fused to a
protein of interest for detection or purification by the antibody.

DNA microarrays
● Uses “spots” of known DNA sequences; sample DNA is
added to these spots to see whether hybridization occurs.
● Immobilized spots form an array
● Radioactively labeled sample DNA is added on the spots
and hybridization occurs.
● If hybridization occurs in that spot, then the radioactive
label can be visualized with autoradiography.
● Now, fluorescence labeling has taken the place of
radioactive labeling.

Gene expression profiling: a type of DNA microarray that is
used to determine the relative expression of various genes.
DNA microarrays are also used for SNP detection and copy
number changes.

Chromatin immunoprecipitation (ChIP): a technique used to
identify specific protein-DNA interactions in vivo.
● ChIP on chip/ ChIP-seq is used to identify all sites of the
genome that can be bound by that protein.
○ In the cells, proteins are crosslinked to DNA, the
chromatin is fragmented, and the protein is then
targeted with beads conjugated to antibodies. The
beads are then precipitated, pulling down the
protein-DNA complex as well.

Transgenics: Organisms created by introducing DNA prepared
in test tubes into the germline.
Cre/lox system: used to make inducible knockouts or knockins
● Knockout: gene function is eliminated by usually replacing
most of the coding sequence with a selectable marker in
vitro and transferring the altered gene to the organism by
homologous recombination.
● Knockin: introduction of more subtle mutations in a method
similar to knockout.

Chapter 3

Interrupted gene: a gene in which coding sequence is not
continuous due to the presence of introns.
Primary (RNA) transcript: the original, unmodified RNA
product corresponding to a transcription unit.
RNA splicing: the process in which introns are removed and
exons are spliced back together to form a continuous mRNA.

Intron: a segment of DNA that is transcribed but later removed
and not included in the mature mRNA product.
● Mature transcript also includes modifications in the 5’ and
3’ ends.
● Mutations in introns can affect RNA processing and
therefore affect the polypeptide product. But they may not
have an effect either.

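Chargaff’s Rules
● First parity rule: in duplex DNA, the amount of A equals the
amount of T and the amount of C equals the amount of G,
because A pairs with T and C pairs with G.
● Second parity rule: to a close approximation, the amount of
A is roughly equal to the amount of T, and C to G, within a
single strand as well.
○ This reflects the potential to form stem-loops; introns
have more structured stem-loop segments.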
● Cluster rule: purines tend to be clustered in one strand of
the duplex DNA, the non-template strand.
● GC content of the genome is a species-specific
characteristic.
○ Exons usually have greater GC content.
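
Base composition, GC content, and purine loading of a single strand are easy to compute, which makes the rules above simple to check on real sequence. The sketch below uses a short made-up strand; a sequence this small will not obey the second parity rule, which only holds approximately over long stretches.

```python
# Sketch: single-strand base composition, GC content and purine fraction.
from collections import Counter


def base_composition(strand: str) -> dict:
    """Fraction of each base (A, C, G, T) in one strand."""
    counts = Counter(strand.upper())
    total = sum(counts[base] for base in "ACGT")
    return {base: counts[base] / total for base in "ACGT"}


def gc_content(strand: str) -> float:
    comp = base_composition(strand)
    return comp["G"] + comp["C"]


def purine_fraction(strand: str) -> float:
    """Fraction of A + G; values above 0.5 indicate purine loading."""
    comp = base_composition(strand)
    return comp["A"] + comp["G"]


strand = "AAGGAGTC"  # hypothetical purine-rich strand
print(base_composition(strand))
print(f"GC content = {gc_content(strand):.2f}, purines = {purine_fraction(strand):.2f}")
```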

Introns’ organization can be conserved across species. Their
organization can be observed by restriction digestion, electron
microscopy and sequencing experiments.
● The lengths of introns may vary greatly but their locations
are mostly conserved.

Negative selection: selection that ensures the stability of
organisms by eliminating deleterious mutations. It causes the
elimination of newly-arising but dysfunctional variants.
● While the sequences of exons are heavily conserved, the
sequences of introns are not under such a pressure, so
introns are mutated much more frequently.
Positive selection: a new, more advantageous variant is selected
for. It causes diversity.
● In these cases, due to genomic properties like stem-loops,
introns evolve more slowly compared to exons. So, introns
are conserved.
cDNA: complementary DNA synthesized in vitro by reverse
transcriptase from an RNA template

Different eukaryotic species have different intron counts and
lengths.
● Yeast have no introns in most genes, whereas most genes of
animals, including humans, do.
● Exon lengths are similar across these species.
● Introns are usually short in unicellular or less complex
eukaryotes, but their size reaches many kb in more complex
eukaryotes.
○ Overall length of a gene is mostly determined by its
introns.

Overlapping gene: a gene in which a part of its sequence is
found in another gene.

Alternative splicing: the exons of one pre-mRNA can be joined
in different ways, producing polypeptides that differ by the
presence or absence of certain regions.
● Some exons can be excluded in some splicings.
● Same pre-mRNA gives rise to different polypeptides.

Domain: an independent functional module of a protein, which
can correspond to exons.
● Proteins with similar domains have exon homology as well.

Gene family: A set of genes within a genome that encodes
related or identical proteins or RNAs.
● The members arose due to duplication events. The copies
then faced divergent evolution as they accumulated
different mutations.
● Members are related but usually not identical.
● Globin genes: all have three exons and two introns. This
shared organization indicates they arose from a single
ancestral gene.
● Intron positions are widely different in actin genes,
indicating there is no correspondence between intron
positions and function in this case.
Superfamily: a set of genes all related by presumed common
ancestry but now showing variation.

Genes convey information not only related to the conventional
phenotype but also related to the genomic “phenotype”.
● There are various pressures that are sometimes in
competition with each other as well, namely GC pressure,
fold pressure, purine-loading pressure. These pressures can
cause less optimal amino acids to be coded for the sake of
transmission.
● Order of the genome is important for developing organisms.
This positional information is used for correct
development.
● Sequence is not the only information contained in DNA.

Chapter 4
Open reading frame: a region of DNA with uninterrupted
codons that has the chance of being translated into a protein.
CpG islands: CpG-rich regions of the genome that often mark
promoters and are targets of methylation.

Types of DNA markers
● Single nucleotide polymorphism (SNP)
○ Occurs in 1/1000 bp in noncoding DNA, 1/3000 in
coding DNA. More common in introns.
○ One-nucleotide differences whose variant allele is
present in at least approximately 1% of the population.
○ Used to profile diseases and as genome markers.
○ SNP-chips are used to detect SNPs.
■ SNP-chips carry short oligonucleotide probes; the
hybridization signal of the labeled sample
indicates whether the sample is homozygous or
heterozygous, or its copy number.
○ GWAS show that some SNPs are more
commonly found in people with certain disorders or
characteristics of interest.
● Restriction-fragment length polymorphism (RFLP)
○ Variation in sites recognized by restriction enzymes
○ The presence of the polymorphism creates a different
fragment (size and number of fragments), so its
presence can be detected after restriction digestion.
■ Detection using Southern blot
○ Was used frequently before sequencing.
○ Codominant
● Tandem repeat polymorphism: 50% of the genome is
repetitive and there is variation in the length of these
repeats.
○ These repeats are amplified by PCR and the number of
repeats is indicated by the length of the PCR product
observed on gel.
○ Two types of tandem repeats: SSR and VNTR
■ SSR: consists of 2-9 bps, highly polymorphic
regions
■ VNTR: 10-60 bps, used especially in DNA
typing, fingerprinting, forensics applications
● Copy number polymorphism (CNPs): extra or missing
copies of genomic segments 1 kb to 1 Mb in size. They were
first detected with DNA microarrays and later by sequencing.
○ In DNA microarray, fluorescence intensity indicates
copy number.
○ CNPs are located near the coding part of DNA.
Genome: the complete set of sequences in the genetic
information of an organism.
Transcriptome: the complete set of RNA present in a cell or an
organism.
Proteome: the complete set of proteins present in a cell that is
expressed by the entire genome.
Interactome: the complete set of protein complexes and
protein-protein interactions present in a cell.

Linkage maps: maps showing the recombination frequency
between different markers
Restriction maps: physical distances between markers

Haplotype: a particular combination of alleles in a defined
region of a chromosome
DNA profiling: using tandem repeats to find common
inheritance patterns or to characterize people’s differences

Nonrepetitive and repetitive DNA: nonrepetitive DNA usually
contains the coding DNA while the repetitive DNA usually
doesn’t.
● Three classes: nonrepetitive, moderately repetitive, highly
repetitive
○ Nonrepetitive DNA proportion is a good marker of
organism complexity, a better indicator than genome
size.
○ Prokaryotes’ entire genome consists of nonrepetitive
DNA. This is 50% in humans.
● Middle repetitive DNA: composed largely of transposons
○ Transposon: segments of DNA that can move around
to different positions
○ Also includes tandemly repeated genes for:
■ rRNA
■ Histone proteins
■ tRNA
○ These genes are kept in tandem repeat arrays and all
products are synthesized together to meet the demands
of the cell.
● Highly repetitive DNA: also called satellite DNA
○ Can be located with FISH in metaphase chromosomes.
○ Usually found in long tandem arrays around the
centromere, in the heterochromatin region.
○ They can also be dispersed throughout the genome:
■ LINE: they have two ORFs (endonuclease and
reverse transcriptase) and a promoter for RNA
pol II. They are able to transpose autonomously.
■ SINE: they are shorter than LINEs and don’t code
for any genes themselves. They can’t transpose
autonomously.
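● Alu is the most abundant SINE family in the human
genome; Alu elements are dispersed throughout the
genome rather than concentrated at the centromere.
● Human DNA transfected into mouse cells can be
detected with Alu-specific probes that lack the parts
of the sequence similar to the mouse genome.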

Conserved regions indicate which regions can be the coding
regions.

Synteny: homologous genes occur in same order in different
species
Expressed sequence tag (EST): A short sequenced fragment of
a cDNA sequence that can be used to identify an actively
expressed gene.

Mitochondria (and chloroplasts) have their own DNA, which is
usually maternally inherited.
● Their genes are called extranuclear genes
● mtDNA is used for inheritance studies.
● Not all proteins in the organelle are encoded in the
organelle’s genes.
● D-loop: a region in animal mtDNA that is variable in size
and sequence and contains an origin of replication.

Chapter 5
Mycoplasma has about 470 genes, among the smallest gene
numbers known for a cellular organism. Free-living bacteria
have around 1500 genes.

In contrast, smallest eukaryotic genomes have 5300 genes.

Plants and animals have around 25000 genes.

Prokaryotic genomes
● 85-90% of their genome consists of coding DNA.
● Genome size is proportional to number of genes.
● Prokaryotes with genomes smaller than 1.5 Mb are parasites
that depend on eukaryotic hosts. In particular, they lack
enzymes needed for metabolic processes.

Pathogenicity islands: DNA regions present in pathogenic
bacteria but absent in others. They have different GC content
and most likely transfer to other bacteria via horizontal
transfer.

Eukaryotic genomes
● The relationship between genome size and number of genes
is weaker.
● There is no relationship between complexity and gene
number. Plants tend to have more genes than animals due
to ancestral duplications.
● Unicellular eukaryotes have genomes similar in size to those
of some prokaryotes.
○ Yeast have 70% coding DNA.
○ To determine whether a piece of DNA is a functional
unit, its homology with related organisms’ genomes
is tested. If the gene is functional, an orthologue
should also be present in related species’ genomes.

Monocistronic mRNA: mRNA that encodes only one
polypeptide (vs. polycistronic mRNA)

With increasing genome size:
● Proportion of unique genes declines
● Proportion of genes in families increases

Number of types of genes = number of unique genes + number
of gene families
● This number is similar across multicellular eukaryotes even
if their number of total genes is different.

Proteome can be different from the total genes:
● Duplicated genes coding for the same polypeptide
● Some genes can produce more than one polypeptide

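Orthologs: homologous genes found in different species that
descend from a single gene in their common ancestor; in
practice often recognized by high (for example, greater than
80%) sequence similarity.
Paralogs: set of homologous genes in the same organism that
arose by duplication and have since diverged.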
Pseudogenes: copies of genes that have become nonfunctional
due to accumulation of mutations. Some can act as targets of
regulatory miRNAs.

Human genome
Only 1% of it consists of exons. Exons comprise about 5% of
each gene. Introns make up around 25% of the genome.
The human genome has fewer genes than expected because of
alternative splicing, alternate promoter selection and post-
translational modifications accounting for some proteins.
● 60% of human genes are alternatively spliced, of which
80% alter the protein sequence.
Most genes are in the euchromatin, rather than the
heterochromatin.
The current estimate is that humans have 20,000 genes.
Many human genes belong to gene families that have members
in other eukaryotes.

Distribution of genes
Genes are not uniformly distributed along chromosomes.
There are five classes of repetitive sequences that in total
account for 50% of the genome:
● Transposons: they are of viral origin and are able to insert
and duplicate themselves anywhere in the genome. Most
are nonfunctional, but some functional genes have evolved
from transposons and later lost their ability to transpose.
● Processed pseudogenes
● Simple sequence repeats (like CA repeats)
● Segmental duplications: duplication of a block of the
genome, often followed by translocation of the copy to
another site. (Tetraploidy in plants arises instead from
duplication of the whole genome.)
● Tandem repeats: found especially in telomeres and
centromeres.

Y chromosome
X and Y have descended from a homologous, autosomal pair of
chromosomes.
● X has retained most of the original genes while Y lost most
of them and now has genes that mostly concern male
characteristics and testis development.
● Most of the Y chromosome does not cross over with the X.
Parts in the Y chromosome:
● Ampliconic regions: repeats that contain some protein-
coding genes
● X-transposed regions: they are largely inactive, containing
only two active genes.
● X-degenerate regions: regions that were once homologous
with the X chromosome but have accumulated many
mutations and deletions ever since.
● Pseudoautosomal regions: regions that align with the X
chromosome; crossing over occurs here due to the homology.
Mutations here may lead to infertility.
○ The remainder of the Y, which does not recombine with
the X, is called the male-specific region.
● Centromere and heterochromatin
Presence of many copies in the Y chromosome allows for the
copies to undergo recombination to compensate for the lack of
homology between Y and other chromosomes.

Essential genes
Only a minority of genes are essential, meaning there is a
detectable effect upon the gene’s deletion.
● Even a small disadvantage that cannot be seen in the
phenotype can be disadvantageous for the organism,
leading to the conservation of a seemingly non-essential
gene.
We can do complementation assays to test essential genes.
Some genes may be redundant so that mutation in one can be
compensated for by the other intact copy.
The cell may have two distinct biochemical pathways that give
the same end product as well.
● Synthetic lethality: deletion of either gene is not lethal by
itself but deleting both genes is lethal.
● Genetic load of accumulating mutations is a cost.

Synthetic genetic array analysis (SGA): An automated


technique in budding yeast whereby a mutant is crossed to an
array of approximately 5000 deletion mutants to determine if the
mutations interact to cause a synthetic lethal phenotype.
● Every tested gene had at least one partner.
● This shows that natural selection favors the presence of
these “partners” and is explanatory of why we see so few
essential genes when we mutate just one gene.

Gene expression levels
At any given time, about 1% of DNA is expressed as mRNA
Abundance: average number of molecules of each mRNA per
cell

We can divide mRNA into two classes:
● Abundant mRNA: present in 1000-10000 copies per cell,
around half of the mRNA mass. Consists of a small number
of different mRNA species, each present in many copies.
● Scarce (complex) mRNA: a large number of different
sequences, each represented by only a few copies per cell.
90% of scarce mRNA overlap between different tissues.

Housekeeping (constitutive) gene: genes that are expressed all
the time in all cells of an organism
Luxury gene: genes needed only for particular cell phenotypes
with specialized functions

DNA microarray studies found that around 75% of the yeast
genome is expressed at all times.

Evolution of a DNA sequence
Biological evolution relies on generation of variation and how
that variation is then sorted.
● Variation is the result of mutations.
Transition mutations occur more commonly than transversion
mutations, even though each base has two possible
transversions and only one possible transition
● Transitional errors occur more frequently.
● Transversion errors are more frequently corrected by DNA
repair mechanisms because they lead to major DNA
distortions.
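
The counting argument above (one possible transition and two possible transversions per base) can be checked directly. The sketch below classifies substitutions by purine/pyrimidine class and tallies them between two aligned sequences; the example sequences are made up.

```python
# Sketch: classify base substitutions as transitions or transversions.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Transition if both bases are purines or both are pyrimidines."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_counts(seq1: str, seq2: str) -> dict:
    """Count transitions and transversions between two aligned sequences."""
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq1, seq2):
        if a != b:
            counts[classify_substitution(a, b)] += 1
    return counts


# Every base has one possible transition and two possible transversions.
for base in "ACGT":
    kinds = [classify_substitution(base, other) for other in "ACGT" if other != base]
    print(base, kinds.count("transition"), kinds.count("transversion"))

print(ts_tv_counts("AACCGGTT", "GACTGCTA"))  # hypothetical aligned pair
```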

Synonymous mutation: a mutation in the coding region that
does not change the encoded amino acid sequence. This is a
subtype of silent mutations, which include noncoding region
mutations as well.
● Wobble effect
Nonsynonymous mutation: a mutation that alters the amino
acid sequence either by creating a missense codon or by
introducing an early stop codon (nonsense mutation)
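
These categories can be illustrated with the standard genetic code. The sketch below builds the standard codon table from a compact string (bases in TCAG order) and labels a single-codon change; the "stop-loss" label for a changed stop codon is an addition for completeness, not a term from these notes.

```python
# Sketch: classify a codon change using the standard genetic code.
from itertools import product

BASES = "TCAG"
# Amino acids for codons in TCAG order (third base varies fastest); * = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref: str, alt: str) -> str:
    """Label a codon substitution by its effect on the encoded amino acid."""
    ref_aa, alt_aa = CODON_TABLE[ref], CODON_TABLE[alt]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop-loss"
    return "missense"


for ref, alt in [("GAA", "GAG"), ("GAA", "GAT"), ("TAC", "TAA")]:
    print(f"{ref} -> {alt}: {classify_codon_change(ref, alt)}")
```

Most synonymous changes fall at the third codon position, which is where the wobble effect applies.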

Mutations in noncoding regions can also be selected for. They
can alter the regulation of genes if the mutation is in a regulatory
sequence or they can alter DNA structure and hence gene
regulation.
● Neutral mutation: mutations in noncoding regions that
have no effect on the phenotype of the organism.

Genetic drift: random changes in the frequency of a mutational
variant. Random fluctuations even out in large populations, so
the effects of drift are hard to observe there.
● In small populations, new mutations can be eliminated by
chance and frequencies of mutations are more subject to
change.
● Fixation: a new allele replaces the old predominant allele
● Mutations are eventually subject to natural selection.
Dominant alleles are exposed to selection sooner.
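
Drift can be illustrated with a toy Wright-Fisher style simulation. This is a sketch under simplifying assumptions (a diploid population of constant size, random sampling of allele copies each generation, no selection or mutation); the function name and parameters are hypothetical.

```python
import random


def wright_fisher(freq: float, pop_size: int, generations: int, seed: int = 0) -> list:
    """Return the allele-frequency trajectory under pure drift.

    Each generation, 2 * pop_size allele copies are drawn at random using the
    current frequency. The run stops early on loss (0.0) or fixation (1.0).
    """
    rng = random.Random(seed)
    copies = 2 * pop_size
    trajectory = [freq]
    for _ in range(generations):
        count = sum(rng.random() < freq for _ in range(copies))
        freq = count / copies
        trajectory.append(freq)
        if freq in (0.0, 1.0):
            break
    return trajectory
```

Running it with a small `pop_size` (say 10) and then a large one (say 5000) shows the point made above: frequencies swing widely and fix or disappear quickly in small populations, but barely move in large ones.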
Negative selection results in little variation with respect to a
gene in a population. Positive selection also decreases variation
within a population, but it may cause greater variation among
different populations if they are isolated from each other.

Not all synonymous mutations are neutral mutations:


● Codon bias: organisms have a different amount of tRNAs
for each codon, so if a mutation leads to a codon with few
tRNAs, that protein’s expression can be altered.
● A codon may also be required to maintain mRNA structure.

Nonsynonymous mutations result in functional changes, often
leading to dysfunction.

Ka (nonsynonymous) / Ks (synonymous) ratio of orthologs
indicates the type of selection acting on a particular gene:
● Ka/Ks = 1: neutral evolution. Amino acid sequence change
is neither favored nor disfavored. Amino acid changes
usually do not change the polypeptide activity.
● Ka/Ks < 1: negative (purifying) selection, the most
commonly observed case. Amino acid replacements are
disfavored; there is selective pressure to retain the original
functional amino acid at these sites.
○ For example, histone proteins.
● Ka/Ks > 1: positive selection, rarely observed.
An amino acid change is advantageous and may become fixed
in the population.
○ This is especially difficult to detect because while a
single region might have Ka/Ks > 1, its flanking
regions could face negative selection, so the average
over the gene would still be less than 1.
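
The three cases can be written as a small helper. This is only a sketch: the `tolerance` used to decide that a ratio is "about 1" is an arbitrary assumption, and real analyses rely on statistical tests rather than a fixed cutoff.

```python
def interpret_ka_ks(ka: float, ks: float, tolerance: float = 0.05):
    """Return (Ka/Ks, interpretation) for one pair of orthologs.

    ka: nonsynonymous substitutions per nonsynonymous site
    ks: synonymous substitutions per synonymous site
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tolerance:
        label = "neutral evolution"
    elif ratio < 1.0:
        label = "negative (purifying) selection"
    else:
        label = "positive selection"
    return ratio, label
```

A highly conserved gene such as a histone gene would be expected to give a ratio far below 1.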

Genetic hitchhiking: The change in frequency of a genetic
variant due to its linkage to a selected variant at another locus.
A reduction in heterozygosity around a locus indicates recent
positive selection that favored a new allele.

Linkage disequilibrium: A nonrandom association between
alleles at two different loci, often as a result of linkage.
● A new allele is initially in very strong linkage
disequilibrium with the alleles around it. Mutations and
recombination then reduce this association over time.
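
One common way to quantify this association is the coefficient D = p(AB) - p(A) * p(B). The sketch below assumes haplotypes are given as two-character strings (first locus, second locus); the encoding and the function name are illustrative.

```python
def linkage_disequilibrium(haplotypes, allele_a="A", allele_b="B") -> float:
    """Compute D = p(AB) - p(A) * p(B) from a list of two-locus haplotypes."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(h[0] == allele_a for h in haplotypes) / n
    p_b = sum(h[1] == allele_b for h in haplotypes) / n
    p_ab = sum(h[0] == allele_a and h[1] == allele_b for h in haplotypes) / n
    return p_ab - p_a * p_b
```

D is 0 when the alleles associate at random and moves away from 0 when particular combinations are over-represented.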

Most functional mutations affect gene regulation rather than
proteins.

Molecular clock: the idea that each protein evolves at its own
approximately constant rate, so rates differ between proteins
but are stable over time.
● Divergence is measured by the corrected percentage of
positions at which the corresponding nucleotides differ.
● Synonymous mutations accumulate much faster because
they have no cost to protein integrity.
○ Codon bias can complicate this: an organism may use
one codon more often than the other codons for the
same amino acid, and the tRNA concentration for that
codon is also higher.
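
A sketch of how a "corrected" divergence can be computed. The notes do not say which correction is meant; the Jukes-Cantor formula d = -(3/4) ln(1 - 4p/3) is used here purely as an assumed example, and the function names are hypothetical.

```python
import math


def p_distance(seq1: str, seq2: str) -> float:
    """Observed fraction of aligned positions at which two sequences differ."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, equal in length and non-empty")
    return sum(a != b for a, b in zip(seq1, seq2)) / len(seq1)


def jukes_cantor(p: float) -> float:
    """Correct an observed difference p for multiple hits at the same site."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)
```

The corrected value is always at least as large as the observed one, because some positions have changed more than once.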

Neutral substitutions can be inferred based on divergence of
repeated sequences.

How did introns evolve?


● “Introns late” hypothesis: the earliest genes didn’t have
introns; they were added later.
● “Introns early” hypothesis: genes originated as interrupted
structures.
● Interruption enables stem-loop extrusion, which could be
more advantageous for recombination based repair.
● Some introns are mobile and can insert themselves into new
sequences.
● We now think introns were present in the beginning, but
some genes like actin and insulin genes lost them over
time.
○ However, in some cases, introns may have emerged
later on as well, such as in yeast mitochondria.
● Introns are also found in chloroplast genes, indicating that
endosymbiosis occurred before introns were lost from
prokaryotes.

Exon shuffling: the hypothesis that genes evolved by the
recombination of various exons encoding functional domains.

RNA world: the hypothesis that the original nucleic acid that
was the information molecule was RNA, and RNA could have
some enzymatic activity so it could replicate itself and catalyze
some more reactions.

Why are some genomes so large?


There is no consistent correlation between genome size and
genetic complexity.
● C-value: the total amount of DNA in the haploid genome
of a species
○ Within groups, the minimum C-value increases as
complexity increases. However, there is great
variation within taxonomic groups.
● C-value paradox: the lack of a relationship between an
organism’s C-value and its coding potential (morphological
complexity).
○ The relationship is only there for prokaryotes and
lower-order eukaryotes.

Human genes can be classified based on the range of organisms
they are homologous with.
● Most genes unique to vertebrates are concerned with
immune or nervous systems.

Gene duplication
Possible fates for duplicated genes:
● One might become a pseudogene by accumulating
mutations. Pseudogene is inactive and is no longer
translated.
○ It is still homologous to the functional gene.
○ As species get more complex, more pseudogenes are
present.
○ Processed pseudogenes: result from reverse
transcription and insertion of mRNA transcripts.
○ Nonprocessed pseudogenes: result from incomplete
duplication of a functional gene or from mutations in its
second copy.
○ Some pseudogenes gain functions like gene regulation.
Such copies are described as neofunctionalized or
subfunctionalized.
● The two copies might accumulate different mutations and
diverge, becoming templates for other functions.

Globin clusters
All globin genes have a common ancestor that has three exons.
They are results of mutation and duplication events.
● Alpha-globin, beta-globin, myoglobin, leghemoglobin all
descended from this same gene.

Nonallelic gene: two or more copies of the same gene present in
different locations.
● By definition, alleles are at the same locus.
Most ribosomal protein pseudogenes are of recent origin: they
are common in humans and chimpanzees but not found in other
close relatives like rodents.
● Mouse and rat have a lot more in common with each other
than human and rat or human and mouse do.
Polyploidization: genome duplication so that the chromosome
number increases by a multiple of two.
● Autopolyploidy: polyploidization due to mitotic/meiotic
errors within species.
● Allopolyploidy: hybridization between two different but
reproductively compatible species
● Genome duplication is especially important in plant genetic
engineering.
○ Tetraploidy is common in plants but leads to cancer in
humans.
○ Plant size can be increased with polyploidy.
● 2R hypothesis: early vertebrate genome underwent two
rounds of duplication to create more material.
● Gene duplication events may not be observed because the
duplicated copies are often lost at random after the
duplication event.

Transposable elements: DNA sequences that can insert
themselves into different parts of the genome; some are of
retroviral origin.
● They face negative selection and mechanisms that regulate
transposition, which protect coding (amino acid) sequences
from disruption.
● They tend to increase in copy number.

Mutational biases
Mutational bias: mutations are biased to lead to a high AT
content.
● Deamination of cytosine to uracil, which pairs with
adenine, so a C-G pair becomes a T-A pair after replication.
● 8-oxoguanine pairing with adenine instead of cytosine.
● There must be some mechanisms to counteract this bias.

Gene conversion bias: biased towards GC pairs.
● May also lead to codon bias.
In-class questions
What is the difference between the genome and the
transcriptome of an organism, and which would you expect to be
cell-specific?

Can two copies of the same gene be regarded as alleles?

What is the one gene-one enzyme hypothesis? Why is it false?

Out of three sequences with known C0t values, which is the
most repetitive?

Why is mitochondrial DNA often used for animal
phylogenetic studies (examining relationships between groups,
such as species)? What is its advantage over using nuclear
DNA?
● mtDNA is found in multiple copies per cell, but copies with
sequence differences do not generally recombine, so each
variant is like a large single “allele”. Similarity between
variants is more easily interpreted as common ancestry,
without the confounding factor of recombination of
different variants from different ancestries that occurs in
nuclear genes.
Which taxonomic group has the highest correlation between the
number of genes and the size of the genome?
● Prokaryotes

1. In a particular DNA sequence, nonsynonymous substitutions
occurring at a lower rate than synonymous substitutions would
be evidence for:
A) neutral evolution.
B) negative selection.
C) genetic hitchhiking.
D) positive selection.
E) mutational bias.

2. Genes that are expressed only in specific cell types are known
as:
A) luxury genes.
B) housekeeping genes.
C) abundant genes.
D) constitutive genes.
E) scarce genes.

3. In male humans, crossing-over between the X and Y
chromosomes occurs at the:
A) male-specific region.
B) ampliconic regions.
C) X-transposed regions.
D) X-degenerate regions.
E) pseudoautosomal regions.

5. The random change in frequency of a genetic variant in a
population is called:
A) negative selection.
B) gene conversion bias.
C) purifying selection.
D) genetic drift.
E) positive selection.

What is the advantage of the use of a DNA microarray over
Southern blotting/probe hybridization?
● We can check the presence of multiple sequences at once
for one sample, which saves a lot of time.

https://quizlet.com/724363271/15-quiz-flash-cards

Chapter 6

Gene family: a set of genes within a genome that encode related
RNA or protein
● The members result from ancient duplication events,
followed by divergent evolution.

Pseudogenes: genes that were once active, but got inactivated
due to accumulating mutations
● They are most likely duplicated and there is an active copy
elsewhere in the genome.

Gene cluster: a group of adjacent genes that are identical or
related

Satellite DNA: highly tandem-repeated DNA
● Minisatellite: the copy number of repeats is lower than in
satellite DNA but higher than in microsatellite DNA

Unequal crossover (nonreciprocal recombination)


When there are gene clusters of identical or very similar
sequences, sometimes wrong sequences align and lead to
unequal crossover.
● This leads to a deletion in one chromosome and duplication
in the other.
● Thalassemia: inherited blood disorder due to unequal
crossover between the gene clusters of globin genes, which
leads to deletion in one chromosome that is inherited.
● HbH disease: there is a disproportionate amount of the
beta-4 tetramer
● Hydrops fetalis: a fatal disease due to the absence of all
alpha-globin genes.
● Hb Lepore: uncommon blood disease due to unequal
crossover between the delta- and beta-globin genes, leading
to a fusion gene that produces a fusion protein.
○ Hb Kenya is another fusion variant (gamma-beta).
● Color blindness: green and red pigment genes are closely
positioned in the X chromosome. Unequal crossover can
cause color blindness.
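
The deletion/duplication outcome can be sketched by treating each chromosome as a list of gene copies. This is a toy model: the gene labels and the function name are hypothetical, and a breakpoint is simply an index into the list.

```python
def crossover(chrom1, chrom2, break1, break2):
    """Exchange the segments to the right of each breakpoint.

    With break1 == break2 the homologs are aligned correctly (equal crossover).
    With break1 != break2 they are misaligned (unequal crossover).
    """
    recombinant1 = chrom1[:break1] + chrom2[break2:]
    recombinant2 = chrom2[:break2] + chrom1[break1:]
    return recombinant1, recombinant2


cluster = ["alpha2", "alpha1"]  # two similar adjacent gene copies
duplicated, deleted = crossover(cluster, cluster, 1, 0)
# duplicated carries three copies, deleted carries only one
```

The total number of copies is conserved across the two products, which is why one chromosome gains exactly what the other loses.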

Genes encoding rRNA are also organized in a cluster.
● These genes are identical and organized as tandem repeats.
● Each ribosomal DNA (rDNA) cluster is organized so that
transcription units giving a joint precursor to the major
rRNAs alternate with nontranscribed spacers
○ Nontranscribed spacer: consists of short, repeating
units whose number varies among individual spacers
Nucleolus: A discrete region of the nucleus where ribosomes
are produced.
● Nucleolar organizer: The region of a chromosome that
carries genes encoding for ribosomal RNA.
Cryptic satellite: satellite DNA that stays with the main band in
a density gradient.
In situ hybridization experiments are done by denaturing DNA
and then treating it with labeled probes.
● It shows that mouse satellite DNA is present in the
centromeres.

Classes of chromatin
● Heterochromatin: untranscribed regions, tightly coiled.
Satellite DNA constitutes the majority of it.
● Euchromatin: transcriptionally active, contains most of the
single-copy genes that are or can be active. Its coils are less
tight than heterochromatin.

Mammalian Satellites Consist of Hierarchical Repeats

Minisatellites and microsatellites are types of VNTRs, and they
can be used for DNA profiling.
● The number of repeats can differ between alleles, so
individuals can be differentiated based on the length of the
repeated regions.
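
A sketch of the idea behind repeat-length profiling: count the longest uninterrupted run of a repeat unit in each allele. The sequences, the CA motif and the function name are made up for illustration; real profiling measures fragment lengths rather than reading sequence directly.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


allele_1 = "GG" + "CA" * 5 + "TT"  # hypothetical allele with 5 repeats
allele_2 = "GG" + "CA" * 8 + "TT"  # hypothetical allele with 8 repeats
```

Two individuals (or the two alleles of one individual) are distinguished by the different counts returned for the same locus.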
Chromosomal Rearrangements
Naming chromosomal arms:
● Short arm: p
● Long arm: q

Chromosomes are grouped according to their sizes into lettered
groups (A, B, C and so on).

There are three possible centromere positions for a chromosome:


● Metacentric
● Submetacentric
● Acrocentric

Acentric and dicentric chromosomes are generally unstable.

Human chromosome 2 was formed by the fusion of two
acrocentric chromosomes.

Chromosome abnormalities in human pregnancies


Euploidy = an exact multiple of the haploid number of
chromosomes
Aneuploidy = a chromosome number that is not an exact
multiple of the haploid number
● n + 1, 2n + 1, etc.

Monosomy = 2n -1
Trisomy = 2n + 1
● Monosomy is usually more harmful than trisomy because
an entire chromosome is missing in monosomy.
Tetrasomy = 2n + 2
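
These definitions can be turned into a small classifier. A sketch only: it assumes a human haploid number of 23 by default and looks only at the total count, so it cannot tell a tetrasomy (2n + 2) from a double trisomy, for example.

```python
def classify_chromosome_count(count: int, haploid: int = 23) -> str:
    """Name the condition implied by a total chromosome count."""
    if count > 0 and count % haploid == 0:
        return f"euploid ({count // haploid}n)"
    named = {
        -1: "monosomy (2n - 1)",
        1: "trisomy (2n + 1)",
        2: "tetrasomy (2n + 2)",
    }
    # Difference from the diploid number decides the label.
    return "aneuploid: " + named.get(count - 2 * haploid, "other")
```

For a human karyotype, 47 chromosomes comes out as a trisomy and 45 as a monosomy.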

For most chromosomes, any abnormality leads to a spontaneous
abortion and not a live birth.
Trisomy 18 (Edwards), 13 (Patau) and 21 (Down’s) are three
autosomal abnormalities that are compatible with life, and sex
chromosome abnormalities are also observed:
● Turner’s (XO)
● Klinefelter (XXY)
● XYY

Down’s syndrome
● Most cases are due to nondisjunction, which is when
homologous chromosomes fail to separate during anaphase
I of meiosis.
● Chr 21 is a small chromosome compared to others, hence it
is less likely to go through crossover, which makes it more
difficult for it to align properly in the metaphase plate.
● The abnormal gamete is usually the egg because
nondisjunction is more commonly observed in oogenesis.
Hence, maternal age is a major factor for Down’s syndrome
incidence.

X chromosome aneuploidies
X-inactivation is the mechanism of dosage compensation in
mammals, but some genes escape it.

Trisomy X (XXX): Female, usually without physical or
behavioral abnormalities. Mild mental impairment may be more
frequent than in 46,XX females.

Double Y (XYY): Male, tends to be taller than average. A
higher criminal rate has been reported, and some individuals may
have slightly impaired mental function.

Klinefelter syndrome (XXY): Male, tends to be taller, no normal
sexual maturation, sterile, mild mental impairment.

Turner syndrome (XO): Phenotypically female but without
sexual maturation. 99% of 45,X fetuses undergo spontaneous
abortion.
● Monosomy example
● Monosomies may arise more often than trisomies, but they
are less likely to result in a live birth, so they often go
unrecognized.

Humans can tolerate X aneuploidy because of X inactivation.
● Still, because some genes escape inactivation, some genes
are expressed at double the amount in Klinefelter males
compared to normal males.

Environmental factors like bisphenol A can increase the rate of
aneuploidies.

Chromosome deletions and duplications


Implications of aberrations in chromosome structures:
● Deletions: removal of a segment of DNA
○ Deletion mapping: a method that crosses a recessive
allele with different deletions to determine the location
of the mutation.
○ Large deletions are usually lethal.
○ There are two paths to chromosomal deletions:
■ Chromosomal break and reunion
■ Ectopic recombination: homologous
recombination of repeated DNA sequences that
are at different locations.
○ Deletions can be detected by PCR and running the
product on gel to see product length.
● Duplications: increase in the copy number of a
chromosomal region
○ Tandem duplication: The duplicated segment is present
in the same orientation immediately adjacent to the
normal region in the chromosome. They often are due
to unequal crossover.
○ Chromosome breakages can produce nontandem
duplications.
● Inversions: 180-degree rotation of a chromosome segment
○ Two-break event
○ Ectopic recombination between inverted repeats.
○ They don’t add or remove DNA but can alter gene
expression.
● Translocations
○ Nonreciprocal: unequal exchanges between
nonhomologous chromosomes; the sizes of the
chromosomes change.
■ Robertsonian translocation: fusion of two
acrocentric chromosomes in the centromere
region. This leads to the loss of one chromosome
in karyotyping. The genetic information in the
tips of the fused chromosomes is lost.
● Can lead to familial Down’s syndrome and
other trisomies because one of the parents
has a Robertsonian translocation of Chr 21
and 14, which leads to nondisjunction in
meiosis I.
■ Chronic myelogenous leukemia (CML): cancer
due to translocation between chromosome 9 and
chromosome 22, leading to BCR-ABL fusion
gene in the Philadelphia chromosome.
■ Burkitt’s lymphoma: due to translocation, Myc
gene is positioned under the immunoglobulin
enhancer sequence. This leads to the constitutive
expression of Myc, which is an oncogene.
○ Reciprocal: two nonhomologous chromosomes
exchange parts. There is no loss of genetic information,
but expression can change due to position effects.
○ Heterozygous translocation: only one chromosome of
each affected pair carries the exchanged parts.
■ Semisterility: in a heterozygote for a reciprocal
translocation, about half of the gametes are
unbalanced, and half of the viable offspring will
have normal chromosomes.
○ Homozygous translocation: both chromosomes of each
affected pair carry the exchanged parts.
● Transposition: movement of short chromosome segments
from one chromosome to another.
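
The inversion entry in the list above (a 180-degree rotation that neither adds nor removes DNA) can be sketched on a single strand. When a double-stranded segment is rotated, the strand being read now carries the reverse complement of the original segment. The function name and coordinates are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def invert_segment(seq: str, start: int, end: int) -> str:
    """Return `seq` with seq[start:end] replaced by its reverse complement."""
    segment = seq[start:end]
    inverted = segment.translate(COMPLEMENT)[::-1]
    return seq[:start] + inverted + seq[end:]
```

The sequence length is unchanged, and inverting the same interval twice restores the original, but any gene spanning a breakpoint or moved next to new regulatory sequence can change its expression.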

Ectopic recombination vs. unequal crossover


Similarities:
● Both are causes of genetic variability, they cause
duplication and deletion events.
● In both, the exchanged genetic material should be
homologous.
● Both occur in non-allelic regions
● They are caused by repeats in the genome.

Differences:
● Ectopic recombination is not restricted to meiosis and can
occur among non-homologous chromosomes.
● Unequal crossover occurs only during meiosis I and among
homologous chromosomes.

To diagnose deletions, FISH probes that target the specific
region can be used. If the probe gives no signal on the
chromosome, the deletion is present.

Balancer chromosomes
Balancer chromosomes are used in genetic applications. They
are designed to have multiple, overlapping inversions. They
have three important qualities:
● They suppress recombination with their homologs.
● They carry dominant markers.
● Negatively affect reproductive fitness when carried
homozygously.

The balancer chromosome pairs with the chromosome carrying
the mutation of interest and ensures that progeny inherit the
mutated chromosome without undesired changes from
recombination.

Cri du chat: a disorder due to a specific deletion in
chromosome 5. Affected children have intellectual disability and
a cat-like cry.

Polyploidy
In humans, polyploidy is associated with cancer, but it is common in plants.
Polyploid individuals can reproduce sexually or asexually.

Polyploidy in plants:
● Usually leads to larger plants

Amphidiploid: an allopolyploid that carries the diploid
chromosome sets of two parental species. The hybrid is sterile
until its chromosomes are doubled.

Chapter 7
Nucleoid: the structure in a prokaryotic cell that contains the
chromosome
Packing ratio: the ratio of the length of DNA to the unit length
of the fiber containing it.
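
A worked example of the packing ratio, assuming B-form DNA at about 0.34 nm per base pair. The chromosome size and condensed length below are illustrative round numbers, not values from these notes.

```python
NM_PER_BP = 0.34  # approximate rise per base pair in B-form DNA


def packing_ratio(base_pairs: float, fiber_length_nm: float) -> float:
    """Length of the extended DNA divided by the length of the fiber holding it."""
    return base_pairs * NM_PER_BP / fiber_length_nm


# Hypothetical example: a 46 Mb chromosome condensed to about 2 µm (2000 nm).
ratio = packing_ratio(46_000_000, 2_000)  # roughly 7800-fold compaction
```

The same function applies at any level of organization as long as both lengths are in the same unit.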

Viral genome sizes are constrained by the volume of the capsid,
which is the protein shell surrounding the nucleic acid.

Even when chromosomes are histone-depleted, they have a
protein scaffold that is attached to the loops of the supercoiled
DNA.
Chromatin types:
● Euchromatin
○ Fully decondensed in interphase, ready for expression
● Heterochromatin
○ Chromocenter: an aggregate of heterochromatin from
different chromosomes.
○ Remains condensed in interphase and is
transcriptionally inactive.
○ It is usually the last portion of the genome to be
replicated. The exact reason for this is unknown, but it
could be due to heterochromatin suppressing its own
replication or replication fork moving slower in
heterochromatin.
○ Constitutive heterochromatin: regions that remain
densely-coiled through interphase in all tissues. This is
usually satellite DNA.
○ Facultative heterochromatin: chromatin that is either
densely-coiled or loosely coiled during interphase,
depending on the developmental state or tissue.
■ Inactive X chromosome in females is an example.

Constitutive heterochromatin is found in:


● Centromere
● Subtelomeric regions
● Short arm of acrocentric chromosomes
● Non-expressed portions of the Y chromosome
● Short regions along the length of chromosome arms

Banding patterns of chromosomes


G-bands: bands seen in chromosomes after Giemsa staining.
● Interbands are rich in G-C content and have a higher
concentration of genes.
● G-bands have lower G-C content.
Kinetochores attach the chromosomes to the microtubules of the
mitotic spindle.
● Microtubule organizing centers (MTOCs): centrosome in
animals, forms the mitotic spindle and is the region where
microtubules emerge from.

Acentric fragment: a fragment of chromosome that does not
have a centromere and is lost at cell division because it cannot
attach to the mitotic spindle.

Organization in centromeres

Most nucleosomes in the centromere contain H3; a few contain
CENP-A.
● Amphipathic model: CENP-A faces the exterior while H3
remains mostly in the interior.
● Boustrophedon model: based on super-resolution
microscopy.

CEN elements: short DNA sequences in S. cerevisiae that are
essential for plasmids to segregate correctly at mitosis.
● Consists of CDE-I, CDE-II and CDE-III.
● Cse4: Histone H3 variant
● This protein structure enables binding to the microtubules.

Telomeres
Telomeres are required for the stability of the chromosome ends.
The typical sequence of a telomere is (T/A)1–4 G>2: one to four
T or A residues followed by more than two G residues.

Telomeres promote pairing, synapsis and recombination during
meiosis via links to the cytoskeleton through nuclear envelope
proteins.

The protein TRF2 catalyzes a reaction in which the 3′ repeating
unit of the G+T-rich strand forms a loop by displacing its
homolog in an upstream region of the telomere.

Telomerases are enzymes that synthesize telomeres.
● Telomerase carries its own RNA template and uses the 3’ OH
of the telomere to add more nucleotides.
● It acts as a reverse transcriptase and counteracts the
end-replication problem.
Telomeres are essential for survival because of the
end-replication problem: when telomerase is inactive in the cell,
chromosome ends shorten with each round of replication.
● Short telomeres induce senescence, which causes cells to
stop dividing.
● Unequal homologous recombination, which is normally
suppressed at telomeres, can also restore telomeres.

Chapter 8
Nucleosome: The basic structural subunit of chromatin,
consisting of about 200 bp of DNA and an octamer of histone
proteins.
● The octamer consists of 2 copies of each core histone,
which are H2A, H2B, H3 and H4.
● H2A and H2B form dimers and these dimers associate with
the (H3)2(H4)2 tetramer formed by H3 and H4.

● All core histones have the structural motif of a histone fold.

Histone tails: Flexible amino- or carboxy-terminal regions of
the core histones that extend beyond the surface of the
nucleosome.
● Histone tails are sites of extensive post-translational
modification.

10 nm fiber: A linear array of nucleosomes generated by
unfolding from the natural condition of chromatin.
● DNA winds around histones to form “beads”.
● Nucleosomes are strung together like beads on a string by
linker DNA.
○ Linker DNA: segments of DNA that connect
nucleosomes.
● It is the primary structure of chromatin.

Linker histones: a family of histones such as H1 that are not
components of the nucleosome core.
● Linker histones bind to nucleosomes or linker DNA and
promote the formation of 30 nm fiber.

30 nm fiber: a coil of nucleosomes that is the basic level of
organization of nucleosomes in chromatin.
● Interactions between nucleosomes cause the 10 nm fiber to
coil or fold into the 30 nm fiber.
● A common secondary structure of chromatin.
● Its formation is also favored by H1, histone tails and
increased ionic strength.

Nonhistone: any structural protein found in chromosomes that
is not a histone.

300 nm fiber: the 30-nm fiber forms looped domains that attach
to proteins.

Metaphase chromosome: the looped domains of the 300 nm
fiber coil further. The width of a chromatid is 700 nm.
● The overall path is: nucleosome → 10 nm fiber → 30 nm
fiber → 300 nm fiber → metaphase chromosome

MNase (micrococcal nuclease): an enzyme that cleaves linker
DNA and releases individual nucleosomes from chromatin.
● More than 95% of the DNA is recovered in nucleosomes or
multimers when MNase cleaves DNA in chromatin.
● The length of DNA per nucleosome varies for individual
tissues or species in a range from 154 to 260 bp.
● Depending on its susceptibility to MNase, DNA can be
divided into core DNA and linker DNA.
○ If susceptible, it is linker DNA. Otherwise, it is core
DNA.

H1 histone: the histone protein associated with linker DNA at
the point where DNA enters or exits the nucleosome.

Histones are modified by methylation, acetylation,
phosphorylation, ubiquitination, sumoylation, ADP-ribosylation
and other modifications.

Histone code hypothesis: combinations of specific histone
modifications define the function of the local regions of
chromatin. The post-translational modifications to histone
proteins determine the transcription of DNA that surrounds
them.
● The histone code is read by a “reader” protein. Reader
proteins bind to specific histone modifications. The binding
of the reader attracts other components and forms a protein
complex with several binding sites and catalytic activities.
● A histone modifying enzyme (writer) forms a complex
with a reader protein, and the writer-reader complex plus
an ATP-dependent chromatin remodeling complex can
spread changes along the chromatin. In this way
condensation can spread along the chromatin.
● Barrier DNA sequences: they block the spread of reader-
writer complexes and separate neighboring chromatin
domains.

Acetylation of histones in nucleosomes is associated with gene
activation.

Histone variants
● All core histones except H4 have families of variants that
are closely related or divergent from each other.
● Different variants serve different functions in the cell.

While histone octamers are not preserved during replication,
H2A-H2B dimers and H3-H4 tetramers are.

Nucleosome assembly
● During nucleosome assembly, first the H3-H4 tetramer
binds to the DNA. Then, two H2A-H2B dimers are added to
form the complete octamer.
● Accessory proteins are required to assist the assembly of
nucleosomes.
○ CAF-1 and ASF1 are linked to the replication
machinery.
○ HIRA is a replication-independent assembly factor
that deposits the histone variant H3.3.

Nucleosomes are present at specific sites in DNA.
● Indirect end labeling: A technique for examining the
organization of DNA by making a cut at a specific site and
identifying all fragments containing the sequence adjacent
to one side of the cut.

Most transcribed genes are organized in nucleosomes. RNA
polymerase displaces nucleosomes momentarily, but octamers
reassociate with DNA immediately after.
● There are some exceptions: heavily transcribed genes that
don’t have histones at all.
● Both the displacement of the nucleosome and its
reassembly require external factors.

Hypersensitive sites: chromatin regions with increased
sensitivity to DNase I activity.
● They are generated by the binding of factors that exclude
histone octamers.
● They are found in promoters of expressed genes, origins of
replication and centromeres.

Topologically associated domains (TADs)


● Main organization of mammalian chromosomes, about 1
Mb in size.
● Loci within a TAD interact frequently with each other, but
loci in adjacent TADs don’t.
● TAD organization is fairly stable.
● Boundary regions between TADs contain insulator
elements that prevent any activating or inhibiting effect of
enhancers and silencers to pass among different TADs.
These insulators usually contain hypersensitive sites.
○ Insulators define transcriptionally independent sites.
○ Different insulators are bound by different factors and
may use different mechanisms to block the enhancers
and other control elements.

Chapter 27, 28
Heterochromatin nucleation
● Caused by proteins binding to specific sequences; the
inactive state then spreads along the chromatin from that
point.
● The genes within heterochromatin are inactivated.
● The length of the inactive region changes from cell to cell.
● Position effect variegation: in some cells, some genes are
silenced because they are juxtaposed with heterochromatin.

HP1: a key protein in mammals that promotes heterochromatin
formation.
● Binds to methylated H3 and self-aggregates, so many HP1
molecules become bound to the methylated H3s in the
chromatin. This leads to higher-order chromatin formation.
● It contains a chromo domain and a chromoshadow domain.

RNAi pathways promote heterochromatin formation at
centromeres.

CpG islands are susceptible to methylation


Most methyl groups in DNA are found on cytosine on both
strands of CpG doublet.
Replication converts a fully methylated site to a hemimethylated
site.
● Dnmt1: an enzyme that recognizes only hemimethylated
sites.
● Maintenance methyltransferase: the enzyme that converts
hemimethylated sites to fully methylated sites.
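
The replication and maintenance steps can be sketched with a toy model in which one CpG site is a pair of booleans (top strand methylated, bottom strand methylated). The representation and function names are hypothetical.

```python
def replicate(site):
    """Semiconservative replication of one CpG site.

    Each daughter duplex keeps one parental strand and gains a new,
    unmethylated strand.
    """
    top, bottom = site
    return (top, False), (False, bottom)


def maintenance_methylate(site):
    """Act like a maintenance methyltransferase: only hemimethylated sites
    (exactly one strand methylated) are converted to fully methylated."""
    top, bottom = site
    if top != bottom:
        return (True, True)
    return site
```

A fully methylated site therefore stays methylated through cell divisions, while an unmethylated site stays unmethylated, because the enzyme in this model never acts on it.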

DNA methyltransferase: an enzyme that adds a methyl group to a
specific target sequence in DNA.

Demethylase: an enzyme that removes a methyl group from the
target molecule
● TET proteins: convert 5-methylcytosine to
5-hydroxymethylcytosine to initiate demethylation.

de novo methyltransferase: An enzyme that adds a methyl
group to an unmethylated target sequence on DNA.

Perpetuation methylase: another name for a maintenance
methyltransferase.

Epigenetic modifications can be inherited (transgenerational
epigenetics), but this inheritance can also be prevented.
● Acetylated histones are conserved and distributed at random
to the daughter chromatin fibers at replication.

X and Y chromosomes
They have pseudoautosomal regions (called PARs) and they pair
up as homologous regions and undergo crossover during
meiosis.
● PAR mutations are associated with infertility and mental
disorders.

PAR1 is the major pseudoautosomal region and it is found at the
tips of the short arms of chromosomes X and Y.
● A crossover in PAR1 is required for successful sperm
formation, so it undergoes crossing over in the same way
that autosomes do.
● The PAR1 region does not undergo inactivation.

PAR2 is another region of homology, but pairing and crossing
over are not obligatory at this region.

X chromosomes are similar to autosomes; they carry many
genes unrelated to sexual characteristics.

Dosage compensation: Mechanisms employed to compensate
for the discrepancy between the presence of two X
chromosomes in one sex but only one X chromosome in the
other sex.

Dosage compensation has different mechanisms for different
species:
● X-inactivation in mammals
● Drosophila: acetylation of H4 increases transcriptional
activity of the X in males so that its transcription is
equivalent to two X chromosomes in females.
● C. elegans: in XX hermaphrodites, the dosage compensation
complex halves the level of transcription of each X
chromosome

In mammals, one of the X chromosomes is randomly inactivated
during embryonic development. The Lyon hypothesis says that
this happens at the blastocyst stage, when there are 200-400
cells.
● The single X hypothesis says the same thing.
● The inactive X condenses into a Barr body.
● If a female is heterozygous for a particular gene located on
the X chromosome, she will be a mosaic for that character.
○ Example: calico cats
n-1 rule: When there are more than two X chromosomes, all
but one are inactivated
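
The n-1 rule gives the expected number of Barr bodies directly from the sex chromosome complement. A minimal sketch with a hypothetical function name:

```python
def barr_bodies(sex_chromosomes: str) -> int:
    """Expected number of inactive X chromosomes (Barr bodies): n - 1."""
    x_count = sex_chromosomes.upper().count("X")
    return max(x_count - 1, 0)
```

So XX and XXY cells both show one Barr body, XXX cells show two, and XY or XO cells show none.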

Mechanism of X inactivation in mammals:


● XIC (X inactivation center) contains the gene for a
non-translated RNA, Xist.
○ When XIC is mutated, X inactivation is abolished.
○ If XIC is translocated to an autosome, it can inactivate
the autosome.
○ It is necessary and sufficient to cause X inactivation.
● Xist is a gene in the XIC region that is expressed only from
the inactivated X.
○ Xist RNA coats the inactivated X chromosome and
makes it inaccessible.
○ We can stain for Xist and the X chromosome with FISH.
○ Xist has 8 exons but encodes only a structural RNA,
not a protein.
● XIC also contains binding sites for different proteins.

Some genes on the X are not inactivated:
● Genes in the pseudoautosomal regions PAR1 and PAR2.
● These regions have very little Xist bound and lack the typical
Xi modifications, so transcription can continue.

Approximately 15% of genes escape complete
transcriptional silencing on the inactivated X chromosome.
● Most are on the short arm.
● 10% of genes show variable patterns of expression.

Xist is not needed continuously. It is required initially to
inactivate the X, but silencing later becomes independent of it,
because the histone modifications that carry out silencing
remain:
● H2AK119ub1
● H3K27me3
● H4K20me1
Xist-independent maintenance of X inactivation: if the XIC is
inhibited after this stage, the X chromosome remains inactive.
● MacroH2A
● Hypoacetylation of histone H4

How is only one X chromosome silenced?
● Both X chromosomes express unstable Xist RNA.
● Antisense Tsix RNA is expressed from the future active X.
○ This degrades Xist on that chromosome.
● The future inactive X keeps synthesizing Xist, which
eventually coats the chromosome.
● Active Xist recruits the Polycomb complexes.

X-linked inheritance
● The affected individuals are almost exclusively male.
○ Females usually have a WT allele unless the partners
are related.
● Affected males have unaffected sons, because a son inherits
his father's Y chromosome, not his X.
● A woman whose father was affected has normal sons and
affected sons in a 1:1 ratio. The woman herself is not
affected.
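
These ratios can be checked by enumerating the four equally likely gamete combinations of a cross. A minimal sketch, assuming a recessive X-linked allele written "a" and a wild-type allele written "A" (the function name and allele symbols are illustrative):

```python
from collections import Counter
from itertools import product


def x_linked_cross(mother, father):
    """Count the equally likely offspring classes of an X-linked recessive cross.

    mother: her two X alleles, e.g. ("A", "a") for a carrier.
    father: his X allele followed by "Y", e.g. ("A", "Y").
    """
    offspring = Counter()
    for maternal, paternal in product(mother, father):
        if paternal == "Y":
            # Sons carry a single X, inherited from the mother.
            sex, affected = "son", maternal == "a"
        else:
            # Daughters are affected only if both X alleles are recessive.
            sex, affected = "daughter", maternal == "a" and paternal == "a"
        offspring[(sex, "affected" if affected else "unaffected")] += 1
    return offspring


# A carrier woman (her father was affected) with an unaffected partner.
print(x_linked_cross(("A", "a"), ("A", "Y")))
# An affected man with a homozygous wild-type partner.
print(x_linked_cross(("A", "A"), ("a", "Y")))
```

The first cross gives one affected son for every unaffected son, and the second gives no affected sons at all, while every daughter of the affected man is an unaffected carrier.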

Chromosome condensation is caused by condensins
● Occurs in mitosis
● Structural maintenance of chromosomes (SMC) proteins
○ ATPases
○ Two versions: condensins and cohesins
○ Cohesins hold the two sister chromatids together until
anaphase.
● The APC (anaphase-promoting complex) triggers the
breakdown of cohesins so that the two chromatids can separate
at anaphase.
● Condensin introduces supercoils into DNA and causes tight
coiling.
● In C. elegans, a condensin-like complex condenses the X
chromosomes during dosage compensation.

DNA imprinting:
● If DNA is methylated, the methylation can be maintained
through the generations; this is called imprinting.
● Methylation usually makes genes inactive.
● In paternal imprinting, the allele inherited from the father is
silenced. If the mother passes on a mutated allele, the
daughter is affected, because her only active copy is the
mutant one.
○ Half of the next generation will be healthy.

Y-chromosome
SRY evolved as a sex-determining mechanism when the proto-X
and proto-Y were still essentially identical.
● Muller’s ratchet: without crossover, mutations accumulated
on the Y and inactivated most of its genes.
● After SRY evolved through the accumulation of mutations,
the region of X-Y recombination became more restricted,
until the only regions of homology were the telomeric
regions.

Like mtDNA, the Y chromosome is used to trace human
evolutionary history. Y is subject to less mutation compared to
the autosomes.
● We can create haplotypes from the Y chromosome.
● Simple sequence repeats provide polymorphisms.
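
One way to picture a Y haplotype is as the set of repeat counts at several simple sequence repeat loci, inherited together because most of the Y does not cross over. A rough Python sketch under that assumption; the locus names, repeat units, and sequences are invented for illustration and are not real markers:

```python
import re


def repeat_count(sequence: str, unit: str) -> int:
    """Length, in repeat units, of the longest uninterrupted tandem run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


def haplotype(sequences: dict, units: dict) -> tuple:
    """Repeat counts at each locus, in a fixed (sorted) locus order."""
    return tuple(repeat_count(sequences[locus], units[locus]) for locus in sorted(units))


# Hypothetical loci and their repeat units.
units = {"STR_A": "GATA", "STR_B": "TCTA"}

# Made-up sequences for two males.
male_1 = {"STR_A": "CC" + "GATA" * 4 + "TT", "STR_B": "A" + "TCTA" * 3 + "G"}
male_2 = {"STR_A": "CC" + "GATA" * 4 + "TT", "STR_B": "A" + "TCTA" * 5 + "G"}

h1 = haplotype(male_1, units)
h2 = haplotype(male_2, units)
differing_loci = sum(a != b for a, b in zip(h1, h2))
print(h1, h2, differing_loci)
```

Two males who share a haplotype are more likely to share a recent paternal ancestor than two who differ at many loci, which is the intuition behind using these polymorphisms to trace lineages.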

HY antigen: cell surface antigen marking male cells only.

If SRY is mutated, no male traits develop, and XY
females result.
XX males also occur.
● If XX mouse embryos are injected with the purified SRY gene,
they develop as males but are sterile.

SRY is a single-exon gene whose product acts as a transcription factor.
