Genetic markers

Genotyping is an important goal in studying an attribute of interest in genetics. It is used to distinguish between the genotypes underlying the variant forms of a trait, which are also called allelic forms. Genetic markers carry information on allelic variation at a locus and depend on the mutation process that created them. The three classes of genetic markers available today are allozymes, DNA polymorphisms and DNA repeats. With the advent of cost-effective DNA sequencing methods, the amount of human genetic variation catalogued is immense.

Genetic markers and the field of anthropological genetics

Genetic markers are genetic entities that segregate independently and are used to classify populations by their presence, absence or differences in frequency among populations (Crawford, 1973). They are used to quantify the genetic diversity in populations that has resulted from the interplay of evolutionary processes. Two discoveries, the blood group system (Landsteiner, 1900) and protein electrophoresis (Smithies, 1955), gave the impetus for research on genetic markers. Subsequent advances in methodology ranged from isolation of the genetic material to amplification of selected genomic regions and reading the composition and sequence of the genome. Together these brought an unparalleled flow of information on the genetic diversity among human populations, as described by genetic markers.
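
To make "differences in frequency among populations" concrete, the following is a minimal sketch of how allele frequencies could be obtained from genotype counts at a single marker. The function name, the allele labels and the counts for the two populations are hypothetical and chosen only for illustration; they do not come from any study cited here.

```python
def allele_frequencies(genotype_counts):
    """Return allele frequencies from counts of diploid genotypes.

    genotype_counts maps a two-allele tuple such as ("A", "B") to the
    number of individuals carrying that genotype.
    """
    allele_counts = {}
    for (allele_1, allele_2), n in genotype_counts.items():
        # Each individual contributes two alleles at a diploid locus.
        allele_counts[allele_1] = allele_counts.get(allele_1, 0) + n
        allele_counts[allele_2] = allele_counts.get(allele_2, 0) + n
    total = sum(allele_counts.values())
    return {allele: count / total for allele, count in allele_counts.items()}


# Hypothetical genotype counts for one biallelic marker in two populations.
population_1 = {("A", "A"): 30, ("A", "B"): 50, ("B", "B"): 20}
population_2 = {("A", "A"): 10, ("A", "B"): 30, ("B", "B"): 60}

freq_1 = allele_frequencies(population_1)
freq_2 = allele_frequencies(population_2)

# The difference in the frequency of allele "A" is the kind of quantity
# used to compare populations at a marker.
difference = freq_1["A"] - freq_2["A"]
```

With these made-up counts, allele "A" is more frequent in the first population than in the second, which is the sort of contrast that frequency-based comparisons of populations rely on.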

Genetic markers have found application in studying population structure and history, selection and admixture mapping. This module discusses the different categories of genetic markers that have been widely used to answer questions in anthropological research, and their applications in evolutionary genetic studies. Before delving into the molecular markers, however, some background on the classical markers is important.

Allozymes: The first genetic markers

The term “allozymes” is derived from the phrase “allelic variants of enzymes”. Protein variants of an enzyme can be distinguished by separating them by gel electrophoresis according to the size and charge differences caused by amino acid substitutions. This is the working principle of allozyme markers. The bands showing allozyme variants were visualized by treating the gels with enzyme-specific stains comprising a ligand that acted as substrate for the enzyme, enzyme co-factors and an oxidized salt. The importance of classical markers for studying genetic variation should not be underestimated: observations from this group of markers showed immense within-population polymorphism in human populations, which led to the formulation of the neutral theory of evolution. The drawback of this system of genetic markers is the small number of informative loci that could be studied. It had the merit of cost effectiveness and hence was used for genetic mapping of traits and in association studies until DNA sequencing became a reality.

Introduction to molecular markers

Allozyme markers had the demerit of being an indirect way of studying DNA variation and were replaced by direct molecular markers as a consequence of the Human Genome Project. The Human Genome Project generated a human genome reference sequence. This was followed by the complete genome sequencing of individuals from different ethnic backgrounds (Frazer, 2009). The individual sequences varied from the reference sequence at many different locations in the genome, which revealed different classes of genetic variation. Human genetic variants broadly fall into two categories: single nucleotide variants and structural variants. The classes differ on the basis of nucleotide composition, though there is no standard boundary between them.

Detection methods:

Before elaborating on the types of markers, this section lists the detection methods available:

  • a. DNA sequencing: It is the most precise method available for variant determination. It allows the detection of sequence variants in multiple sequences for multiple individuals.
  • b. PCR-RFLP: PCR fragments are digested with a restriction enzyme and the fragment sizes are determined by gel electrophoresis. Base pair substitutions at the restriction site lead to changes in the pattern of restriction fragments on the gel.
  • c. Denaturing gradient gel electrophoresis: A double-stranded DNA fragment obtained from PCR is made to migrate through a gradient of denaturing solvents. As the fragment migrates it becomes denatured, leading to a conformational change. The mobility of the fragment is reduced and it reaches a sequence-specific position in the gel. PCR fragments that differ in sequence are characterized by specific denaturation conditions.
  • d. Temperature gradient gel electrophoresis: The technique is the same as denaturing gradient gel electrophoresis, except that a temperature gradient is used for denaturation instead of denaturing solvents.
  • e. Single strand conformation polymorphism (SSCP): The working principle of SSCP is the difference in electrophoretic mobility of secondary structures formed by single stranded DNA fragments.
  • f. Heteroduplex analysis: Heteroduplex DNA molecules are formed by the pairing of DNA strands that are complementary except at one (or a few) bases. Heteroduplex analysis involves assaying differences in electrophoretic mobility between heteroduplexes and homoduplexes.
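
The PCR-RFLP principle in item b can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the `digest` helper is hypothetical, the two 24-base "amplicons" are invented sequences, and the enzyme is assumed to be EcoRI, which recognizes GAATTC and cuts after the first G. A single base substitution inside the recognition site removes the cut and so changes the fragment pattern that would be seen on a gel.

```python
def digest(sequence, site, cut_offset):
    """Cut a linear DNA sequence at every occurrence of a recognition site.

    cut_offset is the position within the site where the enzyme cuts
    (1 for EcoRI, which cuts G^AATTC). Returns the fragment lengths,
    largest first, as they would be ordered from the top of a gel.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    # The remainder after the last cut (or the whole uncut fragment).
    fragments.append(len(sequence) - start)
    return sorted(fragments, reverse=True)


ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
ECORI_CUT = 1          # cut between G and AATTC

# Invented PCR products: allele 1 carries the intact site, allele 2 has
# a single A-to-G substitution inside it (GAATTC -> GAGTTC).
allele_1 = "ATGCCGTA" + "GAATTC" + "TTGACCGGTA"
allele_2 = "ATGCCGTA" + "GAGTTC" + "TTGACCGGTA"

pattern_1 = digest(allele_1, ECORI_SITE, ECORI_CUT)  # [15, 9]
pattern_2 = digest(allele_2, ECORI_SITE, ECORI_CUT)  # [24]
```

Under these assumptions a homozygote for allele 1 would show two bands, a homozygote for allele 2 one uncut band, and a heterozygote all three, which is how the genotype is read from the gel.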

Types of polymorphisms

a. Single-nucleotide polymorphisms (SNPs): The most prevalent type of polymorphism arises from a single base substitution and is called a single-nucleotide polymorphism (SNP). In most cases a SNP (pronounced “snip”) has two alternative forms (alleles) and results from a transition (purine to purine or pyrimidine to pyrimidine) or a transversion (purine to pyrimidine or vice versa). Insertions/deletions (indels) of one or two bases are also common. When a SNP lies in the recognition site of a restriction enzyme, it is called a restriction fragment length polymorphism (RFLP) or a restriction site polymorphism (RSP), because one allele creates or abolishes the recognition site and so changes the restriction fragments that the enzyme produces. The first genetic maps obtained by using DNA polymorphisms were based on RFLP markers. SNPs are catalogued in the dbSNP database of NCBI and designated with a reference SNP (rs) ID. The database is regularly updated as novel, population-specific polymorphisms are added. About 10 million SNPs have been catalogued in dbSNP for use in genotyping platforms. Some of the dbSNP entries are indels (2 of 12 million entries) (Strachan & Read, 2003). SNPs can either alter the amino acid that is encoded (non-synonymous substitution) or leave it unchanged (synonymous substitution).
b. Minisatellites: Minisatellites are markers composed of tandem repeats of DNA sequences, with repeat units 6-60 base pairs long. The polymorphisms in minisatellites result from unequal crossing over or gene conversion events. To detect them, genomic DNA is first digested with restriction enzymes, after which DNA probes containing a complementary minisatellite sequence are allowed to hybridize with the fragmented DNA. Minisatellites are highly polymorphic markers owing to their length, and this characteristic has led to their application in DNA fingerprinting. However, their use is limited to forensic investigations and paternity analysis, and this group of markers is not widely used in population genetic analysis.
c. Short-tandem repeat polymorphisms (STRPs or microsatellites): About 50% of the human genome consists of repeat sequences, both interspersed and in tandem. Satellites containing tandem repeats that are 1-6 nucleotides long are called microsatellites. These were the first PCR-based markers. They are highly polymorphic and widely distributed in the euchromatic part of the genome. These markers are popular in investigations directed at mapping, paternity analysis and population genetics. Microsatellites are formed by replication slippage, in which the DNA polymerase responsible for replication slips and repeats the replication of previous sequences. These markers, however, have complex mutation patterns and give rise to PCR artifacts that complicate band scoring after gel electrophoresis. While SNPs are molecular events that have remained stable over evolutionary time, tandem repeats are relatively recent (Strachan and Read, 2003). The interspersed repeat sequences are either short interspersed nuclear elements (SINEs), which are 100-300 base pairs long, or long interspersed nuclear elements (LINEs), which are 6-8 kilobases long. While the satellite markers have been at the forefront of family and forensic studies, the SINEs have been widely used in studying human genome diversity. The Alu family of the SINE class, estimated at 500,000 copies in humans, is found only in primates.

d. Copy Number Variations (CNVs): These are sub-microscopic structural variations on chromosomes that stretch to more than 1 kb. They vary in the number of copies among different individuals. Although large in size, CNVs are not always pathogenic. Structural variation as a class comprises intermediate-sized insertions, deletions and inversions as well as large (≥50 kb) copy number variations (Tuzun et al., 2005). CNVs are due to the occurrence of identical or nearly identical sequences of 1 kb or larger (Feuk & Scherer, 2006) in some chromosomes (Frazer, 2009). Though SNPs are more common, CNVs account for the greatest number of nucleotides that differ between two genomes (more than 70% of variant bases; Frazer, 2009). CNVs constitute 0.5-1% of the genome of an individual and so have a profound influence on genome evolution and health. A CNV that occurs in more than 1% of the population is called a copy number polymorphism. An inversion forms when a segment of DNA is reversed in orientation with respect to the rest of the genome. A change in the position of a chromosomal segment within a genome, on the same or a different chromosome, that keeps the DNA content unchanged leads to a translocation. The application of CNVs to the study of population history has been explored less than that of SNPs and microsatellites.
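
Returning to the single-base substitutions described under SNPs in item a, the distinction between transitions and transversions follows directly from the chemical class of the two bases. The sketch below is a minimal illustration; the function name `classify_substitution` is hypothetical and is not taken from any genotyping tool.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base substitution as a transition or a transversion.

    A transition swaps bases of the same class (purine for purine, or
    pyrimidine for pyrimidine); a transversion swaps a purine for a
    pyrimidine or the reverse.
    """
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternative bases are identical")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


examples = {
    ("A", "G"): classify_substitution("A", "G"),  # purine to purine
    ("C", "T"): classify_substitution("C", "T"),  # pyrimidine to pyrimidine
    ("A", "C"): classify_substitution("A", "C"),  # purine to pyrimidine
    ("T", "G"): classify_substitution("T", "G"),  # pyrimidine to purine
}
```

Of the twelve possible ordered substitutions among the four bases, four are transitions and eight are transversions, which the function reproduces when applied to every pair.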
