Introduction to DNA Template Design
Process of DNA replication. Enzymes attach at the origin of the unwound double helix, where one or more RNA primers mark the start of synthesis, and dNTPs are added in the 5' to 3' direction. The leading strand is synthesized continuously from a single primer, whereas the lagging strand is synthesized in short segments known as Okazaki fragments. After primer removal, DNA ligase joins these segments, yielding two identical copies of the DNA, each containing one parental strand and one newly replicated strand. Figure made in BioRender.
DNA replication is the synthesis of new DNA complementary to both strands of the parental molecule. The two strands of double-helical DNA run in opposite, or antiparallel, directions. This means that one new strand can be synthesized in the 5' to 3' direction (the leading strand) while the other, in theory, would have to be synthesized in the opposite direction (the lagging strand).
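The antiparallel, complementary relationship between a parental strand and its newly synthesized partner can be illustrated with a short Python sketch. The function name and the example sequence are hypothetical and chosen only for illustration.

```python
# Map each base to its Watson-Crick partner (A-T, G-C).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, also written 5' to 3'.

    Because the two strands are antiparallel, the complement has to be
    reversed so that both sequences read in the 5' to 3' direction.
    """
    return strand.upper().translate(COMPLEMENT)[::-1]


parental = "ATGGCT"  # hypothetical parental strand, written 5' to 3'
new_strand = reverse_complement(parental)
print(new_strand)  # AGCCAT
```

Applying the function twice returns the original sequence, which reflects the fact that each strand fully determines its partner.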
However, DNA polymerase catalyzes the polymerization of dNTPs only in the 5' to 3' direction, so how does this work? In reality, the lagging strand is formed from small, discontinuous pieces of DNA that are synthesized backward with respect to the movement of the replication fork. These pieces, termed Okazaki fragments after their discoverer, are each started from an RNA primer. Once the Okazaki fragments are synthesized, the RNA primers are removed and replaced with DNA, and DNA ligase connects the fragments with phosphodiester linkages to form the intact lagging strand. The leading strand, in contrast, is synthesized continuously, and its elongation occurs in the same direction as the movement of the replication fork.
For leading strand synthesis, a single primer is needed, whereas multiple RNA primers are required for lagging strand synthesis. After initial primer synthesis, the leading strand needs only DNA polymerase for replication to continue. In contrast, the lagging strand needs multiple enzymes in addition to DNA polymerase, including both RNase H and ligase.
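As a rough illustration of discontinuous synthesis, the sketch below copies a lagging-strand template in fixed-size chunks and then joins the resulting fragments. It is a toy model under stated assumptions: the fixed fragment length, the function names, and the example sequence are hypothetical, and primer synthesis and removal are not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return strand.upper().translate(COMPLEMENT)[::-1]


def okazaki_fragments(template: str, fragment_length: int) -> list:
    """Copy the lagging-strand template chunk by chunk as the fork exposes it.

    `template` is written 5' to 3' in the direction of fork movement, so each
    exposed chunk is copied backward relative to the fork. Every fragment is
    returned written 5' to 3'.
    """
    return [
        reverse_complement(template[i:i + fragment_length])
        for i in range(0, len(template), fragment_length)
    ]


def ligate(fragments: list) -> str:
    """Join fragments into one strand (the role played by DNA ligase).

    The most recently made fragment lies at the 5' end of the new strand,
    so the fragments are joined in reverse order of synthesis.
    """
    return "".join(reversed(fragments))


template = "ATGGCTCTGGAA"  # hypothetical lagging-strand template
fragments = okazaki_fragments(template, fragment_length=4)
lagging_strand = ligate(fragments)
print(fragments)       # ['CCAT', 'AGAG', 'TTCC']
print(lagging_strand)  # TTCCAGAGCCAT
```

The joined fragments are identical to the strand that continuous synthesis would have produced from the same template, which is the point of the ligation step.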
DNA Design
For an efficient protein expression experiment, it is important to keep the design of the DNA template in mind, as certain nucleotide or amino acid sequences can reduce the yield of the protein of interest. During translation, cells decode mRNAs by reading nucleotides in groups of three, known as codons. Each codon specifies an amino acid; the mRNA is read in the 5' to 3' direction, and the protein is built from the N-terminus to the C-terminus. For a given protein, a 'start' codon (typically AUG) marks the beginning and one of three 'stop' codons (UAA, UAG, or UGA) marks the end. It is recommended to design the template DNA for the organism from which the translation machinery is derived. For example, if ribosomes, tRNAs, and translation factors are derived from Escherichia coli, the codons used should be appropriate for E. coli translation. In particular, for E. coli, CTG is the preferred codon for leucine, and the most common start codons for known E. coli genes are AUG, GUG, and UUG.
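The way codons are read can be sketched in Python as follows. This is a minimal illustration that assumes the standard genetic code and, for simplicity, accepts only ATG as the start codon; the function name and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a coding sequence from its start codon to the first stop.

    Accepts DNA or mRNA letters. Codons are read 5' to 3', and the protein
    is returned from N-terminus to C-terminus in one-letter code.
    """
    seq = coding_sequence.upper().replace("U", "T")
    if not seq.startswith("ATG"):
        # Simplifying assumption: alternative starts (GTG, TTG) are rejected.
        raise ValueError("sequence does not begin with an ATG start codon")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":  # TAA, TAG, or TGA ends the protein
            return "".join(protein)
        protein.append(residue)
    raise ValueError("no in-frame stop codon found")


print(translate("ATGGCTCTGTAA"))  # MAL (Met-Ala-Leu, with CTG encoding Leu)
```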
The sequence just after the start codon is especially important for efficient translation:
- Codons should be chosen to reduce guanine/cytosine (GC) content at the beginning of genes (e.g., usually up to the 6th codon), where it is beneficial to have a higher level of adenine/thymine (AT) residues.
- It is also important to design the DNA so that rigid secondary structures do not form in the region around the start codon, since this region regulates the translation efficiency of the protein.
- If the mRNA forms a rigid secondary structure around the start codon, binding of the mRNA to the ribosome may be impaired, thereby decreasing the yield of protein.
Many open-source mathematical models and software tools are now available that can generate predictions of secondary structures in the mRNA.
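The GC guideline in the list above can be checked with a few lines of Python. This sketch only counts bases and does not predict secondary structure. The six-codon window (counted here including the start codon), the function names, and the example sequences are assumptions made for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def early_gc(coding_dna: str, n_codons: int = 6) -> float:
    """GC fraction of the first `n_codons` codons, start codon included."""
    return gc_fraction(coding_dna[:3 * n_codons])


# Two hypothetical designs for the beginning of a gene.
at_rich = "ATGAAAACTATTGAATTAGGCGGC"
gc_rich = "ATGGGCCCGGCGCGCGGCAAAACT"

print(round(early_gc(at_rich), 2))  # 0.17
print(round(early_gc(gc_rich), 2))  # 0.89
```

Under the guideline, the first design would be the better starting point, although no specific GC threshold is implied here.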
It is also important to know that encoding certain residues in a DNA sequence has the potential to destabilize the protein and inhibit expression. Proline and glycine residues, for example, are frequently found in turn and loop structures of proteins and are thought to play a role during chain compaction in folding. Proline and glycine have been shown to disrupt the regular secondary structure of proteins, particularly in α-helices and β-sheets, where they may break stable, normal hydrogen bonding patterns.
Note: For this reason, it is important to avoid using proline and/or glycine in the region after the start codon.
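A simple check for this recommendation is sketched below. The window of five residues after the initiator methionine is an assumption, since the text does not define how far "the region after the start codon" extends; the function name and example sequences are likewise hypothetical.

```python
DISFAVORED = {"P": "proline", "G": "glycine"}


def flag_early_pro_gly(protein: str, window: int = 5) -> list:
    """Report proline and glycine residues near the N-terminus.

    Examines the `window` residues that follow the initiator methionine and
    returns (position, residue name) pairs, with positions counted from 1.
    """
    hits = []
    for position, residue in enumerate(protein.upper()[1:1 + window], start=2):
        if residue in DISFAVORED:
            hits.append((position, DISFAVORED[residue]))
    return hits


print(flag_early_pro_gly("MPGKLA"))     # [(2, 'proline'), (3, 'glycine')]
print(flag_early_pro_gly("MDEVAYLPG"))  # []
```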
More commonly, aspartic acid, glutamic acid, valine, alanine, tyrosine, and/or leucine should be used in this region of the DNA template. It is also important to choose sequences that avoid the potential for frameshift mutations. Frameshift mutations, such as the insertion or deletion of a single nucleotide, alter the reading frame and completely change the downstream amino acid sequence.
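The effect of a frameshift can be seen by deleting a single nucleotide from a short coding sequence and translating both versions. This is a minimal sketch assuming the standard genetic code; the helper names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_until_stop(dna: str) -> str:
    """Translate complete codons from position 0 until a stop codon or the end."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def delete_base(dna: str, index: int) -> str:
    """Return the sequence with the nucleotide at `index` (0-based) removed."""
    return dna[:index] + dna[index + 1:]


original = "ATGGCTCTGGAATAA"          # ATG GCT CTG GAA TAA
mutant = delete_base(original, 3)     # ATG CTC TGG AAT AA

print(translate_until_stop(original))  # MALE
print(translate_until_stop(mutant))    # MLWN
```

Every residue after the deletion differs, and the original stop codon is no longer read in frame.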
Products
Table 1. Deoxynucleotides (dNTPs) for use in PCR, real-time PCR, and reverse transcription PCR
Original created on February 14, 2024, last updated on February 14, 2024