Genetic Notation: Crack the Code!

Pop Quiz Time: You get a new bacterial strain from a culture collection, but you’re not quite sure what the genetic notation (i.e., all the letters and symbols) means. Do you:

A. Cry?

B. Ask around to see what your lab mates think?

C. Cross your fingers that your friends at Bitesize Bio can help you out?

Well, I hope you chose answer “C”, because that’s exactly what this article is all about!

Thankfully, there is a standard nomenclature for bacterial genes dating back to 1966.¹ However, there are a lot (and I do mean A LOT) of rules for naming (and thus reading) genetic and phenotypic mutations. This makes sense given all the possible gene and loci alterations scientists can introduce! For the sake of simplicity, this article will focus on the most common types of genetic notation that we meet as biologists. You will also get some useful tips that can help you to decipher what it is you are reading. So let’s roll up our sleeves and dig in!

Tip #1: Let the Basics Be Your Guide

If you come across a strange 3-letter abbreviation in your strain name, have no fear! This notation simply designates a gene of interest (i.e., one that has been mutated or inserted during the generation of your strain). Each gene is assigned a lowercase 3-letter designation that is usually an abbreviation for the pathway affected or the phenotype resulting from the mutation/insertion. If you’re ever confused about what an abbreviation means, check out the HUGO Gene Nomenclature Committee website. To start you off, we have listed some common examples in Table 1 below.²

Table 1. Common gene abbreviations

Biosynthetic genes
ala	alanine
arg	arginine
asn	asparagine
gua	guanine
pur	purines
pyr	pyrimidine
thy	thymine
bio	biotin
nad	NAD
pan	pantothenic acid
Catabolic genes
ara	arabinose
gal	galactose
lac	lactose
mal	maltose
man	mannose
mel	melibiose
rha	rhamnose
xyl	xylose

If different genes affect the same pathway, they are distinguished by a capital letter following the 3-letter designation. For example, mutations affecting pyrimidine biosynthesis are designated pyr; the pyrC gene encodes the enzyme dihydroorotase and the pyrD gene encodes the enzyme dihydroorotate dehydrogenase.
If several mutations are introduced into a pathway, each is consecutively assigned a unique allele number. For example, pyrC19 refers to a particular pyr mutation that affects the pyrC gene. To keep each mutation distinct, no other pyr mutation, regardless of the gene affected, will be assigned the allele number 19. A separate series of allele numbers is used for each 3-letter locus designation.
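
To make the pattern concrete, here is a minimal Python sketch that splits a designation such as pyrC19 into its 3-letter locus, gene letter, and allele number. The function name parse_allele and the regular expression are illustrative assumptions, not part of any official nomenclature tool, and the sketch only handles the simple "three lowercase letters, optional capital letter, optional number" form described above.

```python
import re

# Assumed pattern: 3 lowercase letters, an optional capital gene letter,
# and an optional allele number (e.g. "pyr", "pyrC", "pyrC19").
ALLELE_RE = re.compile(r"([a-z]{3})([A-Z]?)(\d*)")


def parse_allele(name):
    """Split a simple gene designation into locus, gene, and allele number."""
    match = ALLELE_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a simple gene designation: {name!r}")
    locus, gene_letter, allele = match.groups()
    return {
        "locus": locus,
        "gene": locus + gene_letter if gene_letter else None,
        "allele": int(allele) if allele else None,
    }


print(parse_allele("pyrC19"))
# {'locus': 'pyr', 'gene': 'pyrC', 'allele': 19}
```

Anything more elaborate (insertions, fusions, phenotype superscripts) will be rejected by this sketch rather than guessed at.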

Tip #2: Amino Acid Mutations Are a Thing

As shown in Table 1, genes for amino acid biosynthesis are common targets of mutation. The amino acids within a protein can be changed too, and given that they are the building blocks of proteins, that makes complete sense when you’re trying to alter a specific phenotype. Now, remember when your biochemistry professor had you memorize the single-letter abbreviations for all the amino acids? This is when you get to use that information! Let’s say that a point mutation results in alanine (A) at position 235 where threonine (T) used to be. This would simply be noted as “T235A”. Easy peasy.
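
If you would rather let a script do the decoding, the following Python sketch spells out a substitution such as T235A in words. The helper name describe_substitution is made up for illustration, and the sketch assumes the simple "original residue, position, new residue" form with the standard single-letter codes for the 20 common amino acids.

```python
import re

# Standard single-letter codes for the 20 common amino acids.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartate",
    "C": "cysteine", "Q": "glutamine", "E": "glutamate", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}

# Assumed form: original residue, position, new residue (e.g. "T235A").
SUBSTITUTION_RE = re.compile(r"([A-Z])(\d+)([A-Z])")


def describe_substitution(code):
    """Return a plain-English description of a substitution like 'T235A'."""
    match = SUBSTITUTION_RE.fullmatch(code)
    if match is None:
        raise ValueError(f"not a simple substitution: {code!r}")
    old, position, new = match.groups()
    if old not in AMINO_ACIDS or new not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid code in {code!r}")
    return (f"{AMINO_ACIDS[old]} ({old}) at position {position} "
            f"replaced by {AMINO_ACIDS[new]} ({new})")


print(describe_substitution("T235A"))
# threonine (T) at position 235 replaced by alanine (A)
```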

Tip #3: Go with the Most Obvious Answer

Every so often biologists come across naming schemes that actually make sense! Specifically, this can occur when the protein that a gene encodes is known and can thus become part of the gene name. For example:

rpoA encodes the α-subunit of RNA polymerase
rpoB encodes the β-subunit of RNA polymerase
polA encodes DNA polymerase I
polC encodes DNA polymerase III
rpsL encodes ribosomal protein, small S12

Seems pretty straightforward, doesn’t it? Great! Now, what about if your mutation is actually the result of an insertion? Well, reading that naming scheme can get a little tricky depending on the exact location of the insertion. Read on!

Tip #4: Break out the Rosetta Stone

In addition to the 3-letter abbreviations, one of the most confusing aspects of genetic notation is the symbols. Some make perfect sense, like “+” for wild type and “-” for mutation; others not so much. For now, Table 2 can serve as a cheat sheet for the most commonly used genetic notation symbols.

Table 2. Common symbols used in genetic notation

+	Wild type
=	Identical to reference sequence (no change, wild type sequence)
?	Unknown
/	Mosaic cases; separator between the different nucleotides, transcripts, and proteins generated from one allele
//	Chimeric cases; separator between different nucleotides, transcripts, and proteins generated from a mix of four alleles
( )	Indicates uncertainty in the description of a change
0 (zero)	Indicates no product/nothing
–	Mutated gene
*	Translation termination (stop) codon
_	Nucleotide numbering, used to indicate a range
Δ	Deletion
–	Fusion
:	Fusion
::	Insertion
Ω	Genetic construct introduced by a two-point cross-over
>	Substitution (for bases)
–	Range
;	Separator between different changes in one allele or between two alleles
,	Separator between different transcripts or proteins generated from one allele
am	Amber mutation
con	Conversion
cs	Cold sensitive
del	Deletion
dup	Duplication
ext	Extension
fsX	Frame shift
ins	Insertion
inv	Inversion
o	Opposite strand
oc	Ochre mutation
R	Resistant
sup	Suppressor
t	Translocation
ts	Temperature sensitive
um	Umber (opal) mutation
X	Stop codon

Tip #5: Chromosome Rearrangements Are Goofy

Chromosome rearrangements refer to deletions, duplications, and inversions of genes (Table 2). You will recognize these guys as a 3-letter notation, indicating which type of rearrangement you’re dealing with, followed by the corresponding genes in parentheses, and then the allele number:

Deletions = DEL(genes)allele number
Inversions = INV(join point gene #1 – join point gene #2)allele number
Duplications = DUP(gene #1*join point*gene #2)allele number
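
The three patterns above can be pulled apart with a short Python sketch. The function name parse_rearrangement and the example strings are hypothetical, made up only to exercise the pattern, and the sketch returns whatever sits inside the parentheses as-is rather than trying to interpret the join points.

```python
import re

# Assumed form: DEL, INV, or DUP, then genes in parentheses, then an allele number.
REARRANGEMENT_RE = re.compile(r"(DEL|INV|DUP)\((.+)\)(\d+)")

KINDS = {"DEL": "deletion", "INV": "inversion", "DUP": "duplication"}


def parse_rearrangement(text):
    """Return (kind, genes, allele_number) for notation like 'DEL(genes)12'."""
    match = REARRANGEMENT_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognized rearrangement: {text!r}")
    prefix, genes, allele = match.groups()
    return KINDS[prefix], genes, int(allele)


# Hypothetical examples, not taken from a real strain description.
print(parse_rearrangement("DEL(hisD)100"))
# ('deletion', 'hisD', 100)
print(parse_rearrangement("INV(hisD-trpA)7"))
# ('inversion', 'hisD-trpA', 7)
```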

Tip #6: Biologists Use a Lot of Antibiotics

If you are already familiar with the designation of antibiotic resistance or sensitivity, great! If not, Table 3 below will help you out with abbreviations for the most commonly used antibiotics for developing sensitive or resistant strains.

Table 3. Common antibiotic resistance designations and related terms

Abbreviation	Antibiotic
amp	Ampicillin
azi	Azide
bla	Beta-lactam
cam/cat	Chloramphenicol
gen	Gentamicin
kan	Kanamycin
neo	Neomycin
rif	Rifampicin
spc	Spectinomycin
str	Streptomycin
tet	Tetracycline
tonA	Phage T1
zeo	Zeocin
XG	X-gal
XP	X-phosphate
R	Resistance
S	Sensitivity
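
As a quick illustration of how Table 3 is used in practice, the Python sketch below decodes a marker such as kanR into plain English. It assumes the marker is written as an antibiotic abbreviation followed directly by R or S (in print the letter is often a superscript), covers only some of the antibiotics listed above, and uses a made-up helper name, decode_marker.

```python
# A subset of the antibiotic abbreviations from Table 3.
ANTIBIOTICS = {
    "amp": "Ampicillin",
    "cam": "Chloramphenicol",
    "cat": "Chloramphenicol",
    "gen": "Gentamicin",
    "kan": "Kanamycin",
    "neo": "Neomycin",
    "rif": "Rifampicin",
    "spc": "Spectinomycin",
    "str": "Streptomycin",
    "tet": "Tetracycline",
}

PHENOTYPES = {"R": "resistance", "S": "sensitivity"}


def decode_marker(marker):
    """Translate a marker like 'kanR' into 'Kanamycin resistance'."""
    # Assumed layout: everything but the last character is the abbreviation.
    abbreviation, flag = marker[:-1].lower(), marker[-1:]
    if abbreviation not in ANTIBIOTICS or flag not in PHENOTYPES:
        raise ValueError(f"unrecognized marker: {marker!r}")
    return f"{ANTIBIOTICS[abbreviation]} {PHENOTYPES[flag]}"


print(decode_marker("kanR"))
# Kanamycin resistance
print(decode_marker("strS"))
# Streptomycin sensitivity
```

The abbreviation is lowercased before lookup so that phenotype-style spellings such as KanR decode the same way.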

Whew! That was A LOT of information! And, as mentioned at the beginning, we have only covered the tip of the iceberg: the most common mutations, symbols, and nomenclature that you may come across in the genetic notation of a new strain. If you don’t see your strain’s notation here and you need more help, check out the list of references below. Go forth and decipher!

References and Further Reading

Demerec M, Adelberg EA, Clark AJ, Hartman PE. A proposal for a uniform nomenclature in bacterial genetics. Genetics. 1966; 54(1):61-76.
Birge EA. Bacterial and Bacteriophage Genetics. 2005. 5th Ed. Springer-Verlag New York. DOI: 10.1007/0-387-31489-X.
International Committee on Standardized Genetic Nomenclature for Mice. Guidelines for Nomenclature of Genes, Genetic Markers, Alleles, and Mutations in Mouse and Rat.
Rice University. Genetic nomenclature for Drosophila melanogaster.
American Society for Microbiology. Journal of Bacteriology.
The Arabidopsis Information Resource (TAIR). Arabidopsis Nomenclature.
den Dunnen JT, Dalgleish R, Maglott DR, Hart RK, Greenblatt MS, McGowan-Jordan J, et al. HGVS Recommendations for the Description of Sequence Variants: 2016 Update. Hum Mutat. 2016; 37(6):564-9.

Kristen Haberthur

Kristen is a biomedical research scientist by trade with a PhD in Viral Immunology. Enthusiastic science communicator and teacher. Currently adjunct faculty in the Department of Biology at the University of Portland.
