Scientific breakthroughs have profound effects on how we view and treat the world that surrounds us. At the microscopic level especially, scientific discoveries provide the only understanding we have of Nature. One such case is the story of our relationship with microorganisms.

First impressions

Since Louis Pasteur’s discoveries on invisible “infectious agents” more than 150 years ago, we have all suffered to some extent from germophobia. To this day, for the majority, microbes are synonymous with disease. From cleaning products in every household that claim to kill 99.9% of bacteria, to the ever-present sterilizing hand lotions that pop out of purses and welcome us in public buildings, we are at war with microbes.

A misguided war

Our declaration of war against these microscopic organisms has, however, forgotten to mention that our civilization, even our own existence, desperately depends on them. We rely on microorganisms for culturally important – even luxurious – fermentation products such as wine, cheese and chocolate. Few of those eagerly consuming fashionable probiotic foods realize that the practice is actually nurturing the balanced communities of more than 1,000 species of “good bacteria” that live harmoniously in our guts, adding to the tens of trillions of microbes in our body. Yikes?

Microscopic universe

Let’s take a step back. Life on Earth appeared about 3.5 billion years ago in the form of primitive, non-nucleated microbes, and has since evolved from and around these invisible critters. Ancient single-celled organisms invented photosynthesis and passed this gift on to plants, the basis of our food chain. Some don’t even need photosynthesis, as they evolved ways to feed on inorganic materials such as sulfur or, indeed, directly on electrons, thriving in the most hostile environments. Some can even survive in the vacuum of space. Microbes live everywhere around and inside us, outnumbering our own cells in our own body and performing important ecological functions that tailor the environment to our comfort zone. I hope I’ve started to make a case for their importance.

Microbial dark matter

For all their recognized abundance and significance, science has only scratched the surface of studying this varied class of organisms. The main reason is that, despite their omnipresence, very few species can be cultivated and isolated in the laboratory, where they can be studied.


For a long time, the only way around this problem of unculturable microorganisms was the detection of subtle differences in the conserved ribosomal RNA (rRNA) genes (see Exploring The Frontiers: An Introduction to Metagenomics). The method is relatively limited in scope: it is good for counting and classifying the microorganisms in a community, but it cannot zoom into their exotic biochemistries. Even so, rRNA sequencing allowed, for the first time, a good look at the microbial dark matter, giving rise to the field of metagenomics.
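To make the idea of classifying by "subtle differences" a little more concrete, here is a minimal Python sketch. It assumes the rRNA fragments have already been aligned to equal length, and it simply assigns a query to the reference it matches at the most positions. The sequences and the names `taxon_A` and `taxon_B` are invented for illustration; real pipelines rely on proper alignment and curated reference databases.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def classify(query: str, references: dict[str, str]) -> str:
    """Return the name of the reference sequence most similar to the query."""
    return max(references, key=lambda name: percent_identity(query, references[name]))


# Hypothetical, made-up rRNA gene fragments (not real sequences).
references = {
    "taxon_A": "ACGTTGCAAGCT",
    "taxon_B": "ACGTAGCTAGCT",
}
query = "ACGTTGCAAGGT"

best_match = classify(query, references)
print(best_match, round(percent_identity(query, references[best_match]), 1))
```

Counting how many reads fall to each reference gives the kind of community census described above, but it says nothing about what those organisms can actually do.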

The second breakthrough

Useful as this approach has been, scientists’ curiosity only grew. Now that we can catch a glimpse of this invisible universe, what secrets can we learn about the distinct biology that drives these exotic ‘species’? How has nature used the entire palette of DNA sequence combinations in the evolution of such diversity?

Shotgun genome sequencing in metagenomics

It wasn’t long before researchers in metagenomics recognized the potential benefits of reading through the entire genomes of these microbial mega-communities and not just a couple of genes. With our current understanding of genes and genome organization, we can use the knowledge we obtain from sequencing entire genomes to reverse-engineer these organisms, unraveling the nuts and bolts of their existence.

The nitty-gritty of metagenomic shotgun sequencing

For those familiar with next-generation sequencing (NGS), the first step in sequencing these diverse populations is straightforward: DNA is isolated from the entire sample, and a library representing all of its microbial populations is constructed and sequenced using standard protocols. The tricky part comes after sequencing, when this mash-up of short sequences needs to be assembled back into unique genomes, each corresponding to the organism from which it originated.

Each short sequenced fragment in the mixture has to be computationally assembled into longer, species-specific contigs. This step relies on the fact that sequenced fragments originating from the same chromosomal DNA will have identical, overlapping ends. Assembling these overlapping fragments into contigs is like fitting together the pieces of a single puzzle among millions of pieces from other puzzles. Additional genomic features help in grouping contigs as belonging to the same ‘species’: sequence similarity to a closely related species whose genome can serve as a scaffold, characteristic GC-content, and gene density.
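The overlap idea can be illustrated with a toy Python sketch. It greedily merges the two sequences that share the longest suffix-to-prefix overlap until no overlap of at least `min_len` bases remains, and it includes a small GC-content helper of the kind that can help tell contigs apart. The reads below are invented, error-free and taken from a single strand; the function names and the `min_len` cutoff are assumptions made for this example. Real assemblers have to cope with sequencing errors, repeats and both DNA strands, and use far more efficient methods.

```python
def overlap_len(a: str, b: str, min_len: int = 4) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 4) -> list[str]:
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap_len(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)


def gc_content(seq: str) -> float:
    """Fraction of bases in `seq` that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# Made-up reads from two hypothetical genomes, one AT-rich and one GC-rich.
reads = ["AATTCAT", "TCATAGT", "ATAGTTA", "GGCACGC", "ACGCCGT", "GCCGTCG"]
for contig in greedy_assemble(reads):
    print(contig, round(gc_content(contig), 2))
```

In this toy case the six reads collapse into two contigs whose GC-content differs sharply, which is the kind of signal that helps assign contigs to different ‘species’.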

A changing view of the microbial world

With the introduction of shotgun genome sequencing in metagenomic research, we have been able to probe the underlying biochemistry that allows microbial communities to survive in extreme and hostile environments, as well as their role in large biogeochemical cycles. We have discovered clear signs of extensive horizontal gene transfer across different microbial ‘species’ and other hybrid metabolic characteristics that blur the lines between traditionally well-defined taxonomic groups.

A friend and not a foe

Zoom in on our own microbial flora and you will realize that microbes are much more of a friend than a foe. For instance, it turns out that every healthy individual has one of three characteristic microbial populations in their intestine (called enterotypes). These microbes rely on us for their energy requirements and in return produce essential vitamins for us. Disturbances in the natural, balanced microbial communities within our bodies are linked to Crohn’s disease, colon cancer and asthma, among others. The types of bacteria that coexist in our intestines are influenced by our body mass index (BMI), but they also affect our propensity for obesity. And we have only scratched the surface of our dependence on microbes!

Persisting problems…

The main problem researchers currently face when embarking on shotgun genome sequencing experiments is how to correctly assemble the mix of sequenced fragments into individual genomes. Contigs do not come close to the length of whole chromosomes, and homologies between genomic regions of different sequenced microorganisms can undermine the assembly process. Moreover, organizing different contigs into organism-specific groups is not always trivial, and the presence of plasmids in many of these organisms only complicates the process further.

…and inventive solutions

A multitude of computational and statistical approaches aim to solve the problems associated with metagenomic shotgun sequencing. A more head-on solution is to first sort the microbial sample into single cells and then sequence each genome separately. But, given the enormous population sizes involved and the variety of taxa within them, this method can quickly become cumbersome and financially prohibitive. The most promising, and perhaps most inventive, solution to the problem of correctly assembling the individual genomes in metagenomic populations is to use higher-order chromatin structure information (Hi-C; see Tower of Babel: Next Generation Sequencing Provides New Insights on Chromosome Construction).

Hi-C assisted metagenomics

The groups of Dr Jonathan Eisen, Dr Maitreya Dunham and Dr Jay Shendure are using the contact-point information between and within chromosomes to study the composition of synthetic metagenomic populations. To organize contigs into large sub-chromosomal fragments, these groups exploit the fact that genomic regions that are close to each other on the chromosome tend to have a higher frequency of Hi-C linkage points. Moreover, because cross-linking of the chromatin in these experiments takes place before cell lysis, Hi-C contact maps of metagenomic populations reveal which pieces of DNA are packed together in the same cell and therefore belong to the same ‘species’.


The computational pipeline processes the regular shotgun sequencing reads and the Hi-C reads in parallel. Once the shotgun reads are assembled into contigs, Hi-C linkage information is used to group those contigs into entire chromosomes and genomes with statistical support. Using this approach, the Eisen group reported for the first time the correct grouping of plasmids with their corresponding microbial genome.
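As a rough illustration of how linkage information can group contigs by cell of origin, the Python sketch below treats contigs as nodes and joins any two that share at least `min_links` Hi-C read pairs. This is a deliberately simplified stand-in, not the pipeline used by these groups: the contig names, link counts and threshold are all hypothetical, and published methods normalize the counts and use more careful clustering statistics.

```python
from collections import defaultdict


def cluster_contigs(contigs, link_counts, min_links=5):
    """Group contigs connected by at least `min_links` Hi-C read pairs."""
    parent = {c: c for c in contigs}

    def find(c):
        # Follow parent pointers to the group representative,
        # shortening the path along the way.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for (a, b), n in link_counts.items():
        if n >= min_links:  # weak links are treated as noise
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for c in contigs:
        groups[find(c)].add(c)
    return list(groups.values())


# Hypothetical contigs: two 'species', one of which carries a plasmid.
contigs = ["sp1_c1", "sp1_c2", "sp1_plasmid", "sp2_c1", "sp2_c2"]
link_counts = {
    ("sp1_c1", "sp1_c2"): 40,
    ("sp1_c1", "sp1_plasmid"): 12,  # plasmid cross-linked inside the same cell
    ("sp2_c1", "sp2_c2"): 35,
    ("sp1_c2", "sp2_c1"): 2,        # spurious contact between cells
}
for group in cluster_contigs(contigs, link_counts):
    print(sorted(group))
```

Because the plasmid was cross-linked inside the same cell as its host chromosome, it ends up in the same group as the host’s contigs, which is the intuition behind the plasmid result mentioned above.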

A bright future for microbial dark matter

So far, this approach has only been tested in populations of at most eighteen microbial ‘species’. It does, however, have the capacity to handle much more complex communities, even communities composed of extremely diverse microorganisms. The Dunham and Shendure groups were able to carry out their experiments on a mixture of microorganisms from all three domains of life: bacteria, archaea and eukaryotes. “There are many ecosystems in which eukaryotes and prokaryotes are present. Just a few examples: the human microbiome, microbial communities associated with plants, and, one of my personal favorites as a yeast geneticist, fermentation samples”, explains Dr Dunham.

Lessons from science

Metagenomics is a relatively new field. As we discover more about the lifestyles of these humble organisms we come to better appreciate the immense inventiveness and resilience of life. More than that, we realize how dependent we are on these invisible allies.

The microbiology of the 19th century established that illness has specific causes and can therefore be prevented. Perhaps the lesson that 21st century science teaches us is not to destroy something we barely understand.