phylogenetics | BacPathGenomics

I just came across this very cool visual dictionary of tree visualisation methods – treevis, http://www.informatik.uni-rostock.de/~hs162/treeposter/poster.html, which made me think about the language of phylogenetic trees. It’s interesting to think how our ideas about phylogenetic inference, and the ways in which we interpret trees, are influenced by the way the tree is represented.

Whether studying bacterial populations or bacterial communities, we tend to use phylogenetics to break down, comprehend and represent the relationships between bacterial genomes. A basic phylogenetic tree can capture so much information… a great example is the recent Nature paper on the seventh cholera pandemic, out of the Sanger Institute (see the PubMed entry; the paper itself is behind the NPG paywall). The phylogenetic tree structure showing relationships between Vibrio cholerae strains (obtained of course by whole genome sequencing with Illumina), together with the time and location at which each strain was isolated, reveals an incredible amount of detail about how the pandemic has spread around the world over the last few decades.

But interpreting trees can be difficult. And the way the trees are represented can make them more or less difficult to interpret. As a simple example, I’m often surprised how many people are unaware that these two trees are just alternative representations of the same structure:

(If you don’t believe me, copy the tree structure below into a text file, open it in a tree viewer like Dendroscope or FigTree, and click around the different representations.)

((A:0,B:0):0.2,(C:0.5,((J:0,K:0):2,(D:8,(E:12,(F:5,(G:5,(H:0.5,I:0):3):1):3):2):5):2):0.2);

In the unrooted tree on the right, it’s easy to see, for example, that E is about equidistant from all other leaves on the tree. From the (randomly) rooted tree on the left, this is less apparent and takes some thought; many people interpret it as E being closer to F, G, H & I than to the other leaves. Granted, a proper rooting of the dendrogram on the left would help, but it’s still a good example of how the visual representation makes interpretation more or less intuitive.
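If you would rather check the "E is about equidistant" claim with numbers than by eye, here is a minimal sketch in plain Python. It is only an illustration under some assumptions: the tiny parser (`parse_newick`) and the helper `distances_from` are names I made up for this post, and the parser handles only the simple Newick subset used in the tree above (leaf names and branch lengths, no internal labels, quotes or comments). It treats the tree as an unrooted graph and sums branch lengths along the path from E to every other leaf, so the position of the root makes no difference to the result.

```python
from collections import defaultdict

# The Newick string given earlier in the post.
NEWICK = ("((A:0,B:0):0.2,(C:0.5,((J:0,K:0):2,(D:8,(E:12,(F:5,(G:5,"
          "(H:0.5,I:0):3):1):3):2):5):2):0.2);")


def parse_newick(newick):
    """Return (edges, leaves) for a simple Newick string.

    edges is a list of (parent, child, branch_length) tuples.
    Internal nodes get made-up names (node0, node1, ...).
    """
    text = newick.strip().rstrip(";")
    pos = 0
    counter = 0
    edges, leaves = [], []

    def read_node():
        nonlocal pos, counter
        if text[pos] == "(":
            # Internal node: read children until the matching ")".
            name = f"node{counter}"
            counter += 1
            pos += 1
            while True:
                child, length = read_node()
                edges.append((name, child, length))
                separator = text[pos]
                pos += 1
                if separator == ")":
                    break
        else:
            # Leaf: the name runs up to the next ":", "," or ")".
            start = pos
            while pos < len(text) and text[pos] not in ":,)":
                pos += 1
            name = text[start:pos]
            leaves.append(name)
        # Optional branch length after ":" (defaults to 0).
        length = 0.0
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",)":
                pos += 1
            length = float(text[start:pos])
        return name, length

    read_node()
    return edges, leaves


def distances_from(start, edges):
    """Sum branch lengths along the unique path from start to every node."""
    neighbours = defaultdict(list)
    for parent, child, length in edges:
        neighbours[parent].append((child, length))
        neighbours[child].append((parent, length))
    dist = {start: 0.0}
    todo = [start]
    while todo:
        node = todo.pop()
        for other, length in neighbours[node]:
            if other not in dist:
                dist[other] = dist[node] + length
                todo.append(other)
    return dist


edges, leaves = parse_newick(NEWICK)
dist_from_e = distances_from("E", edges)
for leaf in leaves:
    if leaf != "E":
        print(f"E to {leaf}: {dist_from_e[leaf]:g}")
```

For this tree, the distances from E to the other leaves all fall between 19 (E to I) and 22 (E to D), which is what the unrooted drawing suggests.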

Many of the representations in treevis were created by computer scientists for purposes entirely unrelated to phylogenetics, so it would probably take a bit of effort to apply them to your favourite phylogeny… but it could be worth it, depending on what you are trying to convey.

An easy option for overlaying annotations of all kinds onto traditional phylogenetic tree structures (rectangular, circular and unrooted) is iTOL, the Interactive Tree Of Life. It’s a handy web tool where you can upload your tree file in Newick format (like the one given above) or NEXUS format, plus some text files of annotations for nodes or leaves, and display the annotations overlaid on the tree in all kinds of cool ways. For examples, see the iTOL paper in Nucleic Acids Research’s web server issue (open access), or the iTOL website, which has loads more and lets you try it out.

(Figure 1 from “Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy”, Ivica Letunic & Peer Bork, NAR 39 (S2):W475-W478)

