<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>BacPathGenomics</title>
	<atom:link href="https://bacpathgenomics.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>https://bacpathgenomics.wordpress.com</link>
	<description>Genomics and evolution of bacterial pathogens</description>
	<lastBuildDate>Sat, 25 May 2013 06:47:30 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='bacpathgenomics.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>https://s2.wp.com/i/buttonw-com.png</url>
		<title>BacPathGenomics</title>
		<link>https://bacpathgenomics.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="https://bacpathgenomics.wordpress.com/osd.xml" title="BacPathGenomics" />
	<atom:link rel='hub' href='https://bacpathgenomics.wordpress.com/?pushpress=hub'/>
		<item>
		<title>SPAdes vs Velvet assembly comparison</title>
		<link>https://bacpathgenomics.wordpress.com/2013/05/25/spades-vs-velvet-assemby-comparison/</link>
		<comments>https://bacpathgenomics.wordpress.com/2013/05/25/spades-vs-velvet-assemby-comparison/#comments</comments>
		<pubDate>Sat, 25 May 2013 06:47:26 +0000</pubDate>
		<dc:creator>kat</dc:creator>
				<category><![CDATA[Assembly]]></category>
		<category><![CDATA[assembly software]]></category>

		<guid isPermaLink="false">http://bacpathgenomics.wordpress.com/?p=657</guid>
		<description><![CDATA[This is a guest post from Dave Savage, a post-doc working on resistance gene analysis at the University of Melbourne. We have been trying out SPAdes as a replacement for Velvet + VelvetOptimiser, which we routinely use for assembling bacterial genomes. The SPAdes paper, and the GAGE-B assembler comparison, show that SPAdes does better &#8230; <a href="https://bacpathgenomics.wordpress.com/2013/05/25/spades-vs-velvet-assemby-comparison/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>This is a guest post from Dave Savage, a post-doc working on resistance gene analysis at the University of Melbourne. We have been trying out SPAdes as a replacement for Velvet + VelvetOptimiser, which we routinely use for assembling bacterial genomes. The <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3342519/">SPAdes paper</a>, and the GAGE-B assembler comparison, show that SPAdes does better than most other assemblers on a small set of bacterial read sets, but we routinely assemble hundreds of genomes, so we needed to see whether this performance holds across large sets of reads with variable coverage. The results confirm that SPAdes consistently performs well, and better than Velvet. This, and the fact that SPAdes gets you from fastq reads to error-corrected contigs and scaffolds in a single command with no need for tuning parameters, makes it a clear winner for our purposes.</p>
<p><strong>Dave says:</strong></p>
<p>SPAdes (<a href="http://bioinf.spbau.ru/spades">http://bioinf.spbau.ru/spades</a>) is a new(ish) genome assembly program that is particularly suitable for assembly of bacterial genomes. SPAdes makes a number of improvements on the de Bruijn graph algorithms used by assemblers like Velvet, IDBA and SOAPdenovo, including iteration over values of kmer sizes, and incorporation of paired kmers (k-bimers), which allows information from paired end reads to be introduced into the computation at an earlier stage.</p>
<p>In their paper, the authors of SPAdes make comparisons with a number of popular assemblers using single-cell and multi-cell <i>E. coli</i> data sets. Their results show that SPAdes can in fact deliver better assemblies, with a higher N50 length and larger contigs. Since we regularly use Velvet for assembly, we wanted to perform some further comparisons between Velvet and SPAdes on the types of data sets we typically deal with. To this end we ran SPAdes and Velvet on 114 <i>Shigella sonnei</i> 54 bp paired end Illumina read sets (from our 2012 paper, reads available at ERP000182), and 67 unpaired 37 bp <i>Staphylococcus aureus</i> read sets (from Harris 2010, available at ERP000070). We then calculated the N50 length, the L50, the number of contigs &gt; 500 bp, and the total contig length produced by each assembler. The figures below show the distributions of each of these values for the two data sets.</p>
<div id="attachment_661" class="wp-caption alignnone" style="width: 610px"><a href="http://bacpathgenomics.files.wordpress.com/2013/05/ssonnei_assemblies_spades.png"><img class="size-full wp-image-661" alt="Dark = SPAdes; light = Velvet Optimizer" src="http://bacpathgenomics.files.wordpress.com/2013/05/ssonnei_assemblies_spades.png?w=600&#038;h=600" width="600" height="600" /></a><p class="wp-caption-text"><em>Shigella sonnei</em> assemblies. Dark = SPAdes; light = Velvet Optimizer</p></div>
<div id="attachment_660" class="wp-caption alignnone" style="width: 610px"><a href="http://bacpathgenomics.files.wordpress.com/2013/05/saureus_assemblies_spades.png"><img class="size-full wp-image-660" alt="Staphylococcus aureus assemblies. Dark = SPAdes; light = Velvet Optimizer." src="http://bacpathgenomics.files.wordpress.com/2013/05/saureus_assemblies_spades.png?w=600&#038;h=600" width="600" height="600" /></a><p class="wp-caption-text"><em>Staphylococcus aureus</em> assemblies. Dark = SPAdes; light = Velvet Optimizer.</p></div>
<p>There are a few things to notice about these plots. Firstly, there is quite a bit of difference between the <i>S. sonnei</i> and the <i>S. aureus</i> assembly statistics, although this is hardly surprising given that the <i>S. sonnei</i> reads are paired end, while the <i>S. aureus</i> reads are unpaired (and only 37 bp long). What’s apparent, though, is that SPAdes and Velvet perform similarly on the <i>S. aureus</i> reads but very differently on the <i>S. sonnei</i> reads, where SPAdes clearly outperforms Velvet. We can probably attribute this to the fact that SPAdes includes an improved algorithm for handling paired end reads.</p>
<p>For both species, SPAdes frequently results in a higher N50 than Velvet and a larger total contig length. For the paired end <i>S. sonnei</i> reads, SPAdes also results in a larger number of contigs greater than 500 bp in length. For the L50 metric, Velvet gives quite low values for the <i>S. sonnei</i> reads, with 10 or fewer contigs frequently accounting for at least half of the total contig length. However, given the larger total contig length obtained using SPAdes, the much higher L50 values obtained using SPAdes are not unexpected.</p>
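<p>For readers who have not met these metrics before, here is a minimal Python sketch of how N50 and L50 can be computed from a list of contig lengths. It is not the script used to produce the plots above; the function name and the example lengths are made up for illustration.</p>

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths in bp."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            # N50: length of the contig that takes the running sum to at
            # least half of the total; L50: how many contigs that took.
            return length, count
    raise ValueError("no contigs supplied")


# Hypothetical contig lengths, for illustration only.
example = [400, 300, 200, 100]
print(n50_l50(example))  # prints (300, 2)
```

<p>In the example the total is 1000 bp, and the two longest contigs (400 + 300) are the first to reach half of that, so the N50 is 300 bp and the L50 is 2. An assembly with a larger total length can therefore need more contigs to reach the halfway point.</p>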
<p>In our current assembly pipeline, we run VelvetOptimiser a number of times using smaller and smaller steps for k in order to home in on an optimal value. Using SPAdes, not only is this step incorporated into the assembly program, but the information from each iteration is combined to produce the final output. Thus SPAdes is able to capture the information obtained using small, highly sensitive kmer sizes, as well as large, highly specific kmer sizes, and at the same time simplify our assembly pipeline. We’re currently beginning a new project looking at the pan-genome of a number of bacterial species, and we anticipate that this project will involve quite a bit of assembly. Based on the results shown above, and the simpler pipeline for SPAdes, it looks like we’ll be using SPAdes to do the bulk of this work.</p>
<p><b>Run specifics:</b></p>
<ul>
<li>Velvet was run on the <i>S. sonnei</i> paired end reads using VelvetOptimiser with a minimum kmer size of 29 and a maximum kmer size of 89.</li>
<li>SPAdes was run on the <i>S. sonnei</i> paired end reads using the default settings.</li>
<li>Velvet was run on the <i>S. aureus</i> unpaired reads using VelvetOptimiser with a minimum kmer size of 15 and a maximum kmer size of 35.</li>
<li>SPAdes was run on the <i>S. aureus</i> unpaired reads using kmer sizes of 15, 23 and 35.</li>
</ul>
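<p>As a rough illustration of the two kinds of SPAdes run listed above, the Python sketch below only builds the command lines, it does not run anything. The file and directory names are hypothetical, and it assumes that the kmer sizes 15, 23 and 35 were used for the unpaired 37 bp reads. The <code>-1</code>, <code>-2</code>, <code>-s</code>, <code>-k</code> and <code>-o</code> options are those of current <code>spades.py</code> releases, so check <code>spades.py --help</code> for the version you have installed.</p>

```python
import shlex


def spades_command(out_dir, paired=None, single=None, kmers=None):
    """Build a spades.py argument list without running it."""
    cmd = ["spades.py", "-o", out_dir]
    if paired is not None:
        forward, reverse = paired
        cmd += ["-1", forward, "-2", reverse]
    if single is not None:
        cmd += ["-s", single]
    if kmers:
        # SPAdes takes its kmer sizes as one comma-separated list.
        cmd += ["-k", ",".join(str(k) for k in kmers)]
    return cmd


# Hypothetical file names: paired end reads with default settings,
# and unpaired reads with explicit kmer sizes.
paired_cmd = spades_command("sonnei_out", paired=("sonnei_1.fastq", "sonnei_2.fastq"))
single_cmd = spades_command("aureus_out", single="aureus.fastq", kmers=[15, 23, 35])
print(shlex.join(paired_cmd))
print(shlex.join(single_cmd))
```

<p>The resulting lists can be passed to <code>subprocess.run</code> once the paths point at real read files.</p>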
]]></content:encoded>
			<wfw:commentRss>https://bacpathgenomics.wordpress.com/2013/05/25/spades-vs-velvet-assemby-comparison/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="https://2.gravatar.com/avatar/55f343205bcab8b17cf94712997ed38c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">bacpathgenomics</media:title>
		</media:content>

		<media:content url="http://bacpathgenomics.files.wordpress.com/2013/05/ssonnei_assemblies_spades.png" medium="image">
			<media:title type="html">Dark = SPAdes; light = Velvet Optimizer</media:title>
		</media:content>

		<media:content url="http://bacpathgenomics.files.wordpress.com/2013/05/saureus_assemblies_spades.png" medium="image">
			<media:title type="html">Staphylococcus aureus assemblies. Dark = SPAdes; light = Velvet Optimizer.</media:title>
		</media:content>
	</item>
		<item>
		<title>Get coverage stats for a LOT of reference sequences</title>
		<link>https://bacpathgenomics.wordpress.com/2013/04/16/get-coverage-stats-for-a-lot-of-reference-sequences/</link>
		<comments>https://bacpathgenomics.wordpress.com/2013/04/16/get-coverage-stats-for-a-lot-of-reference-sequences/#comments</comments>
		<pubDate>Tue, 16 Apr 2013 03:48:18 +0000</pubDate>
		<dc:creator>kat</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://bacpathgenomics.wordpress.com/?p=653</guid>
		<description><![CDATA[Sometimes we need to map reads to a multi-fasta reference with lots of sequences, e.g. to screen for large sets of genes or plasmids. The mapping works fine, but it can be tricksy to get alignment statistics (such as % of reads mapped) broken down by reference sequence, using common tools like Samtools, Bamtools or &#8230; <a href="https://bacpathgenomics.wordpress.com/2013/04/16/get-coverage-stats-for-a-lot-of-reference-sequences/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Sometimes we need to map reads to a multi-fasta reference with lots of sequences, e.g. to screen for large sets of genes or plasmids. The mapping works fine, but it can be tricksy to get alignment statistics (such as % of reads mapped) broken down by reference sequence, using common tools like Samtools, Bamtools or Bamstats.</p>
<p>Today we came across a tool that can do the job: <a href="http://code.google.com/p/ea-utils/wiki/SamStats">sam-stats</a> from the <a href="http://code.google.com/p/ea-utils/">ea-utils</a> fastq processing package.</p>
<p>Getting the statistics is easy:</p>
<p>sam-stats -A -B aln.bam &gt; aln-stats.txt</p>
<p>(The &#8216;A&#8217; option turns on reporting for all &#8216;chromosomes&#8217;, whilst &#8216;B&#8217; tells the program that the input file is a BAM.)</p>
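<p>To make it concrete what &#8216;broken down by reference sequence&#8217; means, here is a small Python sketch that tallies alignment records per reference name from SAM-format text. This is not how sam-stats works internally and it is no substitute for it. It relies only on the SAM columns (FLAG is column 2, RNAME is column 3, and FLAG bit 0x4 marks an unmapped read), and the example records are invented.</p>

```python
from collections import Counter


def mapped_reads_per_reference(sam_lines):
    """Count alignment records per reference name in SAM text."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x4 or rname == "*":
            counts["unmapped"] += 1
        else:
            counts[rname] += 1
    return counts


# Hypothetical records, trimmed to the 11 mandatory SAM columns.
sam = [
    "@SQ\tSN:geneA\tLN:900",
    "r1\t0\tgeneA\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tgeneB\t5\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r4\t0\tgeneA\t9\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
counts = mapped_reads_per_reference(sam)
total = sum(counts.values())
for name, n in sorted(counts.items()):
    print(f"{name}\t{n}\t{100 * n / total:.1f}%")
```

<p>For a real BAM you could feed it the output of <code>samtools view -h aln.bam</code>, one line at a time. Note that it counts records, so any secondary or supplementary alignments are counted as well.</p>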
<p>The details of the output can be found on the <a href="http://code.google.com/p/ea-utils/wiki/SamStats">sam-stats wiki page</a>. There you can also find more options for statistics, as well as utilities for removing adapter sequences and de-multiplexing fastqs.</p>
]]></content:encoded>
			<wfw:commentRss>https://bacpathgenomics.wordpress.com/2013/04/16/get-coverage-stats-for-a-lot-of-reference-sequences/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="https://2.gravatar.com/avatar/55f343205bcab8b17cf94712997ed38c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">bacpathgenomics</media:title>
		</media:content>
	</item>
		<item>
		<title>Bacterial genomics tutorial</title>
		<link>https://bacpathgenomics.wordpress.com/2013/04/13/bacterial-genomics-tutorial/</link>
		<comments>https://bacpathgenomics.wordpress.com/2013/04/13/bacterial-genomics-tutorial/#comments</comments>
		<pubDate>Sat, 13 Apr 2013 06:45:05 +0000</pubDate>
		<dc:creator>kat</dc:creator>
				<category><![CDATA[Assembly]]></category>
		<category><![CDATA[Bacterial genomics]]></category>
		<category><![CDATA[German E. coli outbreak]]></category>
		<category><![CDATA[NGS]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[act]]></category>
		<category><![CDATA[assembly]]></category>
		<category><![CDATA[brig]]></category>
		<category><![CDATA[comparative analysis]]></category>
		<category><![CDATA[e. coli]]></category>
		<category><![CDATA[mauve]]></category>
		<category><![CDATA[velvet]]></category>

		<guid isPermaLink="false">http://bacpathgenomics.wordpress.com/?p=628</guid>
		<description><![CDATA[This is a shameless plug for an article and accompanying tutorial I&#8217;ve just published together with David Edwards, my excellent MSc Bioinformatics student from the University of Melbourne. It&#8217;s currently available as a PDF pre-pub from BMC Microbial Informatics and Experimentation, but the web version will be available soon. The accompanying tutorial is available here. The idea for &#8230; <a href="https://bacpathgenomics.wordpress.com/2013/04/13/bacterial-genomics-tutorial/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>This is a shameless plug for an article and accompanying tutorial I&#8217;ve just published together with David Edwards, my excellent MSc Bioinformatics student from the University of Melbourne. It&#8217;s currently available as a PDF pre-pub from <em>BMC Microbial Informatics and Experimentation</em>, but the web version will be available soon. The accompanying tutorial is available <a href="http://www.microbialinformaticsj.com/imedia/1627516282963968/supp1.pdf">here</a>.</p>
<p>The idea for this came from discussions at last year&#8217;s ASM <em>(Australian Society for Microbiology)</em> meeting, where it was highlighted that there was a lack of courses and tutorials available for biologists to learn the basics of genomic analysis so that they can make use of next gen sequencing. Michael Wise, a founding editor of <em>BMC Microbial Informatics and Experimentation</em> based at UWA in Perth, suggested the new journal would be an ideal home for such a tutorial&#8230; so here we are:</p>
<h1><a href="http://www.microbialinformaticsj.com/content/3/1/2/" target="_blank">Beginner&#8217;s guide to comparative bacterial genome analysis using next-generation sequence data</a></h1>
<p><a href="http://www.microbialinformaticsj.com/content/3/1/2/" target="_blank">http://www.microbialinformaticsj.com/content/3/1/2/</a></p>
<p>High throughput sequencing is now fast and cheap enough to be considered part of the toolbox for investigating bacteria, and there are thousands of bacterial genome sequences available for comparison in the public domain. Bacterial genome analysis is increasingly being performed by diverse groups in research, clinical and public health labs alike, who are interested in a wide array of topics related to bacterial genetics and evolution. Examples include outbreak analysis and the study of pathogenicity and antimicrobial resistance. In this beginner&#8217;s guide, we aim to provide an entry point for individuals with a biology background who want to perform their own bioinformatics analysis of bacterial genome data, to enable them to answer their own research questions. We assume readers will be familiar with genetics and the basic nature of sequence data, but do not assume any computer programming skills. The main topics covered are assembly, ordering of contigs, annotation, genome comparison and extracting common typing information. Each section includes worked examples using publicly available <em>E. coli</em> data and free software tools, all which can be performed on a desktop computer.</p>
<h2><span style="color:#333399;">Four great tools</span></h2>
<p>In the paper and tutorial, we introduce the four tools which we rely on most for basic analysis of bacterial genome assemblies: Velvet, ACT, Mauve and BRIG. All except ACT were developed as part of a PhD project, and have endured well beyond the original PhD to become well-known bioinformatics tools. New students take note!</p>
<p>In the paper, each tool is highlighted in its own figure, which includes some basic instructions. This is reproduced below, but is covered in much more detail in the tutorial that comes with the paper (link at the bottom).</p>
<p><strong>1. <em>Velvet</em> for genome assembly</strong></p>
<p>Possibly the most popular and widely used short read assembler, developed by the amazing Dan Zerbino during his PhD at EBI in Cambridge. Quite a PhD project!</p>
<p>[ <a href="http://www.ebi.ac.uk/~zerbino/velvet/">Download</a> | <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2336801/">Paper</a> | <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2952100/">Protocol</a> ]</p>
<p><a href="http://bacpathgenomics.files.wordpress.com/2013/04/figure1_velvet.png"><img class="alignnone size-full wp-image-635" alt="Figure1_Velvet" src="http://bacpathgenomics.files.wordpress.com/2013/04/figure1_velvet.png?w=600&#038;h=399" width="600" height="399" /></a></p>
<p>Reads are assembled into contigs using <i>Velvet</i> and <i>VelvetOptimiser</i> in two steps: (1) <i>velveth</i> converts reads to <i>k</i>-mers using a hash table, and (2) <i>velvetg</i> assembles overlapping <i>k</i>-mers into contigs via a de Bruijn graph. <i>VelvetOptimiser</i> can be used to automate the optimisation of parameters for <i>velveth</i> and <i>velvetg</i> and generate an optimal assembly. To generate an assembly of <i>E. coli</i> O104:H4 using the command-line tool <i>Velvet</i>:</p>
<p>• Download <i>Velvet</i> [23] (we used version 1.2.08 on Mac OS X, compiled with a maximum <i>k</i>-mer length of 101 bp)</p>
<p>• Download the paired-end Illumina reads for <i>E. coli</i> O104:H4 strain TY-2482 (ENA accession SRR292770)</p>
<p>• Convert the reads to <i>k</i>-mers using this command:</p>
<p>velveth out_data_35 35 -fastq.gz -shortPaired -separate SRR292770_1.fastq.gz SRR292770_2.fastq.gz</p>
<p>• Then, assemble overlapping <i>k</i>-mers into contigs using this command:</p>
<p>velvetg out_data_35 -clean yes -exp_cov 21 -cov_cutoff 2.81 -min_contig_lgth 200</p>
<p>This will produce a set of contigs in multifasta format for further analysis. See Additional file 1: Tutorial for further details, including help with downloading reads and using <i>VelvetOptimiser</i>.</p>
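<p>If the idea of <i>k</i>-mers and de Bruijn graphs is new to you, the toy Python sketch below may help. It splits reads into overlapping <i>k</i>-mers and links each (<i>k</i>-1)-mer prefix to the suffix that follows it. It is only an illustration of the concept, not Velvet&#8217;s actual implementation, which also handles reverse complements, coverage and error removal. The reads and function names are invented.</p>

```python
from collections import defaultdict


def kmers(read, k):
    """Return every overlapping k-mer in a read, left to right."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer prefix to the (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# Two hypothetical reads that overlap by the 4-mer GTAC.
reads = ["ACGTAC", "GTACGG"]
graph = de_bruijn_edges(reads, 4)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

<p>Walking along unbranched paths in such a graph is what yields contigs; a node with more than one outgoing edge (here ACG) marks a point where the assembler has to decide how to proceed.</p>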
<p><strong>2. <em>ACT</em> for pairwise genome comparison</strong></p>
<p>Part of the Sanger Institute&#8217;s Artemis suite of tools. Also look at Artemis (single genome viewer), DNA Plotter (which can draw circular diagrams of your genomes) and BAMView (which can display mapped reads overlaid on a reference genome); they are all available <a href="http://www.sanger.ac.uk/resources/software/act/">here</a>.</p>
<p>[ <a href="http://www.sanger.ac.uk/resources/software/act/">Download</a> | <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2606163/">Paper</a> | <a href="ftp://ftp.sanger.ac.uk/pub4/resources/software/act/act.pdf">Manual</a> ]</p>
<p><a href="http://bacpathgenomics.files.wordpress.com/2013/04/figure2_act.png"><img class="alignnone size-full wp-image-633" alt="Figure2_ACT" src="http://bacpathgenomics.files.wordpress.com/2013/04/figure2_act.png?w=600"   /></a></p>
<p><i>Artemis</i> and <i>ACT</i> are free, interactive genome browsers (we used <i>ACT</i> 11.0.0 on Mac OS X).</p>
<p>• Open the assembled <i>E. coli</i> O104:H4 contigs in <i>Artemis</i> and write out a single, concatenated sequence using File -&gt; Write -&gt; All Bases -&gt; FASTA Format.</p>
<p>• Generate a comparison file between the concatenated contigs and 2 alternative reference genomes using the website <a href="http://www.webact.org/"><i>WebACT</i></a>.</p>
<p>• Launch <i>ACT</i> and load in the reference sequences, contigs and comparison files, to get a 3-way comparison like the one shown here.</p>
<p>Here, the <i>E. coli</i> O104:H4 contigs are in the middle row, the enteroaggregative <i>E. coli </i>strain Ec55989 is on top and the enterohaemorrhagic <i>E. coli</i> strain EDL933 is below. Details of the comparison can be viewed by zooming in, to the level of genes or DNA bases.</p>
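<p>If you would rather script the concatenation step than use the <i>Artemis</i> menus, something like the Python sketch below would do a similar job. It simply joins all the sequences in a multi-FASTA in file order with no spacer between contigs, which may not match exactly what <i>Artemis</i> writes out. The record name and example contigs are hypothetical.</p>

```python
def concatenate_fasta(fasta_text, name="concatenated_contigs"):
    """Join all sequences of a multi-FASTA string into one FASTA record."""
    bases = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if line and not line.startswith(">"):
            bases.append(line)
    sequence = "".join(bases)
    # Wrap at 60 bases per line, a common FASTA convention.
    wrapped = [sequence[i:i + 60] for i in range(0, len(sequence), 60)]
    return "\n".join([">" + name] + wrapped) + "\n"


# Hypothetical two-contig assembly, for illustration only.
contigs = ">contig1\nACGT\nGGCC\n>contig2\nTTAA\n"
print(concatenate_fasta(contigs), end="")
```

<p>For a real assembly, read the contigs file into a string first and write the returned text to a new FASTA file.</p>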
<p><strong>3. <em>Mauve</em> for <strong>contig ordering and </strong>multiple genome comparison</strong></p>
<p>Developed by the wonderful Aaron Darling during his PhD; he is now Associate Professor at the University of Technology Sydney. Also see <a href="http://code.google.com/p/ngopt/wiki/How_To_Score_Genome_Assemblies_with_Mauve">Mauve Assembly Metrics</a>, an optional plugin for assessing assembly quality which was developed for the Assemblathon.</p>
<p>[ <a href="http://asap.ahabs.wisc.edu/mauve/">Download</a> | <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2892488/">Paper</a> | <a href="http://asap.ahabs.wisc.edu/mauve/mauve-user-guide/">User Guide</a> ]</p>
<p><a href="http://bacpathgenomics.files.wordpress.com/2013/04/fig3_mauve.jpeg"><img class="alignnone size-full wp-image-634" alt="Fig3_Mauve" src="http://bacpathgenomics.files.wordpress.com/2013/04/fig3_mauve.jpeg?w=600&#038;h=218" width="600" height="218" /></a></p>
<p>Mauve is a free alignment tool with an interactive browser for visualising results (we used Mauve 2.3.1 on Mac OS X).</p>
<p>• Launch Mauve and select File -&gt; Align with progressiveMauve</p>
<p>• Click ‘Add Sequence…’ to add your genome assembly (e.g. annotated <i>E. coli </i>O104:H4 contigs) and other reference genomes for comparison.</p>
<p>• Specify a file for output, then click ‘Align…’</p>
<p>• When the alignment is finished, a visualization of the genome blocks and their homology will be displayed, as shown here. <i>E. coli </i>O104:H4 is on the top, red lines indicate contig boundaries within the assembly. Sequences outside coloured blocks do not have homologs in the other genomes.</p>
<p><strong>4. <em>BRIG</em> (BLAST Ring Image Generator) for multiple genome comparison</strong></p>
<p>From Nabil-Fareed Alikhan at the University of Queensland, also as part of a graduate project, which I believe is still in progress&#8230;</p>
<p>[ <a href="http://brig.sourceforge.net/">Download</a> | <a href="ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/">Download BLAST</a> | <a href="http://www.biomedcentral.com/1471-2164/12/402">Paper</a> | <a href="http://brig.sourceforge.net/">Tutorial</a> ]</p>
<p><img class="alignnone size-full wp-image-636" alt="Fig4_BRIG" src="http://bacpathgenomics.files.wordpress.com/2013/04/fig4_brig.png?w=600&#038;h=539" width="600" height="539" /></p>
<p>BRIG is a free tool that requires a local installation of BLAST (we used BRIG 0.95 on Mac OS X). The output is a static image.</p>
<p>• Launch BRIG and set the reference sequence (EHEC EDL933 chromosome) and the location of other <i>E. coli</i> sequences for comparison. If you include reference sequences for the Stx2 phage and LEE pathogenicity island, it will be easy to see where these sequences are located.</p>
<p>• Click ‘Next’ and specify the sequence data and colour for each ring to be displayed in comparison to the reference.</p>
<p>• Click ‘Next’ and specify a title for the centre of the image and an output file, then click ‘Submit’ to run BRIG.</p>
<p>• BRIG will create an output file containing a circular image like the one shown here. It is easy to see that the Stx2 phage is present in the EHEC chromosomes (purple) and the outbreak genome (black), but not the EAEC or EPEC chromosomes.</p>
<h3><span style="color:#333399;">Tutorial</span></h3>
<p><strong>The tutorial accompanying the article</strong> is available <a href="http://www.microbialinformaticsj.com/imedia/1627516282963968/supp1.pdf">here</a>. To give you an idea of what&#8217;s covered, here is the table of contents:</p>
<p><strong>1. Genome assembly and annotation</strong></p>
<p>1.1 Downloading <i>E. coli</i> sequences for assembly</p>
<p>1.2 Examining quality of reads (FastQC)</p>
<p>1.3 Velvet – assembling reads into contigs</p>
<p>1.3.1 Using VelvetOptimiser to optimise <i>de novo</i> assembly with Velvet</p>
<p>1.4 Ordering contigs against a reference using Mauve</p>
<p>1.4.1 Viewing the ordered contigs (Mauve)</p>
<p>1.4.2 Viewing the ordered contigs (ACT)</p>
<p>1.5 Mauve Assembly Metrics – Statistical View of the Contigs</p>
<p>1.6 Annotation with RAST</p>
<p>1.6.1 Alternatives to RAST</p>
<p><strong>2. Comparative genome analysis</strong></p>
<p>2.1 Downloading <i>E. coli</i> genome sequences for comparative analysis</p>
<p>2.2 Mauve – for multiple genome alignment</p>
<p>2.3 ACT – for detailed pairwise genome comparisons</p>
<p>2.3.1 Generating comparison files for ACT</p>
<p>2.3.2 Viewing genome comparisons in ACT</p>
<p>2.4 BRIG – Visualizing reference-based comparisons of multiple sequences</p>
<p><strong>3. Typing and specialist tools</strong></p>
<p>3.1 PHAST – for identification of phage sequences</p>
<p>3.2 ResFinder – for identification of resistance gene sequences</p>
<p>3.3 Multilocus sequence typing</p>
<p>3.4 PATRIC – online genome comparison tool</p>
]]></content:encoded>
			<wfw:commentRss>https://bacpathgenomics.wordpress.com/2013/04/13/bacterial-genomics-tutorial/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="https://2.gravatar.com/avatar/55f343205bcab8b17cf94712997ed38c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">bacpathgenomics</media:title>
		</media:content>

		<media:content url="http://bacpathgenomics.files.wordpress.com/2013/04/figure1_velvet.png" medium="image">
			<media:title type="html">Figure1_Velvet</media:title>
		</media:content>

		<media:content url="http://bacpathgenomics.files.wordpress.com/2013/04/figure2_act.png" medium="image">
			<media:title type="html">Figure2_ACT</media:title>
		</media:content>

		<media:content url="http://bacpathgenomics.files.wordpress.com/2013/04/fig3_mauve.jpeg" medium="image">
			<media:title type="html">Fig3_Mauve</media:title>
		</media:content>

		<media:content url="http://bacpathgenomics.files.wordpress.com/2013/04/fig4_brig.png" medium="image">
			<media:title type="html">Fig4_BRIG</media:title>
		</media:content>
	</item>
		<item>
		<title>Creating Pubmed RSS feeds</title>
		<link>https://bacpathgenomics.wordpress.com/2013/02/26/creating-pubmed-rss-feeds/</link>
		<comments>https://bacpathgenomics.wordpress.com/2013/02/26/creating-pubmed-rss-feeds/#comments</comments>
		<pubDate>Tue, 26 Feb 2013 11:58:09 +0000</pubDate>
		<dc:creator>kat</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://bacpathgenomics.wordpress.com/?p=555</guid>
		<description><![CDATA[Here is a fantastic, if slightly geeky, way of getting PubMed search results fed to you without clogging up your mailbox. And it looks pretty too. http://pimpmyphd.blogspot.com.au/2012/03/how-not-to-miss-almost-any-article-in.html Unfortunately this seems to be the only post on the brilliantly named Pimp My PhD blog.]]></description>
				<content:encoded><![CDATA[<p>Here is a fantastic, if slightly geeky, way of getting PubMed search results fed to you without clogging up your mailbox. And it looks pretty too.</p>
<p><a href="http://pimpmyphd.blogspot.com.au/2012/03/how-not-to-miss-almost-any-article-in.html">http://pimpmyphd.blogspot.com.au/2012/03/how-not-to-miss-almost-any-article-in.html</a></p>
<p>Unfortunately this seems to be the only post on the brilliantly named Pimp My PhD blog.</p>
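<p>If you would rather skim a saved search from a script than from a feed reader, the idea can be sketched in a few lines of Python. This is only a minimal sketch: it assumes the feed is plain RSS 2.0, and the feed text, titles and links below are made up for illustration rather than taken from PubMed.</p>

```python
import xml.etree.ElementTree as ET


def list_feed_items(rss_text):
    """Return title, link and date for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "date": item.findtext("pubDate", default="").strip(),
        })
    return items


# Hypothetical feed text standing in for a saved PubMed search.
sample_feed = """<rss version="2.0"><channel>
<title>Hypothetical PubMed search</title>
<item><title>First paper</title><link>http://example.org/1</link><pubDate>Tue, 26 Feb 2013 11:58:09 +0000</pubDate></item>
<item><title>Second paper</title><link>http://example.org/2</link></item>
</channel></rss>"""

for entry in list_feed_items(sample_feed):
    print(entry["date"] or "(no date)", "|", entry["title"], "|", entry["link"])
```

<p>In practice you would download the feed text from the RSS address that your saved search gives you and pass it to <code>list_feed_items</code>.</p>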
]]></content:encoded>
			<wfw:commentRss>https://bacpathgenomics.wordpress.com/2013/02/26/creating-pubmed-rss-feeds/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="https://2.gravatar.com/avatar/55f343205bcab8b17cf94712997ed38c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">bacpathgenomics</media:title>
		</media:content>
	</item>
		<item>
		<title>Human Microbiome Analysis &#8211; PLoS Computational Biology eBook</title>
		<link>https://bacpathgenomics.wordpress.com/2013/02/23/human-microbiome-analysis-plos-computational-biology-ebook/</link>
		<comments>https://bacpathgenomics.wordpress.com/2013/02/23/human-microbiome-analysis-plos-computational-biology-ebook/#comments</comments>
		<pubDate>Sat, 23 Feb 2013 02:29:48 +0000</pubDate>
		<dc:creator>kat</dc:creator>
				<category><![CDATA[Metagenomics]]></category>
		<category><![CDATA[Microbiome]]></category>
		<category><![CDATA[16s]]></category>
		<category><![CDATA[book]]></category>
		<category><![CDATA[metagenomics]]></category>
		<category><![CDATA[microbiome]]></category>
		<category><![CDATA[plos]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[science]]></category>

		<guid isPermaLink="false">http://bacpathgenomics.wordpress.com/?p=598</guid>
		<description><![CDATA[PLoS has made its first foray into &#8216;book&#8217; publishing, with its new collection from PLOS Computational Biology called Translational Bioinformatics. I think this is a great idea and, as founding editor Phil Bourne points out, collections like this are a far better option for both readers and authors than the &#8216;traditional&#8217; science books made up of contributed chapters, which are &#8230; <a href="https://bacpathgenomics.wordpress.com/2013/02/23/human-microbiome-analysis-plos-computational-biology-ebook/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>PLoS has made its first foray into &#8216;book&#8217; publishing, with its new collection from PLOS Computational Biology called <a href="http://www.ploscollections.org/translationalbioinformatics">Translational Bioinformatics</a>.</p>
<p>I think this is a great idea and, as <a href="http://blogs.plos.org/biologue/2013/01/23/lets-make-those-book-chapters-open-too/">founding editor Phil Bourne points out</a>, collections like this are a far better option for both readers and authors than the &#8216;traditional&#8217; science books made up of contributed chapters, which are hard to access, expensive and rarely cited.</p>
<p>Of particular interest to us in the microbial world is <a href="http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002808">Chapter 12: Human Microbiome Analysis</a> by Xochitl C. Morgan and Curtis Huttenhower from Harvard.</p>
<p>Being a book-chapter-style document, it includes some really useful &#8216;extras&#8217;: questions to get you thinking about what you&#8217;ve learnt from the article (with answers in the supplementary material), a useful glossary and pointers for further reading.</p>
<p>Because it is PLoS, I can show you exactly what is in it:</p>
<div>
<h2>Abstract</h2>
<p>Humans are essentially sterile during gestation, but during and after birth, every body surface, including the skin, mouth, and gut, becomes host to an enormous variety of microbes, bacterial, archaeal, fungal, and viral. Under normal circumstances, these microbes help us to digest our food and to maintain our immune systems, but dysfunction of the human microbiota has been linked to conditions ranging from inflammatory bowel disease to antibiotic-resistant infections. Modern high-throughput sequencing and bioinformatic tools provide a powerful means of understanding the contribution of the human microbiome to health and its potential as a target for therapeutic interventions. This chapter will first discuss the historical origins of microbiome studies and methods for determining the ecological diversity of a microbial community. Next, it will introduce shotgun sequencing technologies such as metagenomics and metatranscriptomics, the computational challenges and methods associated with these data, and how they enable microbiome analysis. Finally, it will conclude with examples of the functional genomics of the human microbiome and its influences upon health and disease.</p>
</div>
<div>
<p><strong>Citation: </strong>Morgan XC, Huttenhower C (2012) Chapter 12: Human Microbiome Analysis. PLoS Comput Biol 8(12): e1002808. doi:10.1371/journal.pcbi.1002808</p>
<h3>What to Learn in This Chapter</h3>
<ul>
<li>An overview of the analysis of microbial communities</li>
<li>Understanding the human microbiome from phylogenetic and functional perspectives</li>
<li>Methods and tools for calculating taxonomic and phylogenetic diversity</li>
<li>Metagenomic assembly and pathway analysis</li>
<li>The impact of the microbiome on its host</li>
</ul>
</div>
<p><a href="http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002808#s5">1. Introduction</a><br />
<a href="http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002808#s6">2. A Brief History of Microbiome Studies</a><br />
<a href="http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002808#s7">3. Taxonomic Diversity</a><br />
<a href="http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002808#s8">4. Shotgun Sequencing and Metagenomics</a><br />
<a href="http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002808#s9">5. Computational Functional Metagenomics</a><br />
<a href="http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002808#s10">6. Host Interactions and Interventions</a><br />
<a href="http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002808#s11">7. Summary</a><br />
<a href="http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002808#s12">8. Exercises</a><br />
<a href="http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002808#s13">Supporting Information</a><br />
<a href="http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002808#ack">Acknowledgments</a><br />
<a href="http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002808#references">References</a></p>
<h3>Figures</h3>
<div id="attachment_601" class="wp-caption alignnone" style="width: 477px"><a href="http://www.ploscollections.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002808#pcbi-1002808-g001"><img class="size-full wp-image-601" alt="" src="http://bacpathgenomics.files.wordpress.com/2013/02/plosmicrobiomefig1_journal-pcbi-1002808-g001.png?w=600"   /></a><p class="wp-caption-text"><strong>Figure 1. Bioinformatic methods for functional metagenomics.</strong><br />Studies that aim to define the composition and function of uncultured microbial communities are often referred to collectively as “metagenomic,” although this refers more specifically to particular sequencing-based assays. First, community DNA is extracted from a sample, typically uncultured, containing multiple microbial members. The bacterial taxa present in the community are most frequently defined by amplifying the 16S rRNA gene and sequencing it. Highly similar sequences are grouped into Operational Taxonomic Units (OTUs), which can be compared to 16S databases such as Silva, Green Genes, and RDP to identify them as precisely as possible. The community can be described in terms of which OTUs are present, their relative abundance, and/or their phylogenetic relationships. An alternate method of identifying community taxa is to directly metagenomically sequence community DNA and compare it to reference genomes or gene catalogs. This is more expensive but provides improved taxonomic resolution and allows observation of single nucleotide polymorphisms (SNPs) and other variant sequences. The functional capabilities of the community can also be determined by comparing the sequences to functional databases (e.g. KEGG or SEED). This allows the community to be described as relative abundances of its genes and pathways. <br />doi:10.1371/journal.pcbi.1002808.g001</p></div>
<div id="attachment_612" class="wp-caption alignnone" style="width: 610px"><a href="http://bacpathgenomics.files.wordpress.com/2013/02/plosmicriobiomefig2_journal-pcbi-1002808-g002.jpg"><img class="size-full wp-image-612" alt="" src="http://bacpathgenomics.files.wordpress.com/2013/02/plosmicriobiomefig2_journal-pcbi-1002808-g002.jpg?w=600&#038;h=250" width="600" height="250" /></a><p class="wp-caption-text"><strong>Figure 2. Ecological representations of microbial communities: collector&#8217;s curves, alpha, and beta diversity.</strong><br />These examples describe the A) sequence counts and B) relative abundances of six taxa (A, B, C, D, E, and F) detected in three samples. C) A collector&#8217;s curve, typically generated using a richness estimator such as Chao1 or ACE, approximates the relationship between the number of sequences drawn from each sample and the number of taxa expected to be present based on detected abundances. D) Alpha diversity captures both the organismal richness of a sample and the evenness of the organisms&#8217; abundance distribution. Here, alpha diversity is defined by the Shannon index, <em>H&#8242;</em> = &#8722;&#8721; <em>p<sub>i</sub></em> ln(<em>p<sub>i</sub></em>), where <em>p<sub>i</sub></em> is the relative abundance of taxon <em>i</em>, although many other alpha diversity indices may be employed. E) Beta diversity represents the similarity (or difference) in organismal composition between samples. In this example, it can be simplistically defined by the equation &#946; = (<em>n</em><sub>1</sub> &#8722; <em>c</em>) + (<em>n</em><sub>2</sub> &#8722; <em>c</em>), where <em>n</em><sub>1</sub> and <em>n</em><sub>2</sub> are the number of taxa in samples 1 and 2, respectively, and <em>c</em> is the number of shared taxa, but again many metrics such as Bray-Curtis or UniFrac are commonly employed.<br />doi:10.1371/journal.pcbi.1002808.g002</p></div>
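<p>To make the two diversity measures in the Figure 2 caption concrete, here is a minimal Python sketch. It assumes the standard Shannon index, minus the sum of p_i ln(p_i), and takes the simple shared-taxa beta diversity to be (n1 - c) + (n2 - c); the sample counts are invented for illustration and are not taken from the chapter.</p>

```python
import math


def shannon_index(counts):
    """Shannon index: minus the sum of p_i * ln(p_i) over taxa with non-zero counts."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no sequences")
    h = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total  # relative abundance of this taxon
            h -= p * math.log(p)
    return h


def simple_beta(sample1, sample2):
    """Number of taxa found in only one of the two samples: (n1 - c) + (n2 - c)."""
    taxa1 = {taxon for taxon, n in sample1.items() if n > 0}
    taxa2 = {taxon for taxon, n in sample2.items() if n > 0}
    shared = len(taxa1 & taxa2)  # c, the number of shared taxa
    return (len(taxa1) - shared) + (len(taxa2) - shared)


# Made-up sequence counts per taxon for two samples.
sample_1 = {"A": 10, "B": 10, "C": 10, "D": 10}  # four taxa, perfectly even
sample_2 = {"A": 37, "B": 1, "E": 1, "F": 1}     # four taxa, dominated by A

print("Shannon, sample 1:", round(shannon_index(sample_1), 3))
print("Shannon, sample 2:", round(shannon_index(sample_2), 3))
print("Simple beta diversity:", simple_beta(sample_1, sample_2))
```

<p>Both samples contain four taxa, but the even one scores a higher Shannon index, which is the richness-plus-evenness point the caption makes.</p>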
<p>Happy reading!</p>
]]></content:encoded>
			<wfw:commentRss>https://bacpathgenomics.wordpress.com/2013/02/23/human-microbiome-analysis-plos-computational-biology-ebook/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="https://2.gravatar.com/avatar/55f343205bcab8b17cf94712997ed38c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">bacpathgenomics</media:title>
		</media:content>

		<media:content url="http://bacpathgenomics.files.wordpress.com/2013/02/plosmicrobiomefig1_journal-pcbi-1002808-g001.png" medium="image" />

		<media:content url="http://bacpathgenomics.files.wordpress.com/2013/02/plosmicriobiomefig2_journal-pcbi-1002808-g002.jpg" medium="image" />
	</item>
		<item>
		<title>Colouring in country maps</title>
		<link>https://bacpathgenomics.wordpress.com/2013/02/01/colouring-maps/</link>
		<comments>https://bacpathgenomics.wordpress.com/2013/02/01/colouring-maps/#comments</comments>
		<pubDate>Fri, 01 Feb 2013 09:58:16 +0000</pubDate>
		<dc:creator>kat</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://bacpathgenomics.wordpress.com/?p=549</guid>
		<description><![CDATA[If there&#8217;s one thing I love more than a good tree it&#8217;s a good map. (And how wonderful it is when they combine!) In genomic epidemiology, maps are often super important for visualising where bacterial genomic data has come from and how bugs are moving about. Sometimes I think this is just me being a &#8230; <a href="https://bacpathgenomics.wordpress.com/2013/02/01/colouring-maps/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>If there&#8217;s one thing I love more than a good tree it&#8217;s a good map. (And how wonderful it is when they <a href="#phylogeo">combine</a>!) In genomic epidemiology, maps are often super important for visualising where bacterial genomic data has come from and how bugs are moving about.</p>
<p>Sometimes I think this is just me being a map-o-phile, but when we didn&#8217;t include a map in our recent paper on the <a href="http://www.ncbi.nlm.nih.gov/pubmed/22863732">dissemination of <em>Shigella sonnei</em></a>, the maps ended up being created anyway by the authors of two commentary articles! (Commentaries <a href="http://www.ncbi.nlm.nih.gov/pubmed/22932498">here</a> and <a href="http://www.ncbi.nlm.nih.gov/pubmed/22907165">here</a>.)</p>
<p>So, I often find I need to indicate on a global map which countries we have sampled in a study. Here are two ways to do it:</p>
<p>(1) <a href="http://www.29travels.com/geochart/">GeoChart Map Generator</a></p>
<p>This is a web tool where you just click on the names of the countries you want to colour and they are filled in on the map. The advantage is that it&#8217;s quick and easy, and it will produce an image good enough to screenshot for a PowerPoint slide or web page. The disadvantage is that the resolution isn&#8217;t really good enough for printing, and definitely not for a figure in a paper.</p>
<p>Here&#8217;s an example:</p>
<p><a href="http://bacpathgenomics.files.wordpress.com/2013/01/mapgenerator.png"><img class="alignnone size-full wp-image-553" alt="mapgenerator" src="http://bacpathgenomics.files.wordpress.com/2013/01/mapgenerator.png?w=600"   /></a></p>
<p>Similar apps are available on other sites, including <a href="http://www.ammap.com/visited_countries/">http://www.ammap.com/visited_countries/</a>.</p>
<p>(2) Edit an SVG map in Adobe Illustrator or a similar graphics editor. You can download suitable maps from Wikimedia Commons - <a href="http://commons.wikimedia.org/wiki/Category:Blank_SVG_maps_of_the_world">http://commons.wikimedia.org/wiki/Category:Blank_SVG_maps_of_the_world</a></p>
<p>This will give you a publication-quality figure, which you can save straight to PDF. To select the right countries to colour in, you&#8217;ll need to know where they are on the map (!). It can be handy to use one of the web tools first to help locate all the countries if your geography isn&#8217;t great.</p>
<p>Here&#8217;s an example, showing the countries covered in the <em>Shigella sonnei</em> study:</p>
<p><a href="http://bacpathgenomics.files.wordpress.com/2013/02/shigella_map_countriesred.png"><img class="alignnone size-full wp-image-578" alt="Shigella_Map_countriesRed" src="http://bacpathgenomics.files.wordpress.com/2013/02/shigella_map_countriesred.png?w=600"   /></a></p>
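<p>Because an SVG map is just XML, the colouring step can also be scripted instead of done by hand. The sketch below is one possible approach, not a tested recipe for any particular map: it assumes each country is a path or group whose <code>id</code> is a country code (check the file you download, as conventions vary), and the tiny inline SVG is a made-up stand-in for a real world map.</p>

```python
import xml.etree.ElementTree as ET

# Keep the SVG namespace as the default one when writing the file back out.
ET.register_namespace("", "http://www.w3.org/2000/svg")


def _add_fill(elem, fill):
    """Append a fill to any existing inline style, so the new colour takes effect."""
    existing = elem.get("style", "").strip().rstrip(";")
    elem.set("style", (existing + ";" if existing else "") + "fill:" + fill)


def colour_countries(svg_text, country_ids, fill="#cc0000"):
    """Colour every element whose id is in country_ids, plus its children.

    Returns the modified SVG text and a sorted list of ids that were not found.
    """
    root = ET.fromstring(svg_text)
    wanted = set(country_ids)
    found = set()
    for elem in root.iter():
        if elem.get("id") in wanted:
            found.add(elem.get("id"))
            # A country may be a group of several shapes, so colour them all.
            for shape in elem.iter():
                _add_fill(shape, fill)
    return ET.tostring(root, encoding="unicode"), sorted(wanted - found)


# Hypothetical three-"country" map standing in for a real blank world map.
demo_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 30 10">'
    '<path id="vn" d="M0 0h10v10H0z"/>'
    '<g id="fr"><path d="M10 0h10v10H10z"/></g>'
    '<path id="br" d="M20 0h10v10H20z"/>'
    '</svg>'
)

coloured_svg, missing = colour_countries(demo_svg, ["vn", "fr", "xx"])
print("Not found in the map:", missing)
```

<p>For a real figure you would read the downloaded SVG file into a string, pass it through <code>colour_countries</code>, write the result to a new file and open that in your graphics editor for the final touches. The list of ids that were not found is a handy check on typos in your country list.</p>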
<p><a id="phylogeo">For</a> some ideas on combining trees with maps, i.e. phylogeography, see <a href="http://www.kuleuven.ac.be/aidslab/phylogeography/home.html">this site</a> for building tree maps for BEAST analyses using SPREAD, or try <a href="http://kiwi.cs.dal.ca/GenGIS/Main_Page">GenGIS</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://bacpathgenomics.wordpress.com/2013/02/01/colouring-maps/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="https://2.gravatar.com/avatar/55f343205bcab8b17cf94712997ed38c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">bacpathgenomics</media:title>
		</media:content>

		<media:content url="http://bacpathgenomics.files.wordpress.com/2013/01/mapgenerator.png" medium="image">
			<media:title type="html">mapgenerator</media:title>
		</media:content>

		<media:content url="http://bacpathgenomics.files.wordpress.com/2013/02/shigella_map_countriesred.png" medium="image">
			<media:title type="html">Shigella_Map_countriesRed</media:title>
		</media:content>
	</item>
		<item>
		<title>Eurosurveillance special issues on molecular epidemiology of human pathogens</title>
		<link>https://bacpathgenomics.wordpress.com/2013/01/30/eurosurveillance-special-issues-on-molecular-epidemiology-of-human-pathogens/</link>
		<comments>https://bacpathgenomics.wordpress.com/2013/01/30/eurosurveillance-special-issues-on-molecular-epidemiology-of-human-pathogens/#comments</comments>
		<pubDate>Wed, 30 Jan 2013 03:37:09 +0000</pubDate>
		<dc:creator>kat</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://bacpathgenomics.wordpress.com/?p=556</guid>
		<description><![CDATA[January 2013 brings two special issues of Eurosurveillance with some great articles, including some visions of how whole genome bacterial sequencing does and will fit in with public health labs. All articles are free and open access. The tables of contents are below; note that you can download the whole first issue as a single PDF. &#8230; <a href="https://bacpathgenomics.wordpress.com/2013/01/30/eurosurveillance-special-issues-on-molecular-epidemiology-of-human-pathogens/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><span style="color:#993300;"><strong>January 2013 brings two special issues of Eurosurveillance with some great articles, including some visions of how whole genome bacterial sequencing does and will fit in with public health labs. All articles are free and open access. The tables of contents are below; note that you can download the whole first issue as a <a href="http://www.eurosurveillance.org/images/dynamic/EE/V18N03/V18N03.pdf"><span style="color:#993300;">single PDF</span></a>.</strong></span></p>
<p><strong><a href="http://www.eurosurveillance.org/Public/Articles/Archives.aspx?PublicationId=11708">Part I: Eurosurveillance, Volume 18, Issue 3, 17 January 2013</a></strong></p>
<p>Download whole issue as PDF: <a href="http://www.eurosurveillance.org/images/dynamic/EE/V18N03/V18N03.pdf">http://www.eurosurveillance.org/images/dynamic/EE/V18N03/V18N03.pdf</a></p>
<div>Table of Contents</div>
<hr />
<div>MISCELLANEOUS</div>
<hr />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20374">A note from the editors: molecular epidemiology of human pathogens – current use and future prospects</a></div>
<div>by Eurosurveillance editorial team</div>
<hr />
<div>RESEARCH ARTICLES</div>
<hr />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20369">Within-patient emergence of the influenza A(H1N1)pdm09 HA1 222G variant and clear association with severe disease, Norway</a></div>
<div>by R Rykkvin, A Kilander, SG Dudman, O Hungnes</div>
<br />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20358">Molecular epidemiological typing within the European Gonococcal Antimicrobial Resistance Surveillance Programme reveals predominance of a multidrug-resistant clone</a></div>
<div>by SA Chisholm, M Unemo, N Quaye, E Johansson, MJ Cole, CA Ison, MJ Van de Laar</div>
<br />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20367">A snapshot of genetic lineages of Mycobacterium tuberculosis in Ireland over a two-year period, 2010 and 2011</a></div>
<div>by MM Fitzgibbon, N Gibbons, E Roycroft, S Jackson, J O’Donnell, D O’Flanagan, TR Rogers</div>
<hr />
<div>SURVEILLANCE AND OUTBREAK REPORTS</div>
<hr />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20355">Imported diphyllobothriasis in Switzerland: molecular methods to define a clinical case of Diphyllobothrium infection as Diphyllobothrium dendriticum, August 2010</a></div>
<div>by F de Marval, B Gottstein, M Weber, B Wicht</div>
<hr />
<div>PERSPECTIVES</div>
<hr />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20365">Molecular-based surveillance of campylobacteriosis in New Zealand – from source attribution to genomic epidemiology</a></div>
<div>by P Muellner, E Pleydell, R Pirie, MG Baker, D Campbell, PE Carter, NP French</div>
<hr />
<div>NEWS</div>
<hr />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20357">ECDC starts pilot phase for collection of molecular typing data</a></div>
<div>by I van Walle</div>
<br />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20359">Call for applications for EPIET and EUPHEM fellows</a></div>
<div>by Eurosurveillance editorial team</div>
<br />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20373">New issue of EpiNorth journal is online</a></div>
<div>by Eurosurveillance editorial team</div>
<p><strong><a href="http://www.eurosurveillance.org/Public/Articles/Archives.aspx?PublicationId=11708">Part II: Eurosurveillance, Volume 18, Issue 4, 24 January 2013</a></strong></p>
<hr />
<div>EDITORIALS</div>
<hr />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20386">From molecular to genomic epidemiology: transforming surveillance and control of infectious diseases</a></div>
<div>by MJ Struelens, S Brisse</div>
<hr />
<div>EUROROUNDUPS</div>
<hr />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20385">Use of multilocus variable-number tandem repeat analysis (MLVA) in eight European countries, 2012</a></div>
<div>by BA Lindstedt, M Torpdahl, G Vergnaud, S Le Hello, FX Weill, E Tietze, B Malorny, DM Prendergast, E Ní Ghallchóir, RF Lista, LM Schouls, R Söderlund, S Börjesson, S Åkerström</div>
<hr />
<div>REVIEW ARTICLES</div>
<hr />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20380">Overview of molecular typing methods for outbreak detection and epidemiological surveillance</a></div>
<div>by AJ Sabat, A Budimir, D Nashev, R Sá-Leão, JM van Dijl, F Laurent, H Grundmann, AW Friedrich, on behalf of the ESCMID Study Group of Epidemiological Markers (ESGEM)</div>
<br />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20382">Bioinformatics in bacterial molecular epidemiology and public health: databases, tools and the next-generation sequencing revolution</a></div>
<div>by JA Carriço, AJ Sabat, AW Friedrich, M Ramirez, on behalf of the ESCMID Study Group for Epidemiological Markers (ESGEM)</div>
<br />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20379">Automated extraction of typing information for bacterial pathogens from whole genome sequence data: Neisseria meningitidis as an exemplar</a></div>
<div>by KA Jolley, MC Maiden</div>
<hr />
<div>SURVEILLANCE AND OUTBREAK REPORTS</div>
<hr />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20387">Laboratory-based surveillance in the molecular era: the TYPENED model, a joint data-sharing platform for clinical and public health laboratories</a></div>
<div>by HG Niesters, JW Rossen, H van der Avoort, D Baas, K Benschop, EC Claas, A Kroneman, N van Maarseveen, S Pas, W van Pelt, JC Rahamat-Langendoen, R Schuurman, H Vennema, L Verhoef, K Wolthers, M Koopmans</div>
<hr />
<div>RESEARCH ARTICLES</div>
<hr />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20381">Current application and future perspectives of molecular typing methods to study Clostridium difficile infections</a></div>
<div>by CW Knetsch, TD Lawley, MP Hensgens, J Corver, MW Wilcox, EJ Kuijper</div>
<hr />
<div>PERSPECTIVES</div>
<hr />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20383">From Theory to Practice: Molecular Strain Typing for the Clinical and Public Health Setting</a></div>
<div>by RV Goering, R Köck, H Grundmann, G Werner, AW Friedrich, on behalf of the ESCMID Study Group for Epidemiological Markers (ESGEM)</div>
<br />
<div><a href="http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=20384">The need for ethical reflection on the use of molecular microbial characterisation in outbreak management</a></div>
<div>by B Rump, C Cornelis, F Woonink, M Verweij</div>
]]></content:encoded>
			<wfw:commentRss>https://bacpathgenomics.wordpress.com/2013/01/30/eurosurveillance-special-issues-on-molecular-epidemiology-of-human-pathogens/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="https://2.gravatar.com/avatar/55f343205bcab8b17cf94712997ed38c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">bacpathgenomics</media:title>
		</media:content>
	</item>
		<item>
		<title>Metagenomics analysis special issue in Briefings in Bioinformatics</title>
		<link>https://bacpathgenomics.wordpress.com/2012/11/23/metagenomics-analysis-special-issue-in-briefings-in-bioinformatics/</link>
		<comments>https://bacpathgenomics.wordpress.com/2012/11/23/metagenomics-analysis-special-issue-in-briefings-in-bioinformatics/#comments</comments>
		<pubDate>Fri, 23 Nov 2012 00:06:52 +0000</pubDate>
		<dc:creator>kat</dc:creator>
				<category><![CDATA[Metagenomics]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[metagenomics]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://bacpathgenomics.wordpress.com/?p=543</guid>
		<description><![CDATA[This month&#8217;s Briefings in Bioinformatics issue is devoted to &#8220;Bioinformatics approaches and tools for metagenomics analysis&#8221;. Yay! I love special issues, especially when they bring together lots of tools to tackle a set of similar problems. My only gripe is that only 5/11 (45%) articles in the issue are open access. Luckily the best article &#8230; <a href="https://bacpathgenomics.wordpress.com/2012/11/23/metagenomics-analysis-special-issue-in-briefings-in-bioinformatics/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>This month&#8217;s Briefings in Bioinformatics issue is devoted to &#8220;<a href="http://bib.oxfordjournals.org/content/13/6.toc">Bioinformatics approaches and tools for metagenomics analysis</a>&#8221;. Yay! I love special issues, especially when they bring together lots of tools to tackle a set of similar problems.</p>
<p>My only gripe is that only 5/11 (45%) articles in the issue are open access <img src='https://s0.wp.com/wp-includes/images/smilies/icon_sad.gif' alt=':(' class='wp-smiley' /> </p>
<p>Luckily the best article is among the open access ones &#8211; a fantastic review of metagenomic studies, from experimental design and sampling right through to data analysis and submission to public archives, written by Hanno Teeling and Frank Glöckner from the Max Planck Institute for Marine Microbiology. Full text is online <a href="http://bib.oxfordjournals.org/content/13/6/728.full">here</a> or as a <a href="http://bib.oxfordjournals.org/content/13/6/728.full.pdf+html">PDF</a>.</p>
<p>Most of the <a href="http://bib.oxfordjournals.org/content/13/6.toc">other articles</a> cover new tools for churning through your metagenomic sequence data and figuring out what is in there in terms of function and/or taxonomy. There are many approaches to this and several tools already out there including the very beautiful <a href="http://metagenomics.anl.gov/">MG-RAST</a> and <a href="http://edwards.sdsu.edu/rtmg/">Real Time Metagenomics</a>. I have also been tinkering with these to explore the &#8220;pan-genomes&#8221; of various bacterial species where we have hundreds of genomes available&#8230; not quite what they were intended for but it seems to work quite nicely, and gives you some great insights into the spectrum of accessory genes that are flowing through various bacterial populations.</p>
]]></content:encoded>
			<wfw:commentRss>https://bacpathgenomics.wordpress.com/2012/11/23/metagenomics-analysis-special-issue-in-briefings-in-bioinformatics/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="https://2.gravatar.com/avatar/55f343205bcab8b17cf94712997ed38c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">bacpathgenomics</media:title>
		</media:content>
	</item>
		<item>
		<title>Phage annotation with PHAST</title>
		<link>https://bacpathgenomics.wordpress.com/2012/09/08/phage-annotation-with-phast/</link>
		<comments>https://bacpathgenomics.wordpress.com/2012/09/08/phage-annotation-with-phast/#comments</comments>
		<pubDate>Fri, 07 Sep 2012 15:29:04 +0000</pubDate>
		<dc:creator>kat</dc:creator>
				<category><![CDATA[Bacterial genomics]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[annotation]]></category>
		<category><![CDATA[bacteria]]></category>
		<category><![CDATA[phage]]></category>

		<guid isPermaLink="false">http://bacpathgenomics.wordpress.com/?p=532</guid>
		<description><![CDATA[Just a quick post to say how much I love PHAST, the PHAge Search Tool. It looks for possible prophages in your bacterial genomes, and makes such beautiful pictures of the results, like this summary of the five phage it found in a new Salmonella genome: It also draws nice circular diagrams to show you &#8230; <a href="https://bacpathgenomics.wordpress.com/2012/09/08/phage-annotation-with-phast/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Just a quick post to say how much I love <a href="http://phast.wishartlab.com">PHAST</a>, the PHAge Search Tool.</p>
<p>It looks for possible prophages in your bacterial genomes, and makes such beautiful pictures of the results, like this summary of the five phage it found in a new Salmonella genome:</p>
<p><a href="http://bacpathgenomics.files.wordpress.com/2012/09/phast_1922_image.png"><img class="alignnone size-full wp-image-533" title="PHAST_1922_image" src="http://bacpathgenomics.files.wordpress.com/2012/09/phast_1922_image.png?w=600" alt=""   /></a><br />
It also draws nice circular diagrams to show you where the phage are located, like this:</p>
<p><a href="http://bacpathgenomics.files.wordpress.com/2012/09/phast_1922_circular.png"><img class="alignnone size-large wp-image-534" title="PHAST_1922_circular" src="http://bacpathgenomics.files.wordpress.com/2012/09/phast_1922_circular.png?w=921&#038;h=1024" alt="" width="921" height="1024" /></a></p>
<p>And it will even show you a nicely annotated figure of each individual phage it found, using an interactive Flash viewer:</p>
<p><a href="http://bacpathgenomics.files.wordpress.com/2012/09/phast_1922_image_detailed_1_blog.png"><img class="alignnone size-full wp-image-535" title="PHAST_1922_image_detailed_1_blog" src="http://bacpathgenomics.files.wordpress.com/2012/09/phast_1922_image_detailed_1_blog.png?w=600" alt=""   /></a></p>
<p>My only gripe is that, unlike some of the more visualization-challenged phage finders, PHAST doesn&#8217;t output actual annotation files such as GenBank or GFF, or even a simple text table that would be straightforward to convert into GenBank. The file that reports where each phage is located in your sequence seems to use a home-grown text format that is not easy to parse with existing tools.</p>
<p>Oh well, I suppose I will have to write a little script to turn PHAST&#8217;s phage hunt results into a proper annotation&#8230; unless someone else has already done this?</p>
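<p>As a starting point, here is a minimal sketch of the second half of such a script: writing phage regions out as GFF3 once their coordinates have been pulled out of the PHAST results. It does not attempt to parse PHAST&#8217;s own output. The function name, the (start, end, label) input, the <code>mobile_genetic_element</code> feature type and the example coordinates are all assumptions made for illustration.</p>

```python
from urllib.parse import quote


def regions_to_gff3(seqid, regions, source="PHAST"):
    """Return GFF3 text for a list of (start, end, label) prophage regions.

    Coordinates are assumed to be 1-based and inclusive, as GFF3 expects.
    """
    lines = ["##gff-version 3"]
    for number, (start, end, label) in enumerate(regions, 1):
        # GFF3 requires start to be no greater than end, so order the pair.
        low, high = min(start, end), max(start, end)
        # Percent-encode characters such as ';' and '=' that are reserved in column 9.
        attributes = "ID=prophage_{0};Name={1}".format(number, quote(label, safe=" "))
        # Columns: seqid, source, type, start, end, score, strand, phase, attributes.
        lines.append("\t".join([seqid, source, "mobile_genetic_element",
                                str(low), str(high), ".", ".", ".", attributes]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up coordinates, purely for illustration.
    print(regions_to_gff3("contig1", [(1200, 5000, "region 1")]), end="")
```

<p>Strand is left as &#8220;.&#8221; because a whole prophage region has no single orientation. The resulting file should load alongside the genome sequence in any viewer that reads GFF3.</p>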
]]></content:encoded>
			<wfw:commentRss>https://bacpathgenomics.wordpress.com/2012/09/08/phage-annotation-with-phast/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="https://2.gravatar.com/avatar/55f343205bcab8b17cf94712997ed38c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">bacpathgenomics</media:title>
		</media:content>

		<media:content url="http://bacpathgenomics.files.wordpress.com/2012/09/phast_1922_image.png" medium="image">
			<media:title type="html">PHAST_1922_image</media:title>
		</media:content>

		<media:content url="http://bacpathgenomics.files.wordpress.com/2012/09/phast_1922_circular.png?w=921" medium="image">
			<media:title type="html">PHAST_1922_circular</media:title>
		</media:content>

		<media:content url="http://bacpathgenomics.files.wordpress.com/2012/09/phast_1922_image_detailed_1_blog.png" medium="image">
			<media:title type="html">PHAST_1922_image_detailed_1_blog</media:title>
		</media:content>
	</item>
		<item>
		<title>MLST from short read data</title>
		<link>https://bacpathgenomics.wordpress.com/2012/08/25/mlst-from-short-read-data/</link>
		<comments>https://bacpathgenomics.wordpress.com/2012/08/25/mlst-from-short-read-data/#comments</comments>
		<pubDate>Sat, 25 Aug 2012 05:05:07 +0000</pubDate>
		<dc:creator>kat</dc:creator>
				<category><![CDATA[Bacterial genomics]]></category>
		<category><![CDATA[NGS]]></category>
		<category><![CDATA[Tools]]></category>
		<category><![CDATA[genomics]]></category>
		<category><![CDATA[mlst]]></category>
		<category><![CDATA[scripts]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://bacpathgenomics.wordpress.com/?p=526</guid>
		<description><![CDATA[Our paper on a mapping-based approach to extracting MLST data from Illumina short reads was recently published in BMC Genomics. We used read mapping because this has greater sensitivity than approaches which rely on assembly, especially for low-coverage data sets of genomes with extreme GC content or other sequencing issues. The approach is called SRST &#8230; <a href="https://bacpathgenomics.wordpress.com/2012/08/25/mlst-from-short-read-data/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Our paper on a mapping-based approach to extracting MLST data from Illumina short reads was <a href="http://www.biomedcentral.com/1471-2164/13/338/abstract" target="_blank">recently published in <em>BMC Genomics</em></a>. We used read mapping because this has greater sensitivity than approaches which rely on assembly, especially for low-coverage data sets of genomes with extreme GC content or other sequencing issues. The approach is called SRST (short read sequence typing), and code and usage instructions are available from <a title="SRST" href="http://sourceforge.net/projects/srst/files/" target="_blank">srst.sourceforge.net</a>.</p>
<p>However, it is obviously useful to be able to extract MLST info from genome assemblies too. For example, many finished or WGS genome sequences in NCBI do not have ST information attached to them, or it is hard to find. Also, for 454 and perhaps Ion Torrent data, it can be easier to deal with homopolymer issues at the assembly level by using newbler/gsAssembler and then working with contigs.</p>
<p>There is a web service available that is designed to do this, i.e. you can upload your genomes and choose an MLST scheme, and it will return the ST. It is described in <a href="http://www.ncbi.nlm.nih.gov/pubmed/22238442" target="_blank">this paper</a> and available at this <a href="http://www.cbs.dtu.dk/services/MLST" target="_blank">URL</a>. Unfortunately, I have never been able to get the website to load in any of my web browsers, so I&#8217;ve not been able to try it. Also, it is a pain to have to upload large amounts of data over the web, and this becomes completely infeasible when dealing with lots of genomes. So instead I use a simple script to extract MLST info via blast, which runs locally on my laptop or cluster.</p>
<p>The script and a short readme containing usage instructions are available at: <a href="http://sourceforge.net/projects/srst/files/mlstBLAST/" target="_blank">http://sourceforge.net/projects/srst/files/mlstBLAST/</a></p>
<p>I&#8217;m sure many people have written in-house scripts for this same task, but a few people have asked for mine recently and I figure it might save some others reinventing the wheel. The script simply uses BioPython to run a set of nucleotide blast searches in order to assign STs to genome assemblies. The inputs are just the latest set of allele sequences and profiles for the MLST scheme, and whatever genome assemblies you wish to determine STs for. The script will then determine the ST for each input genome, and if an exact match can&#8217;t be found, it will try to figure out the closest matching alleles and ST.</p>
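<p>To make the last step concrete, here is a minimal sketch of the profile lookup, assuming the allele number at each locus has already been called (for example from the best blast hit). This is not the code in the mlstBLAST script, just an illustration of the idea. The function names, locus names and ST numbers are hypothetical.</p>

```python
def mismatched_loci(profile, alleles):
    """Return the sorted loci at which the called alleles differ from a profile."""
    return sorted(locus for locus in profile if alleles.get(locus) != profile[locus])


def assign_st(profiles, alleles):
    """Return (st, mismatches) for the profile closest to the called alleles.

    profiles maps each ST to a dict of locus name to allele number.
    alleles maps locus name to the allele number called for one genome.
    An empty mismatch list means an exact match; ties go to the lowest ST.
    """
    if not profiles:
        raise ValueError("no profiles supplied")
    closest = min(profiles,
                  key=lambda st: (len(mismatched_loci(profiles[st], alleles)), st))
    return closest, mismatched_loci(profiles[closest], alleles)


if __name__ == "__main__":
    # Hypothetical three-locus scheme with two STs, purely for illustration.
    profiles = {
        1: {"locusA": 1, "locusB": 1, "locusC": 1},
        2: {"locusA": 2, "locusB": 1, "locusC": 3},
    }
    print(assign_st(profiles, {"locusA": 2, "locusB": 1, "locusC": 3}))
    print(assign_st(profiles, {"locusA": 2, "locusB": 1, "locusC": 7}))
```

<p>Returning the list of mismatched loci alongside the closest ST makes it easy to tell an exact match from a single-locus variant or something more distant.</p>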
<p>Happy sequence typing!</p>
]]></content:encoded>
			<wfw:commentRss>https://bacpathgenomics.wordpress.com/2012/08/25/mlst-from-short-read-data/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="https://2.gravatar.com/avatar/55f343205bcab8b17cf94712997ed38c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">bacpathgenomics</media:title>
		</media:content>
	</item>
	</channel>
</rss>
