Updated Summary: The clear differences so far between the German outbreak strains and the similar EAEC genome Ec55989 are:
- Stx2 phage, which aligns to the VT2 Sakai phage (details below)
- IncI resistance plasmid including blaTEM and blaCTX-M, similar to those found in other E. coli and Shigella
- An aggregative adherence fimbrial cluster (aggABCD) whose sequence was published in 1994 (U12894, ref in pubmed central), mobilised by IS and present on a plasmid (details below)
Aggregative adherence fimbrial cluster
This was published in 1994 by Savarino et al (U12894, ref in pubmed central) from an enteroaggregative E. coli (EAEC or EAggEC). Interestingly, there is no other DNA sequence in NCBI (nr database) that maps to this operon, which is described as encoding aggregative adherence fimbriae I (AAF/I). The sequenced EAEC strains contain plasmids bearing AAF/II and AAF/III fimbriae but not AAF/I (it seems there is usually only 1). Some detailed analysis below, more to come.
Could this be a suitable PCR target? The operon does not seem to have been reported in STEC or EHEC before, so might be a way to discriminate between the outbreak strain and other similar organisms. Nico Petty agrees…what do others think?
It could be a reason behind enhanced transmission (better adhesion to vegetables and other surfaces) and/or enhanced virulence or longer infection times (better adhesion to human cells).
UPDATE: More clues as to the insertion of the agg operon (U12894, ref in pubmed central) in the German outbreak strains. Basically, it is flanked by IS elements, which presumably help it to mobilise as a unit, and this mobile element has been inserted into a plasmid similar to pCVM29188_101 from Salmonella Kentucky, which has been acquired by the STEC strain.
Firstly, the complete operon is definitely in there and probably intact, see mapping of BGI reads to the agg operon (U12894):
(top row = pileup using Artemis/BamView; reads were mapped using bwa with default parameters…probably not the greatest for Ion Torrent data but I haven’t used this kind of data before so this is just a first attempt!)
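To back up "complete and probably intact" with a number rather than a picture, one option is to compute per-base depth across the reference and list any gaps. The following is a minimal sketch, not the pipeline used for the figure: it assumes the bwa alignments have already been reduced to 0-based, half-open (start, end) intervals on U12894, and the function names `coverage_depth` and `uncovered_regions` are hypothetical.

```python
def coverage_depth(ref_length, alignments):
    """Return per-base depth for 0-based, half-open (start, end) intervals."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (ref_length + 1)
    for start, end in alignments:
        start = max(start, 0)
        end = min(end, ref_length)  # clip reads that overhang the reference
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    depth = []
    running = 0
    for delta in diff[:ref_length]:
        running += delta
        depth.append(running)
    return depth


def uncovered_regions(depth, min_depth=1):
    """Return (start, end) half-open regions where depth is below min_depth."""
    regions = []
    region_start = None
    for pos, d in enumerate(depth):
        if d < min_depth:
            if region_start is None:
                region_start = pos
        elif region_start is not None:
            regions.append((region_start, pos))
            region_start = None
    if region_start is not None:
        regions.append((region_start, len(depth)))
    return regions
```

An empty list from `uncovered_regions` would support the claim that the whole operon is present, although it says nothing about homopolymer errors within the covered bases.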
In the BGI assembly of TY2482, it is clear that the agg operon is present, probably intact (although homopolymer errors appear to be messing up the aggC gene) and flanked by IS elements:
Even more telling in terms of mechanisms of horizontal transfer, in the LB2226692 genome, contig 26 contains the complete aggD gene and N-terminal of aggC at one end and the rest of the contig maps to several plasmids including:
Mapping the LB2226692 assembly to the plasmid pCVM29188_101, we get 102 kb of similar sequence, including our contig 26 carrying the aggC and aggD genes and IS (contig highlighted in green) [ordered contigs and ACT comparison here]:
So the consensus is emerging that the German outbreak strains are an enteroaggregative E. coli (EAEC, similar to Ec55989 causing diarrhea in children in Africa), which has acquired a Shiga toxin phage. Best evidence for this is David Studholme’s comparison of the novel genomes to all available E. coli genomes, which shows that Ec55989 is the most similar by gene content (sharing 96% of the genes from the outbreak strain), coupled with Konrad Paszkiewicz’s phylogenetic tree which shows that, within these genes, the sequences from the outbreak strains and Ec55989 are near-identical at the DNA level.
Nico Petty and I have been looking at what the novel genes are, and found what other people are reporting: that the EAEC has acquired the Shiga toxin phage (ordered contigs here):
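For readers wondering what "most similar by gene content" means in practice, here is a rough sketch of the idea. It is not David Studholme's actual method: it assumes each genome has already been reduced to a set of shared gene or ortholog identifiers, and the function names are hypothetical.

```python
def shared_gene_fraction(query_genes, reference_genes):
    """Fraction of the query genome's genes that are also in the reference."""
    query = set(query_genes)
    if not query:
        raise ValueError("query genome has no genes")
    return len(query & set(reference_genes)) / len(query)


def rank_references(query_genes, references):
    """Rank reference genomes by the fraction of query genes they share.

    `references` maps a genome name to an iterable of gene identifiers.
    Returns (name, fraction) pairs, most similar first.
    """
    scores = {
        name: shared_gene_fraction(query_genes, genes)
        for name, genes in references.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Note that the fraction is taken relative to the outbreak strain's genes, matching the "96% of the genes from the outbreak strain" wording; a symmetric measure such as Jaccard similarity would give a different number.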
We could find no sign of the LEE pathogenicity island (associated with enterohemorrhagic E. coli), so at this stage it does seem that it is mainly a matter of acquiring the Shiga toxin via phage, coupled with the acquisition of one or two plasmids encoding drug resistance which are common in E. coli and other Enterobacteriaceae (see this post). We also note the acquisition (well, compared to Ec55989) of the ter operon (tellurium resistance; contig husec41_c1243), yeh gene cluster (contig husec41_c1441) and a cluster of unknown genes (contig husec41_c750) which are also present in the O157:H7 Sakai chromosome. There is also a microcin operon (contig husec41_c72) which is highly similar to that found in another enteroaggregative E. coli, O42. So far I could find no real differences in gene content between the two outbreak strain genomes, TY2482 and LB2226692, and very few SNPs, as one might expect given rapid transmission.
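As an illustration of what counting SNPs between the two assemblies involves, here is a minimal sketch. It assumes the two sequences have already been aligned to equal length (gaps written as "-"), skips gap and ambiguous "N" positions rather than calling them, and uses a hypothetical function name; it is not the procedure used for the comparison above.

```python
def count_snps(seq_a, seq_b, skip="-N"):
    """List (position, base_a, base_b) for mismatches in two aligned sequences.

    Positions are 0-based alignment columns. Columns where either sequence
    has a gap or an ambiguous base (characters in `skip`) are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    for pos, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if a in skip or b in skip:
            continue
        if a != b:
            snps.append((pos, a, b))
    return snps
```

With draft assemblies from Ion Torrent data, mismatches next to homopolymer runs are more likely to be sequencing errors than real SNPs, so any raw count from a sketch like this would need filtering.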
It will be interesting to see what more can be found as the assemblies of the strains are improved with additional data. While the analysis so far suggests that this is a classic case of E. coli sharing genes via various mechanisms of horizontal transfer (i.e. bacteria doing what bacteria do), it will be very interesting to tease out the subtleties of the virulence genes and how they interplay to result in this particularly virulent bug.
One thing that still remains is the question of whether, and how, this strain sticks to vegetables, which appears to be a significant factor in its successful transmission. Being an aggregative E. coli, this strain (like its sister strain Ec55989) carries the Ag-43 gene which is involved in biofilm formation and autoaggregation, which may turn out to be relevant. In addition, there are a few fimbrial genes annotated (in both the BG7 annotation and Torsten Seemann’s annotation of Nick Loman’s MIRA assembly of BGI’s TY2482 data) but they are currently each in their own contigs, so it’s not really possible to get an idea of the genetic context in which they were transferred. However, they do all appear to map to the same aggABCD operon (accession ECU12894), associated with plasmid-borne aggregative adherence to HEp-2 cells in EAEC (but not present in either the Ec55989 or the O42 EAEC genome) (contigs here):