STEC/EHEC outbreak – horizontally transferred genes

In the German outbreak bacteria, as in most E. coli, plenty of horizontal transfer has gone on to create the genome we are now looking at.

I’ve done about all I’m going to on this analysis, at least until some more complete data is released, but I did generate a summary plot and had a quick look at the origins of the stx, ter and other acquired genes.

This is a quick look at what the outbreak strain’s genome looks like:

Previously known sequences that are present in the outbreak genome

What is this showing us? Firstly, as established by others’ work mapping reads and contigs to the available E. coli reference genome sequences, the chromosome of the outbreak strain is most similar to strain Ec55989, an enteroaggregative E. coli (EAEC) isolated in Africa over a decade ago [central circle in figure]. It shares with this strain part of the EAEC plasmid [55989p, top right] carrying the aggregative adhesion operon aat, the regulator aggR and some other bits, but it has a different aggregative adhesion fimbrial complement (AAF/I) from Ec55989. It has also acquired the stx2 phage carrying the Shiga toxin 2 genes stx2A and stx2B [top left]; a plasmid sharing high similarity with the IncI plasmid pEC_Bactec, including the blaCTX-M and blaTEM-1 beta-lactamase (antibiotic resistance) genes [bottom left]; and a lot of sequence similar to plasmid pCVM29188_101 from Salmonella enterica Kentucky [bottom left].

The circles represent the sequences of the plasmids and phage (previously sequenced and deposited in GenBank) that are most similar to sequences in the novel strain. The green rings indicate which parts of these reference sequences are also present in the novel German strain (via BLAST comparison with TY2482/MIRA contigs). So nearly all of the Ec55989 chromosome and the pEC_Bactec plasmid are present, and not quite all of the other phage and plasmid sequences.
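
To put a rough number on how much of a reference sits under a green ring, one option is to merge the BLAST hit intervals on that reference and divide by its length. This is only a sketch of the idea, not the exact procedure used for the figure: the function names, the example coordinates and the reference length below are all made up for illustration.

```python
def merge_intervals(intervals):
    """Merge overlapping or directly adjacent 1-based inclusive (start, end) intervals."""
    merged = []
    # Normalise reverse-strand hits (start > end) before sorting by start position.
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def covered_fraction(intervals, ref_length):
    """Fraction of the reference covered by at least one hit."""
    covered = sum(end - start + 1 for start, end in merge_intervals(intervals))
    return covered / ref_length


# Hypothetical hit coordinates on a hypothetical 1000 bp reference.
hits = [(1, 400), (350, 600), (900, 800)]
print(merge_intervals(hits))
print(covered_fraction(hits, 1000))
```

Merging first matters because contigs often overlap on the reference, and simply summing hit lengths would count those bases twice.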

There is a further 300-500 kbp of sequence that doesn’t match any of these 5 reference sequences, but we can get a feel for these by searching deeper in the GenBank database via BLAST, and using the wonderful annotation provided by ERA7. [Annotation for just these contigs here.] I haven’t had a chance to look through these properly yet, but of course there is the tellurium resistance operon ter, which we expect because phenotypically the strain was noted as tellurium resistant some time ago.

The origin of the Shiga toxin phage is interesting. The toxin genes themselves (subunits A & B) are 100% identical at the nucleotide level to other stx2 toxins in NCBI; see the alignment here, showing precisely identical reference sequences. I mapped contigs (TY2482, MIRA assembly) to the VT2 phage to identify those that are likely to be part of the acquired phage. Using these sequences to search NCBI (nr, blastn), the closest match was to Stx2 phage I (accession AP004402, 100% identity across 81%). Obviously the phage acquired by the German strain is a bit different, though, because the whole of Stx2 phage I is not present (approx. 20% missing, top left in figure above).

The tellurium resistance genes are also quite similar to those seen before in a variety of E. coli. I used the ERA7 annotation to identify contigs carrying the ter operon, and did a BLASTN search in NCBI for matches to these contigs. I aligned them properly with Muscle, made a bio-NJ tree and used the ‘Consensus’ function in Dendroscope (LSA tree) to combine the trees into a consensus tree. The result shows the ter operon is very similar to that found in other EHEC, especially O157:H7:
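
For anyone wondering what a neighbour-joining method starts from: it works on a matrix of pairwise distances computed from the alignment. Below is a minimal sketch of the simplest such distance (the proportion of differing sites, skipping gapped columns). It is not what Muscle or the bio-NJ software do internally, and the strain labels and sequences are toy values invented for the example.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of mismatching sites between two aligned sequences, ignoring gap columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment (equal length)")
    compared = mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


def distance_matrix(aligned):
    """Pairwise p-distances keyed by (name_1, name_2), with names in sorted order."""
    return {
        (x, y): p_distance(aligned[x], aligned[y])
        for x, y in combinations(sorted(aligned), 2)
    }


# Toy alignment with hypothetical labels, for illustration only.
toy_alignment = {
    "strain_A": "ATGCATGCAT",
    "strain_B": "ATGCATGCAT",
    "strain_C": "ATGAATG-AT",
}
for pair, dist in distance_matrix(toy_alignment).items():
    print(pair, round(dist, 3))
```

Real tree-building tools usually apply a substitution model on top of raw differences, so treat this as the idea rather than the method.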

Consensus tree for ter operon (German outbreak strains highlighted)

Finally, I had a look at one contig that I noticed wasn’t present in Ec55989 but had homology to the E. coli O157:H7 Sakai chromosome. It is contig husec41_c1441, containing a probable transporter protein and two other genes of unknown function. Interestingly, a BLAST search of NCBI showed this sequence is usually chromosomally encoded, and was most similar to genes in Shigella flexneri and Shigella boydii, which cause bacterial dysentery [alignment of BLAST hits; tree drawn with FigTree this time]. So this is just a hint that there are still plenty of novel and potentially important genes to be discovered in this genome!

TY2482/MIRA assembled, contig 1441

9 thoughts on “STEC/EHEC outbreak – horizontally transferred genes”

    • Hi, that’s a good question.
      No, I can’t see any stx genes in the Ec55989 genome (blastn search of Ec55989 chromosome and 55989p plasmid using stx2a and stx2b genes).
      The Ec55989 genome was described in Touchon 2009 as causing diarrhea:

      Enteroaggregative E. coli strain 55989 was originally isolated from the diarrheagenic stools of an HIV-positive adult suffering from persistent watery diarrhea in Central African Republic

      The paper you mention (Mossoro 2002), which Touchon gives as the reference for Ec55989, discusses a whole set of E. coli strains from this location, including 8 that were HUS and EHEC of which 7 had stx2. The strains aren’t given names in Mossoro 2002, but since it is described as causing diarrhea and not HUS/EHEC we can probably assume that Ec55989 was not one of the stx2-bearing HUS/EHEC and so we wouldn’t expect to see the stx2 genes.
      Kat

  1. Hi Kat,

    I’m not completely sure but it seems that the link to the alignment of BLAST hits of the contig husec41_c1441 points to a fasta file (not to a blast result)

    I’d like to take a look at such blast results

    thanks!

  2. So, was this new E. coli strain created in a lab? A biowarfare attack just before Merkel’s visit, when there’s so much at stake economically?

    • No, there is basically zero chance of this. Bacteria do this all the time; they’ve been sharing genes for millennia and we are only just catching up and figuring out how they do it!
      As all the analysis shows (mine & others, collated here), all the genes present in this strain have been seen in E. coli before, and in fact there is an E. coli strain from Germany in 2001 which has all the same features. This is just a case of a bacterium that is usually seen in animals spilling over into the human food chain and causing some havoc.
      Unfortunate but not that unusual.

    • Hi Dave,
      I used DNAplotter, which comes with Artemis (free & Java based, http://www.sanger.ac.uk/resources/software/artemis/).
      I just converted the results of the blast searches into an EMBL file (with regions of >95% identity as features), loaded the reference genome and this EMBL file in Artemis, and selected Open In DNAPlotter.
      There are other tools too, e.g. BRIG.
      I’ll put up a bit of a walk-through of these analyses when I have time, I realise I’ve been a bit cursory in my explanations so far!
      Kat
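
In the meantime, here is a rough sketch of the conversion step described above, turning BLAST hits into EMBL-style feature lines. It assumes the hits were saved in BLAST's default 12-column tabular format (-outfmt 6) with the contigs as query and the reference as subject; the function name, the misc_feature key and the example rows are my own illustrative choices, not the exact script used for the figure.

```python
def blast_tab_to_embl_features(tab_text, min_identity=95.0):
    """Convert BLAST tabular hits into EMBL feature-table lines.

    Assumed columns (default -outfmt 6): qseqid sseqid pident length mismatch
    gapopen qstart qend sstart send evalue bitscore. The reference is assumed
    to be the subject, so sstart/send give the feature location.
    """
    lines = []
    for row in tab_text.splitlines():
        if not row.strip() or row.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = row.split("\t")
        query, pident = cols[0], float(cols[2])
        sstart, send = int(cols[8]), int(cols[9])
        if pident <= min_identity:
            continue  # keep only hits above the identity cutoff
        start, end = sorted((sstart, send))
        location = f"{start}..{end}"
        if sstart > send:
            location = f"complement({location})"  # hit is on the reverse strand
        # Feature key starts in column 6, location and qualifiers in column 22.
        lines.append(f"FT   {'misc_feature':<16}{location}")
        lines.append(f"FT   {'':<16}/note=\"{query} {pident:.1f}% identity\"")
    return "\n".join(lines) + "\n"


# Made-up example rows, for illustration only.
example = (
    "contig_1\tref\t99.20\t500\t4\t0\t1\t500\t1001\t1500\t0.0\t900\n"
    "contig_2\tref\t90.00\t300\t30\t0\t1\t300\t2000\t2299\t1e-50\t300\n"
    "contig_3\tref\t97.50\t200\t5\t0\t1\t200\t3200\t3001\t1e-90\t350\n"
)
print(blast_tab_to_embl_features(example), end="")
```

The output can be saved to a file and read into Artemis as an extra entry on top of the reference sequence.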

