Two new assemblies of the German E. coli outbreak strains were released today, one from BGI (452 scaffolds/contigs; Illumina HiSeq paired end, 500 bp inserts) and one from HPA (13 scaffolds, 454 mate pair). In the HPA assembly, the resistance genes for streptomycin, trimethoprim, sulfamethoxazole and mercury (some of which are carried by a Tn21 transposon and IntI1 integron) sit in the same scaffold (scaffold 2) as sequence matching the Ec55989 chromosome. The picture below shows the mapping of scaffold 2 to the chromosome in blue, the IncI plasmid in green (which carries the blaTEM and blaCTX-M genes) and the resistance genes of pAKU_1 in red (a plasmid from Paratyphi A, used as a reference here because I have worked with it previously and am familiar with interpreting it).
What you can see is that most of the scaffold maps to the chromosome, but the red resistance genes are also present (near the 100 kbp marker). Upstream of the resistance genes, in this scaffold at least, there are some additional regions of low similarity with the Ec55989 chromosome, consistent with a whole stretch of DNA being inserted into the chromosome. Very little of this has any homology to the IncI plasmid which we know is present in the strain, consistent with the idea that these resistance genes are not present on the IncI plasmid. This whole region is conserved in the BGI genome, shown in purple.
This could all be a scaffolding and assembly error, but looking at the mate pair reads should confirm or deny this.
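One rough way to do that mate pair check is to count how many read pairs have their two ends mapped to neighbouring contigs in the scaffold. The sketch below assumes the reads have already been mapped and reduced to (contig of end 1, contig of end 2) tuples. The contig names, the toy data and the function name count_linking_pairs are all hypothetical, not taken from either assembly.

```python
from collections import Counter


def count_linking_pairs(pair_mappings, scaffold_order):
    """Count mate pairs whose two ends map to adjacent contigs in the scaffold.

    pair_mappings: iterable of (contig_of_end1, contig_of_end2) tuples.
    scaffold_order: list of contig names in the order the scaffold places them.
    Returns a dict keyed by (left_contig, right_contig) for each adjacent join.
    """
    links = Counter()
    for contig_a, contig_b in pair_mappings:
        # Pairs with both ends in the same contig say nothing about joins.
        if contig_a != contig_b:
            links[frozenset((contig_a, contig_b))] += 1
    return {
        (left, right): links[frozenset((left, right))]
        for left, right in zip(scaffold_order, scaffold_order[1:])
    }


# Made-up mappings, for illustration only.
pairs = [
    ("contig_A", "contig_B"),
    ("contig_B", "contig_A"),
    ("contig_B", "contig_C"),
    ("contig_A", "contig_A"),
    ("contig_C", "contig_D"),
    ("contig_C", "contig_D"),
    ("contig_D", "contig_C"),
]
order = ["contig_A", "contig_B", "contig_C", "contig_D"]
support = count_linking_pairs(pairs, order)
for join, n in support.items():
    print(join, n)
```

A join supported by few or no pairs would be the place to suspect a scaffolding error. A real check would also need to consider read orientation and the expected insert size, which this sketch ignores.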
Update: A closer look at the region suggests the scaffold may be correct. Below is a mapping of the new scaffold (second line) against Ec55989 (top line) and E. coli S88 (bottom line), showing the site of the possible insertion. The novel sequence is inserted into a tRNA sequence, which is typical of many integrase-mediated insertions. On the right of the insertion is an integrase gene with 100% identity to the sai integrase of Shigella flexneri 2002017. Part of the tRNA is duplicated on the left-hand side of the insertion, again typical of a real integrase-mediated insertion.
So what exactly is in the insertion? To the left is a stretch of sequence with homology to pathogenicity island genes from several E. coli and Shigella genomes, including E. coli S88, E. coli 042, E. coli SE15 and the Shigella flexneri 2a SRL pathogenicity island. The homology to S88 is shown in the figure. This region contains a protein with an autotransporter domain, annotated in some genomes as flu or Ag43 and in others as an aidA-like adhesin, so it is probably associated with adhesion.
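The duplicated tRNA fragment can be looked for directly by taking a window of sequence from each end of the insertion and finding the longest stretch they share. The following is a minimal sketch using a plain dynamic-programming longest common substring. The sequences are invented toy strings, not real tRNA sequence, and longest_shared_segment is a hypothetical helper name.

```python
def longest_shared_segment(left_window, right_window):
    """Return the longest exact substring shared by the two windows."""
    best_len = 0
    best_end = 0  # end position (exclusive) of the best match in left_window
    prev = [0] * (len(right_window) + 1)
    for i in range(1, len(left_window) + 1):
        curr = [0] * (len(right_window) + 1)
        for j in range(1, len(right_window) + 1):
            if left_window[i - 1] == right_window[j - 1]:
                # Extend the match that ended at the previous base of each window.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return left_window[best_end - best_len:best_end]


# Invented sequences: the same 11-base segment sits in both windows.
repeat = "GGTTCGAATCC"
left_window = "ATATAT" + repeat + "CACACA"    # window at the left end of the insertion
right_window = "TGTGTG" + repeat + "AGAGAG"   # window at the right end of the insertion
shared = longest_shared_segment(left_window, right_window)
print(shared, len(shared))
```

This only finds exact matches. Real direct repeats can carry mismatches, so an alignment tool is the better choice for anything beyond a quick look.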
Here is a phylogenetic tree showing the closest matching proteins (by NCBI blastp):
And the closest matching DNA sequences (by NCBI blastn):
Acquired multidrug resistance
The rest of the insertion contains small hypothetical genes of unknown function, plus several common mobile elements associated with drug resistance.
Immediately adjacent to the “pathogenicity island” sequences described above (and present in the same contig) are part of tniAdelta (part of Tn21), a pecM-like permease and two tetracycline resistance genes (tetA, tetR):
The next contig contains a mercury resistance operon usually found in Tn21:
And the next contig contains a sequence with strA, strB (resistance to streptomycin) and sul2 (resistance to sulfonamides)… and then a new contig containing part of a Tn21-like transposon including tnpA, tnpM, tnpR and an IntI1-like integron, which appears to have two resistance genes in the cassette (sul1 and a dihydrofolate reductase, I think it is A7). Next door in the same contig is an IS1 transposase and then the sai integrase, and then we are back to the tRNA-Sec sequence and chromosomal genes:
So given the sequence context, it is likely that the scaffold is correct in grouping these contigs together in this order, as it looks like a common and plausible gene order, with a possible mechanism for mobilisation. I’m used to seeing these resistance genes in plasmids, so to convince myself they can also be integrated into the chromosome I had a quick look for similar integrations in other E. coli. Here is one with a very similar set of resistance elements, even in the same order, inserted into the chromosome of EAEC E. coli 042 (although in this case it is not associated with the adhesion element mentioned above).
Data: My manual annotation of this region in the HPA assembly is available here. It can be loaded into Artemis as an entry on top of the HPA scaffold. ACT comparisons on request but you can easily make your own using WebACT.