Chlamydia genomics

A wide variety of genome sequences is available for strains of the chlamydiae. To find the current list of publicly available chlamydial genome sequences, search the NCBI genome analysis website using “chlamydia” or “chlamydophila” as keywords. Collaborative efforts between our Oregon State University and University of Washington laboratories have made genome sequencing a routine and relatively inexpensive tool for the investigation of recent clinical C. trachomatis isolates. Our work in genome sequence analysis began with the use of the Joint Genome Institute (JGI) sequencing facility to generate complete sequences of six clinical strains. Three of these strains were of serovar G, two were of serovar E, and one was a unique nonfusogenic serovar D strain [3]. Although the technology was current at the time and the quality of work from individuals at the JGI was excellent, this first venture into genomics was time consuming and relatively expensive. After these sequences were generated, the Oregon State University Center for Genome Research and Biocomputing (CGRB) purchased an Illumina 1G large-scale sequencing platform, which changed the nature of our approach to genomics. After the raw data are generated, software programs including VCAKE [1] and MAQ, together with the MacVector software package, are used to assemble and compare the contigs. We currently can produce complete genome sequences approximately 3 weeks after purification of genomic DNA for under $1,000 per strain, and we expect the cost and time required per genome to continue to decrease. Physical maps of genomes (i.e., the contribution of regions of a recombinant genome by individual parents) can be completed in one day following generation of primary unassembled sequence data. Chlamydial genomes are particularly well suited to this approach: the genomes are small (~1 megabase pair), there is very limited repeat sequence, and excellent reference strains are available publicly and/or are accumulating in our genome sequence database.
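
The idea of a physical map of a recombinant genome, as described above, can be illustrated with a minimal Python sketch. This is not the pipeline used in our laboratories; it assumes the recombinant and both parental sequences have already been aligned to the same coordinates, and the function names (assign_parents, parental_blocks) are hypothetical. Each position where the two parents differ is assigned to the parent that the recombinant matches, and runs of consecutive assignments are collapsed into blocks.

```python
from itertools import groupby


def assign_parents(recombinant, parent_a, parent_b):
    """Label each informative position with the parent it matches.

    Assumption: all three sequences are pre-aligned and equal in length.
    Positions where the parents agree, or where the recombinant matches
    neither parent, carry no information and are skipped.
    """
    if not (len(recombinant) == len(parent_a) == len(parent_b)):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for pos, (r, a, b) in enumerate(zip(recombinant, parent_a, parent_b)):
        if a == b:
            continue  # parents identical here: not informative
        if r == a:
            calls.append((pos, "A"))
        elif r == b:
            calls.append((pos, "B"))
    return calls


def parental_blocks(calls):
    """Collapse consecutive same-parent calls into (label, first, last) blocks."""
    blocks = []
    for label, group in groupby(calls, key=lambda call: call[1]):
        positions = [pos for pos, _ in group]
        blocks.append((label, positions[0], positions[-1]))
    return blocks


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    parent_a = "AAAAAAAAAA"
    parent_b = "CACACACACA"
    recombinant = "AAAACACACA"
    print(parental_blocks(assign_parents(recombinant, parent_a, parent_b)))
```

In this toy example the first block of informative positions is assigned to parent A and the rest to parent B, which places the apparent crossover between the last A position and the first B position. Real data would also need handling for sequencing errors and alignment gaps, which this sketch ignores.
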
While the original genomes sequenced through our CGRB were assembled with Illumina reads of 32 base pairs, the Illumina technology has improved, and current reads are up to 72 base pairs in length. Additionally, the Illumina technology now incorporates a “paired-end” function, in which individual sequence reads are generated on both ends of a particular DNA fragment. This allows sequence contigs to be generated using physically linked oligonucleotide sequences. The combination of longer sequence reads and the paired-end function of the Illumina apparatus has greatly simplified our assembly of genomes. We continue to explore techniques to reduce costs and turnaround time for genome sequence analysis.

As of February 2010, we have completed 16 genomes, including clinical isolates of serovars D, E, F, G, and J, and recombinant genomes that are hybrids of C. trachomatis, C. suis, and C. muridarum [2]. Several more are in progress. The general characteristics of these genomes are consistent with published genome sequences: ~1.04 megabase pairs, ~900 open reading frames, and similar G + C content. Individuals in our laboratory developed a sequence analysis program called Diffsort, which allows detailed comparisons of closely related genomes [3]. This program has been very useful in the analysis of our sequenced strains.
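
Two of the quantities mentioned above, G + C content and position-by-position differences between closely related genomes, are simple to compute once sequences are in hand. The Python sketch below is an illustration only and does not reproduce Diffsort, whose internals are not described here; the function names (gc_content, differences) are hypothetical, and the comparison assumes the two sequences are already aligned to the same length.

```python
def gc_content(sequence):
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


def differences(seq_a, seq_b):
    """List (position, base_a, base_b) wherever two aligned sequences differ.

    Assumption: seq_a and seq_b are pre-aligned and equal in length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    print(gc_content("ATGC"))
    print(differences("ACGT", "ACCT"))
```

Positions are reported as zero-based offsets into the alignment. Ambiguous bases such as N are excluded from the G + C calculation rather than counted against it.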

We continue to use rapid sequence analysis to generate information about chlamydial recombination and about the way that minor changes in pathogen genome sequence affect the disease the organism causes in patients.



1. Jeck, W. R., J. A. Reinhardt, D. A. Baltrus, et al. 2007. Extending assembly of short DNA sequences to handle error. Bioinformatics 23:2942-2944.
2. Suchland, R. J., K. M. Sandoz, B. M. Jeffrey, W. E. Stamm, and D. D. Rockey. 2009. Horizontal transfer of tetracycline resistance among Chlamydia spp. in vitro. Antimicrobial Agents and Chemotherapy 53:4604-4611. PMID: 19687238
3. Jeffrey, B. M., R. J. Suchland, K. L. Quinn, J. R. Davidson, W. E. Stamm, and D. D. Rockey. 2010. Genome sequencing of recent clinical Chlamydia trachomatis strains identifies loci associated with tissue tropism and regions of apparent recombination. Infection and Immunity 78:2544-2553. PMID: 20308297