The Dengue, Zika & Chikungunya Typing Tool: Introduction.
Recently, an increasing number of outbreaks of Dengue, Chikungunya and Zika have been reported in tropical and sub-tropical areas of the globe. The Dengue (DENV), Zika (ZIKV) and Chikungunya (CHIKV) viruses are transmitted by the same vectors (Ae. aegypti or Ae. albopictus), which are widely adapted and distributed on a global scale. Outbreaks can happen in the same geographical area at the same time, making differential diagnosis a difficult task for front-line health personnel and health authorities.
With the advent of next generation sequencing methods, thousands of complete DENV genomes, as well as the first few complete sequences of CHIKV and ZIKV, are now available. In this context, we have used the available data to develop an automated computational algorithm for accurate and rapid identification of these viral pathogens. To make classification consistent, we also present a detailed description of the amplification and genotyping methods that are most appropriate for accurately classifying those pathogens. The computational and laboratory methods are made freely available in an open source platform and should allow high-throughput and timely identification of the pathogens to strain level. These advances should be of major importance for the clinical management of patients and for the strategic decisions of health authorities.
Genomic regions for Zika, Dengue and Chikungunya species and genotype classification
Dengue Serotypes and Genotypes
The definition of a dengue genotype is based on early studies that used partial sequencing and the sequenced strains available in the public domain. Since then, the virus has diverged into more complex lineages. We expect that the virus will continue to evolve and that new lineages might, in time, turn into new serotypes and genotypes. We also consider a wide range of variables when defining a genotype, e.g. monophyly, pairwise distance within and between groups, net genetic diversity within groups, etc.
There is no firm consensus in the dengue field on how to name genotypes. The global spread of the virus has made the previous naming of genotypes by geographic association irrelevant. DENV-1, -3 and -4 genotypes are named with Roman numerals (I, II, III, etc.), but publications with conflicting classifications still exist.
DENV-2 genotypes are named by geographic association. Since assigning certain genotypes to a geographic region can be difficult, we find that Roman numerals (I, II, III, etc.) could be more appropriate. A numeral system might also make it easier to accommodate newly diverging lineages and genotypes. However, our tool still retains the geographic names for DENV-2 (e.g. the SE Asia and American genotypes of DENV-2), as researchers who deal with this public health issue are accustomed to these designations.
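The distance-based criteria mentioned above can be illustrated with a small sketch. It is not the tool's implementation: the function names, the use of the simple p-distance, and the toy sequences are all assumptions made for illustration. It computes mean pairwise distance within a group, mean distance between two groups, and the net between-group distance (between-group mean minus the average of the two within-group means), a standard quantity related to the diversity measures listed above.

```python
from itertools import combinations, product


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap or an N are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_within(group):
    """Mean pairwise p-distance among the sequences of one group."""
    pairs = list(combinations(group, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(group_a, group_b):
    """Mean p-distance over all pairs drawn from two different groups."""
    pairs = list(product(group_a, group_b))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def net_between(group_a, group_b):
    """Between-group mean corrected for the diversity within each group."""
    within = (mean_within(group_a) + mean_within(group_b)) / 2
    return mean_between(group_a, group_b) - within


# Toy, made-up aligned sequences (hypothetical, not real DENV data).
clade_a = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGAAC"]
clade_b = ["ACCTACGTGC", "ACCTACGTGT", "ACCTTCGTGC"]

print("within A:", round(mean_within(clade_a), 4))
print("within B:", round(mean_within(clade_b), 4))
print("between :", round(mean_between(clade_a, clade_b), 4))
print("net     :", round(net_between(clade_a, clade_b), 4))
```

A candidate genotype would typically show a between-group distance clearly larger than the within-group distances; any actual threshold is a matter of judgment and is not specified here.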
We have carefully selected reference sequences that represent the diversity of each genotype. In addition, we performed extensive testing to be sure that our reference strains accurately classify other sequences.
- We include serotypes 1, 2, 3 and 4.
- Serotype 1 includes 1I, 1II, 1III, 1IV and 1V genotypes.
- Serotype 2 includes 2I (American), 2II (Cosmopolitan), 2III (Southern Asian-American), 2IV (Asian II), 2V (Asian I) and 2VI (Sylvatic) genotypes.
- Serotype 3 includes 3I, 3II, 3III and 3V genotypes.
- Serotype 4 includes 4I, 4II, 4III and 4IV genotypes.
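The naming scheme listed above can be captured in a small lookup table. The sketch below is only an illustration of that scheme: the dictionary and function names are hypothetical, and the tool's internal representation is not described in this text.

```python
# Genotypes per serotype, exactly as listed above.
DENV_GENOTYPES = {
    1: ["I", "II", "III", "IV", "V"],
    2: ["I", "II", "III", "IV", "V", "VI"],
    3: ["I", "II", "III", "V"],
    4: ["I", "II", "III", "IV"],
}

# Geographic names retained for DENV-2 only.
DENV2_GEOGRAPHIC_NAMES = {
    "I": "American",
    "II": "Cosmopolitan",
    "III": "Southern Asian-American",
    "IV": "Asian II",
    "V": "Asian I",
    "VI": "Sylvatic",
}


def genotype_label(serotype: int, genotype: str) -> str:
    """Build a label such as '1IV' or '2II (Cosmopolitan)'."""
    if genotype not in DENV_GENOTYPES.get(serotype, []):
        raise ValueError(f"unknown genotype {genotype!r} for serotype {serotype}")
    label = f"{serotype}{genotype}"
    if serotype == 2:
        # Keep the geographic designation that researchers are used to.
        label += f" ({DENV2_GEOGRAPHIC_NAMES[genotype]})"
    return label


print(genotype_label(1, "IV"))
print(genotype_label(2, "II"))
```

Keeping the numeral as the key and the geographic name as an optional annotation means new lineages can be added without inventing a geographic label for them.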
Partial vs. whole genome. Whole genome sequences (WGS) are not commonly produced for epidemiological purposes, nor are they widely available in the public domain. Most of the WGS came from the GRID consortium sponsored by the Broad Institute of Harvard and MIT a few years ago. As a result of this project, collaborating sites deposited hundreds of WGS in GenBank; however, redundancy still exists in these datasets, i.e. many samples from the same region and the same year. There is consensus that the envelope glycoprotein gene (E) is well suited to phylogenetic classification. This gene has been very effective because it contains sufficient phylogenetic signal (1,485 bp) to distinguish dengue from other viruses and to differentiate between serotypes and genotypes. Phylogenetic trees inferred from E gene sequence data have topologies very similar to whole genome trees. Therefore, investigators working with dengue sequences can expect similar classification accuracy for WGS and the E gene.
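One common way to quantify how similar two tree topologies are is the Robinson-Foulds distance, which counts the clades found in one tree but not in the other. The sketch below is an illustration of that idea for rooted trees only; it is not necessarily how the trees discussed here were compared, and the nested-tuple tree format, the function names and the toy trees are assumptions.

```python
def clades(tree):
    """Return (leaf set, set of clades) for a rooted tree.

    A tree is a leaf name (str) or a tuple of subtrees.
    Each clade is the frozenset of leaf names below an internal node.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def rooted_rf_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two rooted trees."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must have the same set of leaves")
    return len(clades_a ^ clades_b)


# Toy trees with hypothetical topologies, for illustration only.
wgs_tree = ((("DENV1", "DENV3"), "DENV2"), "DENV4")
e_gene_tree = ((("DENV1", "DENV3"), "DENV2"), "DENV4")
other_tree = ((("DENV1", "DENV2"), "DENV3"), "DENV4")

print(rooted_rf_distance(wgs_tree, e_gene_tree))
print(rooted_rf_distance(wgs_tree, other_tree))
```

A distance of zero means the two trees group the leaves identically; larger values indicate more conflicting clades.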
Fig. 2. Dengue, Zika & Chikungunya Viruses phylogenetic tree for (A) WGS and (B) E gene
Chikungunya Virus Genotypes
The definition of a Chikungunya genotype is based on the identification of well-defined phylogenetic clusters whose origin has been associated with a given geographic region.
We have carefully analysed all available CHIKV whole genomes and reviewed the literature in order to represent the diversity of each phylotype. In addition, we performed extensive testing to be sure that our reference strains accurately classify other sequences.
- We include 3 phylotypes.
- Phylotype 1 includes sequences from Asia and the Caribbean.
- Phylotype 2 includes East-Central-South-African sequences.
- Phylotype 3 includes sequences from West Africa.
Fig. 3. Chikungunya maximum likelihood phylogenetic tree of the (A) complete genome and (B) E1 gene
Zika Virus Genotypes
The definition of a Zika genotype is based on the identification of well-defined phylogenetic clusters whose origin has been associated with a given geographic region.
We have carefully analysed all available ZIKV whole genomes and reviewed the literature in order to represent the diversity of each phylotype. In addition, we performed extensive testing to be sure that our reference strains accurately classify other sequences. We found the whole genome and the envelope (E) gene to be suitable for classification.
- We include 3 phylotypes.
- Phylotype 1 includes sequences from different African countries.
- Phylotype 2 includes sequences of Asian origin.
- Phylotype 3 includes a recently published Senegalese whole genome, which is very divergent from the other African ones (Faye et al. PLoS Negl Trop Dis 2014, 8(1): e2636).
- The phylogenies are rooted with Spondweni virus.
Fig. 4. Zika virus maximum likelihood phylogenetic tree of (A) complete genome and (B) E gene