The authors have declared that no competing interests exist.
This is a “Topic Page” article for
In coining the term
Rapid expansion of a virus in a population will be reflected by a “star-like” tree, in which external branches are long relative to internal branches. Star-like trees arise because viruses are more likely to share a recent common ancestor when the population is small, and a growing population has an increasingly smaller population size towards the past. Compared to a phylogeny of an expanding virus, a phylogeny of a viral population that stays constant in size will have external branches that are shorter relative to branches on the interior of the tree. The phylogeny of
Viruses within similar hosts, such as hosts that reside in the same geographic region, are expected to be more closely related genetically if transmission occurs more commonly between them. The phylogenies of
Red and blue circles represent spatial locations from which viral samples were isolated.
The effect of directional selection on the shape of a viral phylogeny is exemplified by contrasting the trees of
Although these three phylogenetic features are useful rules of thumb to identify epidemiological, immunological, and evolutionary processes that might be impacting viral genetic variation, there is growing recognition that the mapping between process and phylogenetic pattern can be many-to-one. For instance, although ladder-like trees such as the one shown in
Phylodynamic models may aid in dating epidemic and pandemic origins. The rapid rate of evolution in viruses allows
Phylodynamic models may provide insight into epidemiological parameters that are difficult to assess through traditional surveillance means. For example, assessment of
Phylodynamic approaches can also be useful in ascertaining the effectiveness of viral control efforts, particularly for diseases with low reporting rates. For example, the genetic diversity of the DNA-based
Viral control efforts can also impact the rate at which virus populations evolve, thereby influencing phylogenetic patterns. Phylodynamic approaches that quantify how evolutionary rates change over time can therefore provide insight into the effectiveness of control strategies. For example, an application to HIV sequences within infected hosts showed that viral substitution rates dropped to effectively zero following the initiation of antiretroviral drug therapy
Most often, the goal of phylodynamic analyses is to make inferences of epidemiological processes from viral phylogenies. Thus, most phylodynamic analyses begin with the reconstruction of a phylogenetic tree. Genetic sequences are often sampled at multiple time points, which allows the estimation of
Traditional evolutionary approaches directly utilize methods from
the magnitude of selection can be measured by comparing the rate of nonsynonymous substitution to the rate of synonymous substitution
the population structure of the host population may be examined by calculation of
hypotheses concerning panmixis and selective neutrality of the virus may be tested with statistics such as
However, such analyses were not designed with epidemiological inference in mind and it may be difficult to extrapolate from standard statistics to desired epidemiological quantities.
In an effort to bridge the gap between traditional evolutionary approaches and epidemiological models, several analytical methods have been developed to specifically address problems related to phylodynamics. These methods are based on
The coalescent is a mathematical model that describes the ancestry of a sample of
This time interval is labeled
The expected waiting time to find the MRCA of the sample is the sum of the expected values of the internode intervals,
Two corollaries are:
The time to the MRCA (TMRCA) of a sample remains bounded as the sample size grows,
Few samples are required for the expected TMRCA of the sample to be close to the theoretical upper bound, as the difference is
Consequently, the TMRCA estimated from a relatively small sample of viral genetic sequences is an asymptotically unbiased estimate for the time that the viral population was founded in the host population.
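The summation of expected internode intervals can be illustrated with a minimal sketch. It assumes the standard Kingman scaling, in which each pair of lineages coalesces at rate 1/N per unit time, so that the expected interval while k lineages remain is 2N/(k(k-1)). The function names and this particular time scaling are illustrative assumptions, not notation taken from the text.

```python
def expected_interval(k, pop_size):
    """Expected time during which k lineages persist (assumed pairwise rate 1/N)."""
    return 2.0 * pop_size / (k * (k - 1))


def expected_tmrca(sample_size, pop_size):
    """Sum the expected internode intervals from n lineages down to 2."""
    return sum(expected_interval(k, pop_size)
               for k in range(2, sample_size + 1))


if __name__ == "__main__":
    # The expected TMRCA approaches 2N quickly as the sample size grows.
    for n in (2, 5, 10, 100):
        print(n, expected_tmrca(n, pop_size=1000))
```

Under this scaling the sum telescopes to 2N(1 - 1/n), so the gap to the upper bound 2N shrinks in proportion to 1/n, which is why a modest sample already comes close to the bound.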
For example, Robbins et al.
If the population size N(
Because all topologies are equally likely under the neutral coalescent, this model will have the same properties as the constant-size coalescent under a rescaling of the time variable:
Very early in an epidemic, the virus population may be growing exponentially at rate
This rate is small close to when the sample was collected (
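A minimal simulation sketch can make the effect of exponential growth on branch lengths concrete. It assumes a population size of the form N(t) = n0 * exp(-r * t), with t measured backwards from the sampling time, and a pairwise coalescence rate of 1/N(t). The names `n0`, `r`, and both functions are hypothetical choices for illustration.

```python
import math
import random


def waiting_time(k, t, n0, r, unit_exp):
    """Time to the next coalescence among k lineages, starting at time t.

    Solves (pairs / n0) * integral_t^{t+s} exp(r * u) du = unit_exp for s,
    where unit_exp is a draw from a rate-1 exponential distribution.
    """
    pairs = k * (k - 1) / 2.0
    if r == 0:
        return n0 * unit_exp / pairs  # constant-size coalescent
    return math.log1p(r * n0 * unit_exp * math.exp(-r * t) / pairs) / r


def simulate_coalescent_times(n, n0, r, seed=0):
    """Return the n - 1 coalescence times (before sampling) for n lineages."""
    rng = random.Random(seed)
    t, times = 0.0, []
    for k in range(n, 1, -1):
        t += waiting_time(k, t, n0, r, rng.expovariate(1.0))
        times.append(t)
    return times


if __name__ == "__main__":
    print(simulate_coalescent_times(10, n0=1000.0, r=0.0))
    print(simulate_coalescent_times(10, n0=1000.0, r=0.05))
```

With the same random draws, a positive growth rate shortens every waiting time relative to the constant-size case, and it shortens the deep intervals most, which compresses coalescent events toward the root.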
If the rate of exponential growth is estimated from a gene genealogy, it may be combined with knowledge of the duration of infection or the
For example, Fraser et al.
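As an illustration of how a growth rate might be converted into a reproduction number, the sketch below assumes a simple SIR-type epidemic with an exponentially distributed infectious period, for which the early growth rate r and the mean duration of infection D satisfy r = (R0 - 1) / D. This relationship and the function name are assumptions made for the example; other generation-time distributions give different conversions.

```python
def r0_from_growth_rate(growth_rate, infectious_duration):
    """Basic reproduction number under the assumed relation r = (R0 - 1) / D.

    growth_rate and infectious_duration must use the same time unit,
    for example per day and days.
    """
    return 1.0 + growth_rate * infectious_duration


if __name__ == "__main__":
    # Hypothetical inputs: growth of 0.1 per day, infections lasting 4 days.
    print(r0_from_growth_rate(0.1, 4.0))
```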
Infectious disease epidemics are often characterized by highly nonlinear and rapid changes in the number of infected individuals and the effective population size of the virus. In such cases, birth rates are highly variable, which can diminish the correspondence between effective population size and the prevalence of infection
The ratio
For the simple SIR model, this yields
This expression is similar to the
Early in an epidemic,
This has the same mathematical form as the rate in the Kingman coalescent, substituting
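The correspondence with the Kingman coalescent can be sketched numerically. The code below assumes a coalescent rate of the form C(n, 2) * 2 * f / I^2, where f is the incidence (new infections per unit time) and I is the number of infected hosts, and it assumes SIR incidence f = beta * s * I with s the susceptible fraction. These forms, and all names used, are assumptions for illustration rather than a transcription of the formulas in the text.

```python
from math import comb


def coalescent_rate(n, incidence, infected):
    """Assumed rate at which n sampled lineages coalesce: C(n,2) * 2f / I^2."""
    return comb(n, 2) * 2.0 * incidence / infected ** 2


def sir_coalescent_rate(n, beta, susceptible_fraction, infected):
    """Rate under assumed SIR incidence f = beta * s * I."""
    incidence = beta * susceptible_fraction * infected
    return coalescent_rate(n, incidence, infected)


def kingman_equivalent_size(beta, infected):
    """Population size giving the same rate early on, when s is close to 1."""
    return infected / (2.0 * beta)


if __name__ == "__main__":
    n, beta, infected = 5, 0.5, 200.0
    print(sir_coalescent_rate(n, beta, 1.0, infected))
    print(comb(n, 2) / kingman_equivalent_size(beta, infected))
```

Both printed values agree when the susceptible fraction is 1, which is the sense in which the early-epidemic rate matches the Kingman form under a substituted population size.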
When a disease is no longer exponentially growing but has become endemic, the rate of lineage coalescence can also be derived for the epidemiological model governing the disease's transmission dynamics. This can be done by extending the
For example, for the SIR model above, modified to include births into the population and deaths out of the population, the population size N is given by the equilibrium number of infected individuals,
This rate, derived for the SIR model at equilibrium, is equivalent to the rate of coalescence given by the more general formula provided by Volz et al.
At the most basic level, the presence of geographic population structure can be revealed by comparing the genetic relatedness of viral isolates to geographic relatedness. A basic question is whether geographic character labels are more clustered on a phylogeny than expected under a simple nonstructured model (see
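One way such a clustering question could be posed is a permutation test on the parsimony score of the location labels. The sketch below is a hypothetical illustration, not necessarily the test the text has in mind: it counts the minimum number of location changes on a binary tree with Fitch parsimony, then compares that count with counts obtained after shuffling labels among tips. The nested-tuple tree encoding and all function names are assumptions.

```python
import random


def fitch_changes(tree, labels):
    """Return (possible_states, min_changes) for a nested-tuple binary tree."""
    if not isinstance(tree, tuple):  # a leaf is a tip name
        return {labels[tree]}, 0
    left_states, left_changes = fitch_changes(tree[0], labels)
    right_states, right_changes = fitch_changes(tree[1], labels)
    shared = left_states & right_states
    if shared:
        return shared, left_changes + right_changes
    return left_states | right_states, left_changes + right_changes + 1


def tip_names(tree):
    """List tip names in left-to-right order."""
    if not isinstance(tree, tuple):
        return [tree]
    return tip_names(tree[0]) + tip_names(tree[1])


def clustering_p_value(tree, labels, n_perm=1000, seed=0):
    """Observed change count and the share of shuffles at least as clustered."""
    rng = random.Random(seed)
    tips = tip_names(tree)
    observed = fitch_changes(tree, labels)[1]
    states = [labels[tip] for tip in tips]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(states)
        if fitch_changes(tree, dict(zip(tips, states)))[1] <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
    labels = {t: "red" for t in "abcd"}
    labels.update({t: "blue" for t in "efgh"})
    print(clustering_p_value(tree, labels))
```

A small p-value indicates fewer location changes than expected if labels were unrelated to the tree, which is consistent with geographic structure. The test treats the tree as known and ignores phylogenetic uncertainty.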
Beyond the presence or absence of population structure, phylodynamic methods can be used to infer the rates of movement of viral lineages between geographic locations and reconstruct the geographic locations of ancestral lineages. Here, geographic location is treated as a phylogenetic character state, similar in spirit to “A,” “T,” “G,” and “C,” so that geographic location is encoded as a
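Treating location as a discrete character can be illustrated with a two-location continuous-time Markov chain, by analogy with nucleotide models. The sketch below is a simplified assumption: it uses only two locations, hypothetical movement rates `rate_ab` and `rate_ba`, and the closed-form transition probabilities for a two-state chain. Real analyses typically involve more locations and estimate these rates jointly with the tree.

```python
import math


def location_transition_probs(rate_ab, rate_ba, t):
    """Transition matrix over time t for a lineage moving between A and B.

    Entry [i][j] is the probability that a lineage in location i is found
    in location j after time t (index 0 = A, index 1 = B).
    """
    total = rate_ab + rate_ba
    if total == 0:
        return [[1.0, 0.0], [0.0, 1.0]]  # no movement at all
    decay = math.exp(-total * t)
    stationary_a = rate_ba / total
    stationary_b = rate_ab / total
    return [
        [stationary_a + stationary_b * decay, stationary_b * (1.0 - decay)],
        [stationary_a * (1.0 - decay), stationary_b + stationary_a * decay],
    ]


if __name__ == "__main__":
    # Hypothetical rates per year; branch length of half a year.
    for row in location_transition_probs(0.2, 0.05, 0.5):
        print(row)
```

Probabilities of this kind are what a likelihood calculation on the tree would multiply along each branch when reconstructing ancestral locations.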
As discussed above, it is possible to directly infer parameters of simple
Simulation-based models require specification of a transmission model for the infection process between infected hosts and susceptible hosts and for the recovery process of infected hosts. Simulation-based models may be
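As a minimal sketch of what specifying transmission and recovery processes can look like, the code below runs a stochastic SIR simulation with the Gillespie algorithm. It assumes frequency-dependent transmission at rate beta * S * I / N and recovery at rate gamma * I. It tracks host counts only and omits viral sequences; all parameter names are illustrative assumptions.

```python
import random


def simulate_sir(beta, gamma, susceptible0, infected0, t_max=100.0, seed=1):
    """Return a list of (time, S, I, R) states from a Gillespie SIR run."""
    rng = random.Random(seed)
    s, i, r, t = susceptible0, infected0, 0, 0.0
    n = susceptible0 + infected0
    history = [(t, s, i, r)]
    while i > 0:
        infection_rate = beta * s * i / n   # new infections per unit time
        recovery_rate = gamma * i           # recoveries per unit time
        total_rate = infection_rate + recovery_rate
        t += rng.expovariate(total_rate)    # waiting time to the next event
        if t > t_max:
            break
        if rng.random() < infection_rate / total_rate:
            s, i = s - 1, i + 1             # transmission event
        else:
            i, r = i - 1, r + 1             # recovery event
        history.append((t, s, i, r))
    return history


if __name__ == "__main__":
    run = simulate_sir(beta=0.5, gamma=0.2, susceptible0=990, infected0=10)
    print(len(run), run[-1])
```

Recording which host infected which at each transmission event would turn a run like this into a transmission tree, the starting point for linking the epidemic model to a genealogy.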
To connect the epidemiological model to viral genealogies requires that multiple viral strains, with different nucleotide or amino acid sequences, exist in the simulation, often denoted
For
In general, because these methods require running simulations rather than computing likelihoods, it may be difficult to make fine-scale inferences on epidemiological parameters. Instead, this work usually focuses on broader questions, such as testing whether overall genealogical patterns are consistent with one epidemiological model or another. Additionally, simulation-based methods are often used to validate inference results, providing test data where the correct answer is known ahead of time. Because computing likelihoods for genealogical data under complex simulation models has proven difficult, an alternative statistical approach called
Human influenza is an acute respiratory infection primarily caused by viruses
Phylodynamic techniques have provided insight into the relative selective effects of mutations at different sites and in different genes across the influenza virus genome. The exposed location of hemagglutinin (HA) suggests that there should be strong selective pressure for evolution at the specific sites on HA that are recognized by antibodies in the human immune system. These sites are referred to as
Further analysis of HA has shown it to have a very small
Influenza A/H1N1 shows a larger effective population size and greater genetic diversity than influenza H3N2
The extremely rapid turnover of the influenza population means that the rate of geographic spread of influenza lineages must also, to some extent, be rapid. Surveillance data show a clear pattern of strong seasonal epidemics in temperate regions and less periodic epidemics in the tropics
All of these phylogeographic studies necessarily suffer from limitations in the worldwide sampling of influenza viruses. For example, the relative importance of tropical Africa and India has yet to be uncovered. Additionally, the phylogeographic methods used in these studies (see section on phylogeographic methods) infer ancestral locations and migration rates from only the samples at hand, rather than from the population in which these samples are embedded. Because of this, study-specific sampling procedures are a concern when extrapolating to population-level inferences. However, through joint epidemiological and evolutionary simulations, Bedford et al.
Forward simulation-based approaches for addressing how immune selection can shape the phylogeny of influenza A/H3N2's hemagglutinin protein have been actively developed by disease modelers since the early 2000s. These approaches include both
Later work by Ferguson and colleagues
Work by Koelle and colleagues
Instead of modeling the genotypes of viral strains, a compartmental simulation model by Gökaydin and colleagues
In recent work, Bedford and colleagues
Although most research on the phylodynamics of influenza has focused on seasonal influenza A/H3N2 in humans, influenza viruses exhibit a wide variety of phylogenetic patterns. Qualitatively similar to the phylogeny of influenza A/H3N2's hemagglutinin protein (see
Genetic and antigenic variation of influenza is also present across a diverse set of host species. The impact of host population structure can be seen in the evolution of
The global diversity of HIV-1 group M is shaped by its
The rate of exponential growth of HIV in Central Africa in the early 20th century preceding the establishment of modern subtypes has been estimated using coalescent approaches. Several estimates based on parametric exponential growth models are shown in
Growth Rate | Group | Subtype | Risk Group |
0.17 | M | NA | Central Africa |
0.27 | M | C | Central Africa |
0.48 | M | B | North America/Europe/Australia, MSM |
0.068 | O | NA | Cameroon |
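To give the tabulated growth rates an intuitive scale, an exponential growth rate r can be converted to a doubling time of ln(2) / r. The sketch below assumes the rates are per year, which the table header does not state, so the resulting times should be read as illustrative only.

```python
import math


def doubling_time(growth_rate):
    """Time for an exponentially growing quantity to double, ln(2) / r."""
    return math.log(2.0) / growth_rate


if __name__ == "__main__":
    # Growth rates copied from the table; the per-year unit is an assumption.
    for label, rate in [("M, Central Africa", 0.17),
                        ("M, subtype C", 0.27),
                        ("M, subtype B", 0.48),
                        ("O, Cameroon", 0.068)]:
        print(label, round(doubling_time(rate), 2))
```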
The early growth of subtype B in North America was quite high; however, the duration of exponential growth was relatively short, with saturation occurring in the mid- and late-1980s
HIV-1 sequences sampled over a span of five decades have been used with relaxed molecular clock phylogenetic methods to estimate the time of cross-species viral spillover into humans around the early 20th century
At shorter time scales and finer geographical scales, HIV phylogenies may reflect epidemiological dynamics related to risk behavior and
By analyzing phylogenies estimated from HIV sequences from
Purifying immune selection dominates evolution of HIV within hosts, but evolution between hosts is largely decoupled from within-host evolution (see
Sequences were downloaded from the
There is some evidence from comparative phylogenetic analysis and epidemic simulations that HIV adapts at the level of the population to maximize transmission potential between hosts
Up to this point, phylodynamic approaches have focused almost entirely on RNA viruses, which often have mutation rates on the order of 10^-3 to 10^-4 substitutions per site per year
Additionally, improvements in sequencing technologies will allow detailed investigation of within-host evolution, as the full diversity of an infecting
Version history of the text file.
Peer reviews and response to reviews. Human-readable versions of the reviews and authors' responses are available as comments on this article.
This review benefited from discussions arising from a 2011 RAPIDD workshop on viral phylodynamics. The version history of the text file and the peer reviews (and response to reviews) are available in