Analysis by Philippe Lemey, 1 May 2009
I have been applying Bayesian phylogeographic reconstruction (Lemey, Rambaut, Drummond and Suchard, submitted) to the data set compiled by Andrew on 29 April. This data set has sequences representative of 7 locations: California, Kansas, Mexico, New York, Ohio, Texas and Auckland.
Several models were applied:
| Model | CTMC | BSSVS | Rate prior | Marginal likelihood (log) |
|---|---|---|---|---|
| 1 | reversible | - | equal | -18738.396 +/- 0.161 |
| 2 | reversible | - | distance-based | -18737.379 +/- 0.17 |
| 3 | reversible | + | equal | -18738.29 +/- 0.165 |
| 4 | reversible | + | distance-based | -18737.469 +/- 0.155 |
| 5 | nonreversible | - | equal | -18738.334 +/- 0.148 |
| 6 | nonreversible | - | distance-based | -18736.329 +/- 0.171 |
| 7 | nonreversible | + | equal | -18737.338 +/- 0.186 |
| 8 | nonreversible | + | distance-based | -18736.509 +/- 0.172 |
| 9 | nonreversible (rev rates, nonrev ind) | + | equal | -18737.526 +/- 0.16 |
| 10 | nonreversible (rev rates, nonrev ind) | + | distance-based | -18736.724 +/- 0.168 |
CTMC = continuous-time Markov chain; BSSVS = Bayesian stochastic search variable selection (+ = used, - = not used).
The marginal likelihoods do not indicate pronounced differences among these models for this limited data set, although the distance-based prior gives a slightly higher marginal likelihood than the equal prior in every pairing. Ranked by marginal likelihood, model 6 (nonreversible, distance-based prior, no BSSVS) comes out on top. The root state probabilities for this model are shown below; California and Mexico share most of the posterior mass. Despite being the apparent immediate source of the human epidemic, Mexico has a marginally, though not significantly, lower root state probability than California (0.275 vs. 0.334). This reflects the uncertainty in the data and the current sampling bias, with US strains tending to have been sampled earlier.
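The ranking above can be reproduced with a few lines of Python. This is only a minimal sketch: it assumes the tabulated values are natural-log marginal likelihoods, treats them as point estimates (the +/- errors are ignored), and the names `log_ml` and `log_bayes_factor` are illustrative rather than part of any analysis script used here.

```python
# Log marginal likelihood point estimates copied from the table (model -> value).
# Assumption: values are natural logs; the reported +/- errors are ignored.
log_ml = {
    1: -18738.396, 2: -18737.379, 3: -18738.290, 4: -18737.469,
    5: -18738.334, 6: -18736.329, 7: -18737.338, 8: -18736.509,
    9: -18737.526, 10: -18736.724,
}


def log_bayes_factor(model_a, model_b):
    """Log Bayes factor of model_a over model_b (difference of log marginal likelihoods)."""
    return log_ml[model_a] - log_ml[model_b]


# Models ordered from highest to lowest log marginal likelihood.
ranking = sorted(log_ml, key=log_ml.get, reverse=True)
best = ranking[0]

for model in ranking:
    print(f"model {model:2d}: log BF of model {best} over it = "
          f"{log_bayes_factor(best, model):.3f}")
```

The largest difference, between the top-ranked and bottom-ranked models, is only about 2 log units, which is consistent with the statement that the models are not clearly separated on this data set.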
Thanks to Marc Suchard for model development!
A Brief Overview of Bayesian Phylogeographic Inference
Bayesian phylogeographic inference employs a discrete diffusion model that can be fitted simultaneously with well-established models of sequence evolution in a Bayesian genealogical approach (Lemey et al., submitted). By integrating a continuous-time Markov chain (CTMC) model for discretized diffusion into a statistical framework centered on time-scaled phylogenies, we infer spatial dynamics on a real timescale. To achieve statistical efficiency, we propose a Bayesian stochastic search variable selection (BSSVS) procedure that allows each rate to be zero with some prior probability. With this procedure, the inference arrives at a minimal set of spatial diffusion rates that adequately explains the phylogeographic process.
The standard assumption of reversibility in the CTMC above may not always reflect realistic epidemiological diffusion processes. To address this shortcoming, the CTMC model has been extended to allow the rate of transition between two locations to differ depending on the direction of the diffusion pathway. Given that an n x n rate matrix characterizes CTMC-based diffusion among n locations, the nonreversible matrix contains n(n-1) off-diagonal, non-negative rate parameters λij for i,j = 1,…,n with i ≠ j. A BSSVS procedure can also be employed in this framework. Specific procedures have been implemented to achieve this, and a manuscript describing the technical details is in preparation. The Bayesian phylogeographic inference methods, which simultaneously estimate spatial processes and gene genealogies using MCMC sampling, are implemented in BEAST.
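To make the parameter counts concrete, here is a small illustrative sketch of how a nonreversible rate matrix with BSSVS-style indicators might be assembled. It is not the BEAST implementation: the function names, the dictionary layout, and the numeric rates are all hypothetical, and the reversible count of n(n-1)/2 assumes one shared rate per unordered pair of locations.

```python
from itertools import permutations


def n_rate_parameters(n, reversible):
    """Number of off-diagonal rate parameters for n locations."""
    # Assumption: a reversible model shares one rate per unordered pair.
    return n * (n - 1) // 2 if reversible else n * (n - 1)


def build_rate_matrix(n, rates, indicators):
    """Build an n x n CTMC rate matrix from directed rates and 0/1 indicators.

    rates and indicators are dicts keyed by ordered pairs (i, j) with i != j.
    An indicator of 0 switches the corresponding rate off, which is the
    effect BSSVS has when it sets a rate to zero.
    """
    q = [[0.0] * n for _ in range(n)]
    for i, j in permutations(range(n), 2):
        q[i][j] = rates[(i, j)] * indicators[(i, j)]
    for i in range(n):
        # The diagonal is still 0.0 here, so this sums the off-diagonal
        # entries; each row of a CTMC generator must sum to zero.
        q[i][i] = -sum(q[i])
    return q


# Hypothetical 3-location example with arbitrary rates.
pairs = list(permutations(range(3), 2))
rates = {pair: 1.0 + k for k, pair in enumerate(pairs)}
indicators = {pair: 1 for pair in pairs}
indicators[(2, 0)] = 0  # switch off diffusion from location 2 to location 0

q = build_rate_matrix(3, rates, indicators)
for row in q:
    print(row)

# For the 7 locations in this data set:
print(n_rate_parameters(7, reversible=True), n_rate_parameters(7, reversible=False))
```

With 7 locations the nonreversible model has 42 rates against 21 for the reversible one, which illustrates why a procedure that can switch rates off is attractive for a data set of this size.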
An example of a Google Earth visualization of the diffusion process inferred from H5N1 hemagglutinin sequences is shown below.

Comments
Eddie Holmes
May 1, 2009
Hi Philippe,
Good stuff. It is important to be very careful with this analysis because few Mexican sequences are currently in the public domain.
Cheers,
Eddie
Oliver Pybus
May 1, 2009
Hi Philippe - as the method is unpublished, many won't be aware of it. Can you give a short explanation of how it works?