Coming to the Eco-Stats Symposium? Would you like to know more about the speakers and their research before coming? We are compiling a reading list of suggested papers - one per speaker - and are holding a discussion group on Fridays 2-3pm to work through the list at UNSW (AGSM Courtyard). If you can't be there in person please join the blog Fridays at 2 (Sydney time) - we will keep an eye on it!
Monday, 24 June 2013
June 28: Anne Chao - a new class of measures of phylogenetic diversity
Just came across this great resource on diversity measures from one of the authors (Lou Jost) of today's paper http://www.loujost.com/Statistics%20and%20Physics/Diversity%20and%20Similarity/DiversitySimilarityHome.htm
One take-home message was that Hill numbers, which sound scary, are actually quite simple to understand - a single general formula unites species richness, Shannon diversity, and Simpson's diversity. "Hills not mountains" says Flo!
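To make the "one formula" point concrete, here is a minimal sketch of Hill numbers in Python. The function name `hill_number` and the example counts are ours, not from the paper. Order q=0 gives species richness, q=1 (defined as a limit) gives the exponential of Shannon entropy, and q=2 gives the inverse Simpson concentration.

```python
import math

def hill_number(counts, q):
    """Hill number (effective number of species) of order q.

    counts: abundances per species (zeros are ignored).
    """
    total = sum(counts)
    p = [c / total for c in counts if c > 0]
    if abs(q - 1.0) < 1e-12:
        # q = 1 is the limit of the general formula: exp(Shannon entropy)
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    # General formula: (sum_i p_i^q)^(1 / (1 - q))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

if __name__ == "__main__":
    counts = [3, 3, 2]  # illustrative counts only
    for q in (0, 1, 2):
        print(q, hill_number(counts, q))
```

For a perfectly even community all orders agree and equal the number of species, which is why Hill numbers can be read as an "effective number of species".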
We noticed last week that the diversity literature is a bit of a "dog's breakfast": lots of different methods, and not a lot known about how they relate to each other. A real strength of this paper was that it took some key diversity measures and linked them conceptually - species richness, Shannon and Simpson's diversity, and now Faith's Phylogenetic Diversity (PD), all in the same framework. Making this connection to PD led to a new proposal - just as PD can be thought of as a tree-based version of species richness, you can construct tree-based versions of Shannon and Simpson diversity.
We discussed the motivation for measuring phylogenetic diversity (a site with a rat and a wombat is more diverse than a site with two species of rat), the replication principle (necessary to ensure that a diversity measure scales correctly if you add clades or compare different sampling scales), and how the idea works (Figure 1a - it might help to stick counts in, e.g. p1=3 individuals, p2=3 individuals, p3=2 individuals, so p2+p3=5, ...)
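Here is a rough sketch of the counts-on-a-tree idea, using the counts suggested above. It assumes the mean phylogenetic diversity of order q takes the form (sum over branches of (L_i/T) * a_i^q)^(1/(1-q)), where L_i is a branch length, a_i is the total relative abundance of everything descended from that branch, and T is the abundance-weighted mean tree height. That is our reading of the framework, so check it against the paper. The tree shape and branch lengths below are hypothetical and not taken from Figure 1a.

```python
import math

def mean_tree_height(branches):
    """Abundance-weighted mean height: sum of L_i * a_i over branches."""
    return sum(length * a for length, a in branches)

def mean_phylo_diversity(branches, q):
    """Mean phylogenetic diversity of order q (assumed form, see lead-in).

    branches: list of (length, a) pairs, where a is the total relative
    abundance of all species descended from that branch.
    """
    T = mean_tree_height(branches)
    used = [(length, a) for length, a in branches if a > 0]
    if abs(q - 1.0) < 1e-12:
        # q = 1 limit: tree-based analogue of exp(Shannon entropy)
        return math.exp(-sum((length / T) * a * math.log(a) for length, a in used))
    return sum((length / T) * a ** q for length, a in used) ** (1.0 / (1.0 - q))

# Hypothetical ultrametric tree of height 2 with counts 3, 3, 2 (n = 8):
# species 1 sits alone on a branch of length 2; species 2 and 3 share an
# internal branch of length 1 (abundance 3 + 2 = 5) and then split.
branches = [
    (2.0, 3 / 8),  # species 1
    (1.0, 5 / 8),  # internal branch above species 2 and 3
    (1.0, 3 / 8),  # species 2
    (1.0, 2 / 8),  # species 3
]

if __name__ == "__main__":
    T = mean_tree_height(branches)
    for q in (0, 1, 2):
        d = mean_phylo_diversity(branches, q)
        print(q, d, T * d)  # mean diversity, and T times it (PD-like total)
```

At q=0 the abundances drop out and T times the mean diversity is just the total branch length, i.e. Faith's PD, which is the sense in which PD is a tree-based version of species richness.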
One key issue Haba identified was the same one we came across last week - how to account for sampling error. Abundances (and hence the relative frequencies p_i) will change under repeated sampling, so how confident can we be in a given estimate of diversity? E.g. the first entry in Table 2 gives a diversity of 5.402, but +/- what? This (by "this" I essentially mean statistics) is an important issue not discussed in either of the papers we have looked at, although as I understand it there have been some efforts in the literature...
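One simple way to get a feel for the "+/- what?" question is a naive bootstrap: resample individuals with replacement from the observed counts and recompute the diversity each time. This is only a sketch under our own assumptions, not a method from either paper. The function names are hypothetical, and a naive bootstrap is known to behave poorly for low orders such as richness, because unseen species can never appear in a resample.

```python
import math
import random

def hill_number(counts, q):
    """Hill number of order q from a list of abundances."""
    total = sum(counts)
    p = [c / total for c in counts if c > 0]
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

def bootstrap_interval(counts, q, n_boot=1000, seed=0):
    """Percentile 95% interval from resampling individuals with replacement."""
    rng = random.Random(seed)
    n = sum(counts)
    species = list(range(len(counts)))
    estimates = []
    for _ in range(n_boot):
        # Draw n individuals, each species chosen with probability p_i
        draw = rng.choices(species, weights=counts, k=n)
        resampled = [draw.count(s) for s in species]
        estimates.append(hill_number(resampled, q))
    estimates.sort()
    lo = estimates[int(0.025 * n_boot)]
    hi = estimates[int(0.975 * n_boot) - 1]
    return lo, hi

if __name__ == "__main__":
    counts = [3, 3, 2]  # illustrative counts only
    print(hill_number(counts, 2), bootstrap_interval(counts, 2))
```

With only eight individuals the interval is wide, which is rather the point: a single diversity value says little without some measure of its uncertainty.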
A final point was what to do when comparing diversity between two sites whose root nodes are at different distances from the observed species (e.g. one site all angiosperms, one site a mixture of angiosperms and gymnosperms). Presumably the trees need to span the same length of time for the diversity measures to be comparable, so I'd say you should extend the root branch of the shorter tree (for the all-angiosperm site) back in time until the two trees match up (i.e. until you reach the node where angiosperms and gymnosperms join). You then use this extended tree to calculate the diversity index at the all-angiosperm site. This gives a different answer than if the root branch were ignored - a slice along the root branch has diversity exactly one for all Hill numbers, whereas slices elsewhere in the tree give a larger number.
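The effect of extending the root branch can be checked numerically. The sketch below assumes the mean phylogenetic diversity of order q has the form (sum over branches of (L_i/T) * a_i^q)^(1/(1-q)), with a_i the relative abundance descended from branch i and T the abundance-weighted mean tree height. The tree, its branch lengths, and the root extension of 2 time units are all made up for illustration.

```python
import math

def mean_phylo_diversity(branches, q):
    """Mean phylogenetic diversity of order q (assumed form, see lead-in).

    branches: list of (length, a) pairs, a = relative abundance of all
    species descended from the branch.
    """
    T = sum(length * a for length, a in branches)
    used = [(length, a) for length, a in branches if a > 0]
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum((length / T) * a * math.log(a) for length, a in used))
    return sum((length / T) * a ** q for length, a in used) ** (1.0 / (1.0 - q))

# Hypothetical all-angiosperm site: tree of height 2, counts 3, 3, 2.
angio_only = [(2.0, 3 / 8), (1.0, 5 / 8), (1.0, 3 / 8), (1.0, 2 / 8)]

# Same site with the root branch extended back by 2 time units. Every
# individual descends from the root branch, so its abundance is 1.
extended = [(2.0, 1.0)] + angio_only

if __name__ == "__main__":
    for q in (0, 1, 2):
        print(q,
              mean_phylo_diversity(angio_only, q),
              mean_phylo_diversity(extended, q))
```

A tree consisting of the root branch alone has diversity one at every order, so including it pulls the mean diversity down towards one. In this example the q=0 value drops from 2.5 to 1.75, even though the total branch length goes up from 5 to 7.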
Re sampling error, herewith a very new approach based on..um...Hill numbers...
Chao et al. (in press) Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species diversity studies http://www.esajournals.org/doi/abs/10.1890/13-0133.1