
Thanks to this post I read:

- T. Leinster, C.A. Cobbold, *Measuring Diversity: The Importance of Species Similarity*, Ecology (2012).

While I like the paper, it seems to me that it would be more *natural* to use the entropy of a density matrix, i.e.
$$
\frac{1}{1-q} \text{Tr} \left[ \rho^q \right]
$$
where
$\rho_{ij} = \sqrt{p_i p_j}\, c_{ij}$,
with $p_i$ the normalized population densities and $0 \leq c_{ij} = c_{ji} \leq 1$ (and $c_{ii}=1$) the similarities.
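For concreteness, here is a minimal numerical sketch of this construction (NumPy; the function names are mine, not from the paper):

```python
import numpy as np

def density_matrix(p, c):
    """Build rho_ij = sqrt(p_i p_j) * c_ij from normalized densities p
    and a symmetric similarity matrix c with unit diagonal."""
    p = np.asarray(p, dtype=float)
    return np.sqrt(np.outer(p, p)) * np.asarray(c, dtype=float)

def density_matrix_diversity(p, c, q):
    """(1/(1-q)) Tr[rho^q] for q != 1, computed via the eigenvalues of rho.
    Assumes c (and hence rho) is positive semidefinite."""
    eig = np.linalg.eigvalsh(density_matrix(p, c))
    eig = np.clip(eig, 0.0, None)  # guard against tiny negative round-off
    return (eig ** q).sum() / (1.0 - q)

# Sanity check: for completely dissimilar species (c = I), rho is diagonal
# and the quantity reduces to the classical (1/(1-q)) * sum_i p_i^q.
p = [0.5, 0.3, 0.2]
assert np.isclose(density_matrix_diversity(p, np.eye(3), 2),
                  (0.5**2 + 0.3**2 + 0.2**2) / (1 - 2))
```

Computing via `eigvalsh` rather than a matrix power makes non-integer $q$ well-defined, which is exactly where the positive-semidefiniteness constraint discussed below comes in.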

The **good part** is that it is an *entropy* (an entropy of *eigenpopulations*?). And it has all the desired properties:

- for dissimilar species it is just the entropy of the species,
- if two species are the same (i.e. $c_{ij}=1$ and $c_{ik}=c_{jk}$ for all $k$), then they are equivalent to a single species,
- decreasing similarity increases entropy,
- mixing two populations yields higher entropy than the average of their entropies.

Additionally, if we drop the assumptions that $c_{ii}=1$ and $c_{ij}\leq 1$, then we can naturally incorporate comparisons between different kinds of living creatures (is a single bacterium worth more than a single elephant? I guess not).

In contrast, ${}^q D^Z$ is less an entropy and more a Rényi divergence (see arXiv:1206.2459) between the probabilities $(p_i)_i$ and $(p_i / (Z p)_i )_i$. However, to me the intuition behind this formula is not clear (except that it has some desired properties... but so does the density matrix entropy).

The **bad part** is that there are constraints on $c_{ij}$: for the density matrix to be positive semidefinite for every choice of $p_i$, the matrix $c$ itself needs to be positive semidefinite.
(At least to be able to work with real $q$; for natural $q$ one can relax this requirement.)

But maybe it's not a bug but a feature - one of the weaknesses of ${}^q D^Z$ is the lack of constraints (or even guidelines) in the choice of the similarity matrix $Z$. In the case of the density matrix entropy, a natural choice of similarity is one arising from a Gram matrix.

(As a side note, yet another approach would be to use a confusion matrix and mutual information; that way we state explicitly that not only is *information in the mind of the beholder*, but so is similarity - see this SMBC strip.)

Any comments?

(One of my problems is that I have no idea for an example where ${}^q D^Z$ and the density matrix diversity are qualitatively different, especially given the freedom in the choice of the similarities. Any pointers?)

## Comments

This matrix has a negative eigenvalue, so I guess this is an example where they're qualitatively different.

~~~~
     [,1] [,2] [,3] [,4]
[1,] 1.00 0.91 0.95 0.99
[2,] 0.91 1.00 0.93 0.98
[3,] 0.95 0.93 1.00 0.95
[4,] 0.99 0.98 0.95 1.00
~~~~
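The claim can be verified numerically; a quick check (mine, not part of the original comment):

```python
import numpy as np

# The similarity matrix from the comment above.
c = np.array([[1.00, 0.91, 0.95, 0.99],
              [0.91, 1.00, 0.93, 0.98],
              [0.95, 0.93, 1.00, 0.95],
              [0.99, 0.98, 0.95, 1.00]])

# The determinant is negative, so an odd number of eigenvalues is negative;
# hence Tr[rho^q] built from this c need not be real for non-integer q.
assert np.linalg.det(c) < 0
assert np.linalg.eigvalsh(c).min() < 0
```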

@Graham Jones - I don't see how it helps. Surely, one gets different numerical values.

But I am interested in whether there are any *properties* of $\exp(H_q(\rho))$ that do not hold for ${}^q D^Z$, or vice versa. Especially properties with some practical applications or interpretations (though not only). And I am mostly interested in restricting the matrices to symmetric ones with non-negative entries and non-negative eigenvalues.

Perhaps I misunderstood what you said about the 'bad part' (or misunderstood something else, of course). I thought that if $c$ had a negative eigenvalue, then
$$ \frac{1}{1-q} \text{Tr} \left[ \rho^q \right] $$
would not always be real-valued, while Leinster and Cobbold's definition is always real-valued. I'd call that a 'property' rather than a numerical difference.

From a biological point of view, there is a 'natural' choice of $Z$ (or $c$) via the age of the most recent common ancestor, the [TMRCA](http://en.wikipedia.org/wiki/TMRCA).

Example:

~~~~
h c g
\ / /
\ / /
\/ /
 \/
~~~~

TMRCA(h,c) = 3, TMRCA(h,g) = 4, TMRCA(g,c) = 4

TMRCA is an [ultrametric](http://en.wikipedia.org/wiki/Ultrametric). If we make a similarity matrix from TMRCAs in a sensible way, we will have

$$Z_{ii} = 1,$$
$$Z_{ij} \in [0,1) \quad (i \neq j),$$
$$Z_{ij} = Z_{ji},$$
$$Z_{ij} \geq \min(Z_{jk}, Z_{ik}).$$

I wonder if such a matrix is always positive semidefinite?

The answer to that question seems to be yes. Something stronger is true, if you believe the following.

For any matrix $X$, let $\min(X)$ be the smallest element in $X$. For any vector $x$, let $s(x)$ be the sum of the elements in $x$.

Theorem: If $Z$ satisfies the properties above, then for any vector $w$ of the right dimension, $w^T Z w \geq \min(Z)s(w)^2$.

Sketch proof: First apply Theorem 17.1.1 in *Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology* by Dan Gusfield. This implies a particular recursive structure for $Z$: it can be decomposed like

~~~~
A *
* B
~~~~

where $A$ and $B$ are square matrices with the same ultrametric properties as $Z$, and all the entries in the rectangles marked * are equal to $x$, say, where $x = \min(Z)$. Split $w$ into $u$ and $v$ to match the dimensionalities of $A$ and $B$, and then (using induction)

$$w^T Z w = u^T Au + 2 x s(u)s(v) + v^T Bv \geq x(s(u)+s(v))^2 = \min(Z)s(w)^2 .$$

Sorry to hijack your thread. You can go back to talking about quantum entropy now!
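A numerical spot-check of both the positive-semidefiniteness question and the stronger bound, using the three-species TMRCA example above. The similarity map $Z_{ij} = \exp(-\mathrm{TMRCA}(i,j))$ is my choice of a "sensible way"; the theorem only requires the ultrametric properties listed:

```python
import numpy as np

# TMRCAs for h, c, g from the example: TMRCA(h,c)=3, TMRCA(h,g)=TMRCA(c,g)=4.
T = np.array([[0, 3, 4],
              [3, 0, 4],
              [4, 4, 0]], dtype=float)
Z = np.exp(-T)  # ultrametric TMRCA -> similarity, with Z_ii = 1

# Positive semidefiniteness (up to round-off):
assert np.linalg.eigvalsh(Z).min() >= -1e-12

# The stronger bound w^T Z w >= min(Z) * s(w)^2, for random vectors w:
rng = np.random.default_rng(0)
for _ in range(1000):
    w = rng.normal(size=3)
    assert w @ Z @ w >= Z.min() * w.sum() ** 2 - 1e-9
```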

You can naturally[1] create a scalar-product similarity based on the Time to the Most Recent Common Ancestor. The recipe is the following:

- select a time scale $t_0$,
- for two species, the similarity is the overlap of their branches in the evolutionary tree, weighted by some decaying function of time.

That is,

$$ c_{ij}(t_0) = \tfrac{1}{t_0} \int_{0}^{\infty} w(t/t_0) [i(t) = j(t)] dt, $$

which can be expressed as a scalar product (the vector components are lines in the phylogenetic tree), where $w(x)$ can be (for example):

- the indicator function on $[0,1]$ (i.e. we weight time uniformly but set a cut-off at time $t=t_0$),
- exponential decay $\exp(-x)$.

Given it is a tree, not any other graph, we have:

- for the indicator function:
$$ c_{ij}(t_0) = \max\left(0, 1-\tfrac{TMRCA(i,j)}{t_0}\right), $$
- for the exponential function:
$$ c_{ij}(t_0) = \exp\left(-\tfrac{TMRCA(i,j)}{t_0}\right), $$
- and in general:
$$ c_{ij}(t_0) = \tfrac{1}{t_0} \int_{TMRCA(i,j)}^{\infty} w(t/t_0) dt. $$

[1] The word *natural* is tricky. However, up to the choice of $w(x)$ and $t_0$, I have no idea of any other simple and intuitive function from TMRCA to scalar-product similarities.

As for hijacking the thread: well, I had another question in mind (i.e. asking about differences between the two approaches, beyond the ones I already know about), but I've found your "interruption" stimulating. :) (Hence this answer!)
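The recipe above can be sketched in code; this is my implementation (function names are mine), checking that the general integral formula reproduces the two closed forms:

```python
import math

def c_indicator(tmrca, t0):
    """Closed form for w = indicator function on [0,1]."""
    return max(0.0, 1.0 - tmrca / t0)

def c_exponential(tmrca, t0):
    """Closed form for w(x) = exp(-x)."""
    return math.exp(-tmrca / t0)

def c_general(tmrca, t0, w, upper=50.0, n=100000):
    """(1/t0) * integral_{TMRCA}^{infinity} w(t/t0) dt, via the midpoint
    rule with the upper limit truncated at `upper` (an approximation)."""
    h = (upper - tmrca) / n
    return sum(w((tmrca + (k + 0.5) * h) / t0) for k in range(n)) * h / t0

t0 = 5.0
for T in (0.0, 2.0, 4.0, 7.0):
    assert abs(c_general(T, t0, lambda x: 1.0 if x <= 1.0 else 0.0)
               - c_indicator(T, t0)) < 1e-3
    assert abs(c_general(T, t0, lambda x: math.exp(-x))
               - c_exponential(T, t0)) < 1e-3
```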

I was thinking along the same lines for the mapping of TMRCA(i,j) to $c_{ij}$. For the indicator function, one possibility for $t_0$ would be the TMRCA of all extant species, maybe about 3 billion years (which is more recent than the origin of life). For the exponential, maybe the rate at which the total number of species has grown (approximately exponentially) could be used. That would give $t_0$ around 200 million years. (Very roughly, the number of species goes up by a factor of $e$ every 200 My.)

Thanks for the comment about a default value of $t_0$ (though I considered it a parameter).

Now I see that there is one more nice choice for the weight, $w(x)=\delta(x-1)$ - i.e. we treat species as indistinguishable iff $\text{TMRCA}(i,j) \leq t_0$. This way, within the density matrix framework, we recover the result from:

- Anne Chao, Chun-Huo Chiu, Lou Jost, [Phylogenetic Diversity Measures Based on Hill Numbers](http://dx.doi.org/10.1098/rstb.2010.0272), Philosophical Transactions B (2010).
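The "indistinguishable below $t_0$" choice can be checked directly: with $c_{ij} = [\text{TMRCA}(i,j) \leq t_0]$, merged species behave exactly like a single species in the density matrix framework. A small sketch (my code, assuming the notation above):

```python
import numpy as np

def diversity(p, c, q):
    """(1/(1-q)) Tr[rho^q] with rho_ij = sqrt(p_i p_j) c_ij."""
    rho = np.sqrt(np.outer(p, p)) * c
    eig = np.clip(np.linalg.eigvalsh(rho), 0.0, None)
    return (eig ** q).sum() / (1.0 - q)

# Species 1 and 2 have TMRCA <= t0 (so c_12 = 1); species 3 is distinct.
c3 = np.array([[1.0, 1.0, 0.0],
               [1.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]])
p3 = np.array([0.3, 0.2, 0.5])

# Merging the indistinguishable pair into one species of density 0.5:
p2 = np.array([0.5, 0.5])
c2 = np.eye(2)

for q in (0.5, 2.0, 3.0):
    assert np.isclose(diversity(p3, c3, q), diversity(p2, c2, q))
```

The merged pair contributes a single nonzero eigenvalue (the block $\sqrt{p_i p_j}$ has rank one), which is why the two computations agree for every $q$.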

Piotr wrote:

> While I like the paper, it seems to me that it would be more *natural* to use entropy of density matrix...

I hope you noticed that Cobbold and Leinster prove a theorem saying that their quantity ${}^{q}D^Z$ is the only quantity (or more precisely family of quantities, since it depends on $q$) with a few properties they consider reasonable. That's the main point of their paper, I believe.

John wrote:

> I hope you noticed that Cobbold and Leinster prove a theorem saying that their quantity ${}^q D^Z$ is the only quantity (or more precisely family of quantities, since it depends on $q$) with a few properties they consider reasonable.

I saw theorems proving particular properties of ${}^q D^Z$, which were considered desirable for a diversity measure (plus notes that for particular values of $q$, $Z$, or transformations of the quantity, we recover already known entropies or diversity measures).

However, I've failed to see a claim that it is the only quantity satisfying a set of assumptions (let alone the proof). I hope I haven't missed the main point. Could you point to a theorem saying that?

Piotr wrote:

> However, I've failed to see a claim that it is the only quantity satisfying a set of assumptions (let alone the proof).

Whoops!

I'm sorry. I just *assumed* they proved a characterization theorem, since Tom likes to prove such theorems for related quantities, such as diversity measures, Rényi entropies and the like. See these two posts for how all these concepts are connected:

- [Entropy, Diversity and Cardinality (Part 1)](http://golem.ph.utexas.edu/category/2008/10/entropy_diversity_and_cardinal.html)
- [Entropy, Diversity and Cardinality (Part 2)](http://golem.ph.utexas.edu/category/2008/11/entropy_diversity_and_cardinal_1.html)

So I'm shocked that they didn't prove such a theorem in this paper, justifying their concept!