
# Quantum entropy for biodiversity

Thanks to this post I read:

While I like the paper, it seems to me that it would be more natural to use the entropy of a density matrix, i.e. $$\frac{1}{1-q} \text{Tr} \left[ \rho^q \right],$$ where $\rho_{ij} = \sqrt{p_i p_j}\, c_{ij}$, the $p_i$ are normalized population densities, and $0 \leq c_{ij} = c_{ji} \leq 1$ (with $c_{ii}=1$) are similarities.

The good part is that it is an entropy (an entropy of eigenpopulations?). And it has all the desired properties:

• for dissimilar species it is just the entropy of the species distribution,
• if two species are the same (i.e. $c_{ij}=1$ and $c_{ik}=c_{jk}$ for all $k$), then they are equivalent to a single species,
• decreasing similarity increases entropy,
• mixing two populations yields higher entropy than the average of their entropies.
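A minimal numerical sketch (Python with NumPy) of the first two properties, using the formula exactly as written above; the abundances and the similarity value 0.4 are made up for illustration:

```python
import numpy as np

def density_matrix(p, c):
    """rho_ij = sqrt(p_i p_j) * c_ij, as defined above."""
    p = np.asarray(p, dtype=float)
    return np.sqrt(np.outer(p, p)) * np.asarray(c, dtype=float)

def H(p, c, q):
    """(1/(1-q)) Tr[rho^q], via the eigenvalues of rho (q != 1)."""
    eigs = np.linalg.eigvalsh(density_matrix(p, c))
    return (eigs[eigs > 1e-12] ** q).sum() / (1 - q)

p, q = [0.5, 0.3, 0.2], 2.0

# Property 1: dissimilar species (c = I) give the plain entropy of p.
assert np.isclose(H(p, np.eye(3), q), sum(pi**q for pi in p) / (1 - q))

# Property 2: two identical species (c_12 = 1, c_13 = c_23) behave
# exactly like a single species with the summed abundance.
c = np.array([[1.0, 1.0, 0.4],
              [1.0, 1.0, 0.4],
              [0.4, 0.4, 1.0]])
merged_c = np.array([[1.0, 0.4],
                     [0.4, 1.0]])
assert np.isclose(H(p, c, q), H([0.8, 0.2], merged_c, q))
```

The merging works because the duplicated row/column only adds a zero eigenvalue, which does not contribute to $\text{Tr}[\rho^q]$.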

Additionally, if we drop the assumptions that $c_{ii}=1$ and $c_{ij}\leq 1$, then we can naturally incorporate comparisons between different kinds of living creatures (is a single bacterium worth as much as a single elephant? I guess not).

In contrast, ${}^q D^Z$ is less an entropy and more of a Rényi divergence (see arXiv:1206.2459) between the probabilities $(p_i)_i$ and $(p_i / (Z p)_i )_i$. However, the intuition behind this formula is not clear to me (except that it has some desired properties... but so does the density-matrix entropy).

The bad part is that there are constraints on $c_{ij}$: for the density matrix to be positive semidefinite for every $p_i$, the matrix $c$ itself needs to be positive semidefinite. (At least, in order to work with arbitrary real $q$; for natural $q$ one can relax this requirement.)

But maybe it's not a bug but a feature - one of the weaknesses of ${}^q D^Z$ is the lack of constraints (or even guidelines) on the choice of the similarity matrix $Z$. In the case of the density-matrix entropy, a natural notion of closeness is one arising from a Gramian matrix.

(As a side note, yet another approach would be to use a confusion matrix and mutual information; that way we make it explicit that not only information but also similarity is in the mind of the beholder - see this SMBC strip.)

(One of my problems is that I have no example where ${}^q D^Z$ and the density-matrix diversity are qualitatively different, especially given the freedom in the choice of similarities. Any pointers?)

1.

This matrix has a negative eigenvalue, so I guess this is an example where they're qualitatively different.

     [,1] [,2] [,3] [,4]
[1,] 1.00 0.91 0.95 0.99
[2,] 0.91 1.00 0.93 0.98
[3,] 0.95 0.93 1.00 0.95
[4,] 0.99 0.98 0.95 1.00
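For concreteness, here is a quick check of that claim (a Python/NumPy sketch; the matrix is copied from the R output above):

```python
import numpy as np

# The similarity matrix printed above.
c = np.array([[1.00, 0.91, 0.95, 0.99],
              [0.91, 1.00, 0.93, 0.98],
              [0.95, 0.93, 1.00, 0.95],
              [0.99, 0.98, 0.95, 1.00]])

eigs = np.linalg.eigvalsh(c)
assert eigs.min() < 0   # the smallest eigenvalue is indeed negative
```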

2.
edited March 2014

@Graham Jones - I don't see how it helps. Sure, one gets different numerical values.

But I am interested in whether there are any properties of $\exp(H_q(\rho))$ that do not hold for ${}^q D^Z$, or vice versa - especially properties with some practical applications or interpretations (though not only those).

And I am mostly interested in restricting to matrices that are symmetric, with non-negative entries and non-negative eigenvalues.

3.

Perhaps I misunderstood what you said about the 'bad part' (or misunderstood something else of course). I thought that if $c$ had a negative eigenvalue, then $$\frac{1}{1-q} \text{Tr} \left[ \rho^q \right]$$ would not always be real-valued, while Leinster and Cobbold's definition is always real-valued. I'd call that a 'property' rather than a numerical difference.
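That property can be demonstrated directly (a Python/NumPy sketch, reusing the 4×4 matrix from comment 1 with uniform abundances; $q = 1/2$ is an arbitrary non-integer choice):

```python
import numpy as np

# Similarity matrix with a negative eigenvalue (comment 1) and
# uniform abundances p_i = 1/4, so rho_ij = sqrt(p_i p_j) c_ij = c_ij / 4.
c = np.array([[1.00, 0.91, 0.95, 0.99],
              [0.91, 1.00, 0.93, 0.98],
              [0.95, 0.93, 1.00, 0.95],
              [0.99, 0.98, 0.95, 1.00]])
p = np.full(4, 0.25)
rho = np.sqrt(np.outer(p, p)) * c

# Tr[rho^q] picks up an imaginary part for non-integer q ...
q = 0.5
eigs = np.linalg.eigvalsh(rho).astype(complex)
trace_q = (eigs ** q).sum()
assert abs(trace_q.imag) > 1e-3

# ... while Leinster and Cobbold's qD^Z = (sum_i p_i (Zp)_i^(q-1))^(1/(1-q))
# stays real, since all entries of Zp are positive.
Zp = c @ p
qDZ = (p @ Zp ** (q - 1)) ** (1 / (1 - q))
assert qDZ > 0
```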

From a biological point of view, there is a 'natural' choice for $Z$ (or $c$) via the age of the most recent common ancestor, the TMRCA.

Example:

h    c  g
 \   / /
  \ / /
   \/ /
    \/


TMRCA(h,c) = 3, TMRCA(h,g) = 4, TMRCA(g,c) = 4

TMRCA is an ultrametric. If we make a similarity matrix from TMRCAs in a sensible way, we will have

$$Z_{ii} = 1,$$ $$Z_{ij} \in [0,1) \quad (i \neq j),$$ $$Z_{ij} = Z_{ji},$$ $$Z_{ij} \geq \min(Z_{jk}, Z_{ik}) \quad \text{for all } k.$$ I wonder if such a matrix is always positive semidefinite?

4.

The answer to that question seems to be yes. Something stronger is true, if you believe the following.

For any matrix $X$, let $\min(X)$ be the smallest element in $X$. For any vector $x$, let $s(x)$ be the sum of the elements in $x$.

Theorem: If $Z$ satisfies the properties above, then for any vector $w$ of the right dimension, $w^T Z w \geq \min(Z)s(w)^2$.

Sketch proof: First apply Theorem 17.1.1 in Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology by Dan Gusfield. This implies a particular recursive structure for $Z$: it can be decomposed like

A  *
*  B


where $A$ and $B$ are square matrices with the same ultrametric properties as $Z$, and all the entries in the rectangles marked * are equal to some $x = \min(Z)$. Split $w$ into $u$ and $v$ to match the dimensions of $A$ and $B$, and then (using induction)

$$w^T Z w = u^T Au + 2 x s(u)s(v) + v^T Bv \geq x(s(u)+s(v))^2 = \min(Z)s(w)^2 .$$ Sorry to hijack your thread. You can go back to talking about quantum entropy now!
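A numerical sanity check of the theorem (a Python/NumPy sketch), with a similarity matrix built from the TMRCA example in comment 3 via the mapping $Z_{ij} = \exp(-\text{TMRCA}(i,j))$ (one arbitrary way to land in $[0,1)$ with $Z_{ii} = 1$):

```python
import numpy as np

# TMRCAs for (h, c, g) from comment 3; TMRCA(i, i) = 0 gives Z_ii = 1.
T = np.array([[0., 3., 4.],
              [3., 0., 4.],
              [4., 4., 0.]])
Z = np.exp(-T)

# Positive semidefinite, as the theorem implies:
assert np.linalg.eigvalsh(Z).min() >= 0

# The stronger bound w^T Z w >= min(Z) * s(w)^2, for arbitrary w:
rng = np.random.default_rng(42)
for _ in range(1000):
    w = rng.normal(size=3)
    assert w @ Z @ w >= Z.min() * w.sum() ** 2 - 1e-12
```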

5.
edited March 2014

You can naturally 1) create a scalar-product similarity based on the Time to the Most Recent Common Ancestor (TMRCA).

The recipe is the following:

• select time scale $t_0$,
• for two species, the similarity is the overlap of their branches in the evolutionary tree, weighted by some decaying function of time.

That is,

$$c_{ij}(t_0) = \tfrac{1}{t_0} \int_{0}^{\infty} w(t/t_0)\, [i(t) = j(t)]\, dt,$$ which can be expressed as a scalar product (the vector components are lines in the phylogenetic tree), where $w(x)$ can be, for example:

• the indicator function on $[0,1]$ (i.e. we weight time uniformly but set a cut-off at time $t=t_0$),
• exponential decay $\exp(-x)$.

Since it is a tree (and not an arbitrary graph), we have:

• for the indicator function:

$$c_{ij}(t_0) = \max\left(0, 1-\tfrac{\text{TMRCA}(i,j)}{t_0}\right),$$

• for the exponential function:

$$c_{ij}(t_0) = \exp\left(-\tfrac{\text{TMRCA}(i,j)}{t_0}\right),$$

• and in general:

$$c_{ij}(t_0) = \tfrac{1}{t_0} \int_{\text{TMRCA}(i,j)}^{\infty} w(t/t_0)\, dt.$$

1) The word natural is tricky. However, up to the choice of $w(x)$ and $t_0$, I have no idea of any other simple and intuitive function from TMRCA to scalar-product similarities.

Regarding hijacking the thread: well, I had another question in mind (namely, asking about differences between the two approaches beyond the ones I already know), but I found your "interruption" stimulating. :) (Hence this answer!)
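These closed forms can be checked against the general integral (a Python sketch using SciPy's `quad`; the values of $t_0$ and of the TMRCA are arbitrary):

```python
import numpy as np
from scipy.integrate import quad

t0, T = 2.0, 1.5  # arbitrary time scale and TMRCA(i, j), with T < t0

# General formula: c_ij(t0) = (1/t0) * integral_T^infinity w(t/t0) dt.
def c_general(w, T, t0, upper=np.inf):
    value, _ = quad(lambda t: w(t / t0) / t0, T, upper)
    return value

# Indicator weight on [0, 1] (integral cut at t0, where w vanishes):
assert np.isclose(c_general(lambda x: 1.0, T, t0, upper=t0),
                  max(0.0, 1 - T / t0))

# Exponential weight w(x) = exp(-x):
assert np.isclose(c_general(lambda x: np.exp(-x), T, t0),
                  np.exp(-T / t0))
```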

6.

I was thinking along the same lines for the mapping of TMRCA(i,j) to $c_{ij}$. For the indicator function, one possibility for $t_0$ would be the TMRCA of all extant species, maybe about 3 billion years (which is more recent than the origin of life). For the exponential, one could use the rate at which the total number of species has grown (approximately exponentially); that would give $t_0$ around 200 million years. (Very roughly, the number of species goes up by a factor of $e$ every 200 My.)

7.
edited March 2014

Thanks for the comment on a default value of $t_0$ (though I considered it a parameter).

Now I see that there is one more nice choice for the weight, $w(x)=\delta(x-1)$ - i.e. we treat species as indistinguishable iff $\text{TMRCA}(i,j) \leq t_0$. This way, within the density matrix framework, we recover the result from:

• Anne Chao, Chun-Huo Chiu, Lou Jost, [Phylogenetic Diversity Measures Based on Hill Numbers](http://dx.doi.org/10.1098/rstb.2010.0272), Philosophical Transactions B (2010)
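A small sketch of this in Python/NumPy, using the (h, c, g) tree from comment 3: with $t_0 = 3.5$, the delta weight makes h and c indistinguishable, and $\text{Tr}[\rho^q]$ coincides with its value for the two clades {h, c} and {g} (the abundances are made up):

```python
import numpy as np

# Delta weight => c_ij = 1 iff TMRCA(i, j) <= t0, else 0.
T = np.array([[0., 3., 4.],
              [3., 0., 4.],
              [4., 4., 0.]])
t0 = 3.5
c = (T <= t0).astype(float)   # h and c merge into one clade

def tr_rho_q(p, c, q):
    rho = np.sqrt(np.outer(p, p)) * c
    eigs = np.linalg.eigvalsh(rho)
    return (eigs[eigs > 1e-12] ** q).sum()

p = np.array([0.5, 0.3, 0.2])   # abundances of h, c, g
q = 2.0

# Same as Tr[rho^q] for clade abundances (0.8, 0.2):
assert np.isclose(tr_rho_q(p, c, q), 0.8**q + 0.2**q)
```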

8.
edited March 2014

Piotr wrote:

> While I like the paper, it seems to me that it would be more *natural* to use entropy of density matrix...

I hope you noticed that Cobbold and Leinster prove a theorem saying that their quantity ${}^{q}D^Z$ is the only quantity (or more precisely family of quantities, since it depends on $q$) with a few properties they consider reasonable. That's the main point of their paper, I believe.

9.

John wrote:

> I hope you noticed that Cobbold and Leinster prove a theorem saying that their quantity ${}^q D^Z$ is the only quantity (or more precisely family of quantities, since it depends on $q$) with a few properties they consider reasonable.

I saw theorems proving particular properties of ${}^q D^Z$ that were considered desirable for a diversity measure (plus notes that for particular values of $q$ or $Z$, or transformations of the quantity, we recover already known entropies or diversity measures).

However, I've failed to see a claim that it is the only quantity satisfying a set of assumptions (let alone the proof). I hope I haven't missed the main point. Could you point to a theorem saying that?

10.
edited April 2014

Piotr wrote:

> However, I've failed to see a claim that it is the only quantity satisfying a set of assumptions (let alone the proof).

Whoops!

I'm sorry. I just *assumed* they proved a characterization theorem, since Tom likes to prove such theorems for related quantities, such as diversity measures, Rényi entropies and the like. See these two posts for how all these concepts are connected:

• [Entropy, Diversity and Cardinality (Part 1)](http://golem.ph.utexas.edu/category/2008/10/entropy_diversity_and_cardinal.html)
• [Entropy, Diversity and Cardinality (Part 2)](http://golem.ph.utexas.edu/category/2008/11/entropy_diversity_and_cardinal_1.html)

So I'm shocked that they didn't prove such a theorem in this paper, justifying their concept!
