It looks like you're new here. If you want to get involved, click one of these buttons!

- All Categories 2.3K
- Chat 499
- Study Groups 18
- Petri Nets 9
- Epidemiology 3
- Leaf Modeling 1
- Review Sections 9
- MIT 2020: Programming with Categories 51
- MIT 2020: Lectures 20
- MIT 2020: Exercises 25
- MIT 2019: Applied Category Theory 339
- MIT 2019: Lectures 79
- MIT 2019: Exercises 149
- MIT 2019: Chat 50
- UCR ACT Seminar 4
- General 67
- Azimuth Code Project 110
- Statistical methods 3
- Drafts 2
- Math Syntax Demos 15
- Wiki - Latest Changes 3
- Strategy 113
- Azimuth Project 1.1K
- - Spam 1
- News and Information 147
- Azimuth Blog 149
- - Conventions and Policies 21
- - Questions 43
- Azimuth Wiki 708

Options

I have not been following the blogs on network theory carefully. However I just saw that in Experiments in Markov chains John is doing something which looks (from a suitable distance) very like Bayesian phylogenetic analysis using a Markov Chain Monte Carlo (MCMC) technique.

There are 15 binary trees (with labelled tips) which represent the possible evolutionary relationships between 4 species. The MCMC technique uses (a deliberately designed) Markov Chain which visits these. One possible set of transitions is given by 'nearest neighbour interchanges'. I have added a picture at the bottom of the page showing the resulting graph of transitions.

## Comments

Another vaguely connected thing is that the, for a directed acyclic graph (which occurs when the nodes come from a

partial order, and probably just needs to be apreorder) the diagram of subsets (as shown on that page) is the Hasse diagram. The diagram tends to be more interesting when some of the elements aren't linked by edges so there's more structure in the Hasse diagram. There are some interesting algorithms one can do on the Hasse diagram, although the size blows up to the unusable very quickly. One area where this comes up is in "incomplete preferences" and to a lesser extent task execution order planning.`Another vaguely connected thing is that the, for a directed acyclic graph (which occurs when the nodes come from a _partial order_, and probably just needs to be a _preorder_) the diagram of subsets (as shown on that page) is the [Hasse diagram](http://en.wikipedia.org/wiki/Hasse_diagram). The diagram tends to be more interesting when some of the elements aren't linked by edges so there's more structure in the Hasse diagram. There are some interesting algorithms one can do on the Hasse diagram, although the size blows up to the unusable very quickly. One area where this comes up is in "incomplete preferences" and to a lesser extent task execution order planning.`

Right now I'm fascinated by a specific Markov chain describing the ways a molecule with this geometry can rearrange itself:

The transition graph is highly symmetric:

This will eventually become a blog post in my networks series, just a fun example of the general theory.

I'm definitely looking for more practical examples from a wide range of subjects - so thanks, Graham!

David wrote:

True! And the Desargues graph is the Hasse diagram coming from just the 2- and 3-element subsets of a 5-order set, partially ordered by inclusion. You can also see it as a Kneser graph.

There's an absurd amount of beautiful mathematical structure in this example - it's even related to grand unified theories in particle physics. I've been trying to get away from overly cute stuff like this, trying to plunge myself into the messy muck of reality, but I'm having some trouble: abstract math keeps chasing after me. In fact, the graph Graham uploaded is related to the 'pentagon identity' in category theory... pretty scary.

`Right now I'm fascinated by a specific Markov chain describing the ways a molecule with this geometry can rearrange itself: <img src = "http://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Trigonal-bipyramidal-3D-balls.png/200px-Trigonal-bipyramidal-3D-balls.png" alt = ""/> The transition graph is highly symmetric: <img src = "http://upload.wikimedia.org/wikipedia/commons/thumb/2/2e/DesarguesGraph.svg/300px-DesarguesGraph.svg.png" alt = ""/> This will eventually become a blog post in my networks series, just a fun example of the general theory. I'm definitely looking for more practical examples from a wide range of subjects - so thanks, Graham! David wrote: > Another vaguely connected thing is that the, for a directed acyclic graph (which occurs when the nodes come from a partial order, and probably just needs to be a preorder) the diagram of subsets (as shown on that page) is the Hasse diagram. True! And the Desargues graph is the Hasse diagram coming from just the 2- and 3-element subsets of a 5-order set, partially ordered by inclusion. You can also see it as a [Kneser graph](http://en.wikipedia.org/wiki/Kneser_graph). There's an absurd amount of beautiful mathematical structure in this example - it's even related to grand unified theories in particle physics. I've been trying to get away from overly cute stuff like this, trying to plunge myself into the messy muck of reality, but I'm having some trouble: abstract math keeps chasing after me. In fact, the graph Graham uploaded is related to the 'pentagon identity' in category theory... pretty scary. <img width = "300" src = "http://www.londonmet.ac.uk/library/x86175_8.jpg" alt = ""/>`

By the way, Graham, I don't know why there are just 15 trees in your picture.

If I keep the 4 species at the tips of the trees in the same order, I get just 5 trees - those are the corners of the pentagon shown above. This is an example of an 'associahedron' . Mathematicians also study it when we replace the number 4 with some other number. Here's the associahedron for trees with 5 tips:

If I allow the 4 species at the tips of the trees to be permuted arbitrarily, I get 5 × 4! = 120 trees. These are the corners of a shape that's an example of a 'permutoassociahedron'.

But you're just allowing 3 permutations, for a total of 5 × 3 = 15 trees.

Oh, I should probably be able to guess this - you're not really working with planar trees, for one thing.

Do you know if biologists have pondered the general situation where the 4 species are replaced by a larger number of species? For mathematicians, this would be a lot of fun: associahedra and permutoassociahedra and their ilk are really fascinating.

`By the way, Graham, I don't know why there are just 15 trees in your picture. If I keep the 4 species at the tips of the trees in the same order, I get just 5 trees - those are the corners of the pentagon shown above. This is an example of an 'associahedron' . Mathematicians also study it when we replace the number 4 with some other number. Here's the associahedron for trees with 5 tips: <img src = "http://faculty.tnstate.edu/sforcey/tamari_sm.gif" alt = ""/> If I allow the 4 species at the tips of the trees to be permuted arbitrarily, I get 5 × 4! = 120 trees. These are the corners of a shape that's an example of a 'permutoassociahedron'. But you're just allowing 3 permutations, for a total of 5 × 3 = 15 trees. Oh, I should probably be able to guess this - you're not really working with planar trees, for one thing. Do you know if biologists have pondered the general situation where the 4 species are replaced by a larger number of species? For mathematicians, this would be a lot of fun: associahedra and permutoassociahedra and their ilk are really fascinating.`

I presume the biologically interesting event is the "thing speciates into X and Y", and in particular I'd imagine there's no difference between that and the event "thing speciates into Y and X" (ie, the internal nodes are "commutative") so some of the 4!=24 permutations are equivalent. I can't immediately think of the way to count those for the 4 element case.

Ok, here's an attempt. There are three "fully symmetrical" trees, which up to equivalences can be generated by "what thing is the closest relative of the tomato" (3 choices), since the unordred pair that goes on the other tips is then determined. Then there's the unbalanced trees, which can be generated by specifying what goes on the two "longest" branches (4 times 3 = 12 choices) since the unordered pair that goes on the final split is then determined. So that gives 15 possibilities.

`I presume the biologically interesting event is the "thing speciates into X and Y", and in particular I'd imagine there's no difference between that and the event "thing speciates into Y and X" (ie, the internal nodes are "commutative") so some of the 4!=24 permutations are equivalent. I can't immediately think of the way to count those for the 4 element case. Ok, here's an attempt. There are three "fully symmetrical" trees, which up to equivalences can be generated by "what thing is the closest relative of the tomato" (3 choices), since the unordred pair that goes on the other tips is then determined. Then there's the unbalanced trees, which can be generated by specifying what goes on the two "longest" branches (4 times 3 = 12 choices) since the unordered pair that goes on the final split is then determined. So that gives 15 possibilities.`

There are a lot of different ways of labelling (and therefore counting) binary trees. For biologists, what matters is the evolutionary relationships. So they don't care about left-right flips of the two subtrees at any node. There are internal three nodes in a tree with 4 tips, so by orienting the tree (giving left, right labels to the subtrees at all nodes) you could turn my 15 trees into $15*2*2*2 = 120$.

Another way of counting trees the biologists' way is to count the number of ways you can combine n things using a binary operation which is commutative but not associative. You can express evolutionary relationships in text like this:

A,(B,(C,D)) = A,(B,(D,C)) = (B,(D,C)),A = ...

(AB),(CD) = (DC),(BA) = ...

Biologists usually work with much more than 4 species. 50 might be typical. I can tell you the number of trees on n tips as biologists count them: $3*5*7*...*(2n-3)$

I can tell you that the nearest neighbour interchange allows you to get from any tree to any other for any n. I don't remember reading about associahedra and permutoassociahedra (or know what they are) but probably someone has done work on them.

I hope I've got all that right, it is easy to get confused. This is a nice article on the maths of biological trees, which I consult when I feel confused:

http://arxiv.org/abs/0803.0153

I also recommend this book: "Inferring phylogenies" by Felsenstein.

`There are a lot of different ways of labelling (and therefore counting) binary trees. For biologists, what matters is the evolutionary relationships. So they don't care about left-right flips of the two subtrees at any node. There are internal three nodes in a tree with 4 tips, so by orienting the tree (giving left, right labels to the subtrees at all nodes) you could turn my 15 trees into $15*2*2*2 = 120$. Another way of counting trees the biologists' way is to count the number of ways you can combine n things using a binary operation which is commutative but not associative. You can express evolutionary relationships in text like this: A,(B,(C,D)) = A,(B,(D,C)) = (B,(D,C)),A = ... (AB),(CD) = (DC),(BA) = ... Biologists usually work with much more than 4 species. 50 might be typical. I can tell you the number of trees on n tips as biologists count them: $3*5*7*...*(2n-3)$ I can tell you that the nearest neighbour interchange allows you to get from any tree to any other for any n. I don't remember reading about associahedra and permutoassociahedra (or know what they are) but probably someone has done work on them. I hope I've got all that right, it is easy to get confused. This is a nice article on the maths of biological trees, which I consult when I feel confused: [http://arxiv.org/abs/0803.0153](http://arxiv.org/abs/0803.0153) I also recommend this book: "Inferring phylogenies" by Felsenstein.`

David,

Glad you noticed the balanced and unbalanced distinction! Your counting looks OK too.

Among the 15 trees here, 12 are unbalanced (3:1 or 1:3 split at the root, where I've now started distinguishing left and right), 3 are balanced (2:2 split at the root). So 80% are unbalanced. If you generate trees using a Yule or standard birth-death process, you find that you get 66.7% unbalanced. 3:1, 2:2, 1:3 are all equally likely.

What about real phylogenetic trees? Turns out to be somewhere between, maybe around 75%. Finding a nice mathematical model which make the right amount of unbalance is a problem that has been bothering me for the last couple if years...

`David, Glad you noticed the balanced and unbalanced distinction! Your counting looks OK too. Among the 15 trees here, 12 are unbalanced (3:1 or 1:3 split at the root, where I've now started distinguishing left and right), 3 are balanced (2:2 split at the root). So 80% are unbalanced. If you generate trees using a Yule or standard birth-death process, you find that you get 66.7% unbalanced. 3:1, 2:2, 1:3 are all equally likely. What about real phylogenetic trees? Turns out to be somewhere between, maybe around 75%. Finding a nice mathematical model which make the right amount of unbalance is a problem that has been bothering me for the last couple if years...`

Thanks, David and Graham!

Okay, got it. Makes perfect sense. I'm sure pure mathematicians must have pondered this, but I mainly know other variants, which have been incredibly important in my work...

The vertices of the $n$th associahedron are the ways of combining $n$ things

written out in orderusing an operation that's not associative. (And not commutative, but that doesn't even come into play.)So, the 4th associahedron has 5 vertices:

$A(B(C D)), A((B C)D), (A B)(C D), (A(B C))D, ((A B)C)D $

which correspond to the trees at the corners of this pentagon:

Ignore the trees at the edges on the edges and tree in the middle, unless you're interested in species that can speciate 3 or 4 at at time! These fancier trees are very important in the associahedron game, but I'm only telling you about how the vertices work.

The 5th associahedron has 14 vertices, like this:

The next one has 42, and so on - these are the "Catalan numbers", which show up all over the place in math. They're great, and I could talk your ear off about them...

But anyway, this stuff is largely irrelevant to phylogenetic trees, except that they're both special case of some bigger theory I only partially understand.

I'll try to learn a bit about the case that matters here.

`Thanks, David and Graham! > Another way of counting trees the biologists' way is to count the number of ways you can combine n things using a binary operation which is commutative but not associative. Okay, got it. Makes perfect sense. I'm sure pure mathematicians must have pondered this, but I mainly know other variants, which have been incredibly important in my work... > I don't remember reading about associahedra ... The vertices of the $n$th associahedron are the ways of combining $n$ things _written out in order_ using an operation that's not associative. (And not commutative, but that doesn't even come into play.) So, the 4th associahedron has 5 vertices: $A(B(C D)), A((B C)D), (A B)(C D), (A(B C))D, ((A B)C)D $ which correspond to the trees at the corners of this pentagon: <img width = "400" src = "http://www.londonmet.ac.uk/library/x86175_8.jpg" alt = ""/> Ignore the trees at the edges on the edges and tree in the middle, unless you're interested in species that can speciate 3 or 4 at at time! These fancier trees are very important in the associahedron game, but I'm only telling you about how the vertices work. The 5th associahedron has 14 vertices, like this: <img width = "400" src = "http://faculty.tnstate.edu/sforcey/tamari_sm.gif" alt = ""/> The next one has 42, and so on - these are the "[Catalan numbers](http://en.wikipedia.org/wiki/Catalan_number)", which show up all over the place in math. They're great, and I could talk your ear off about them... But anyway, this stuff is largely irrelevant to phylogenetic trees, except that they're both special case of some bigger theory I only partially understand. I'll try to learn a bit about the case that matters here.`

I have bumped into the Catalan numbers several times. The ways of counting trees are related like

$3.5.7 ... (2n-3) 2^{n-1} = C_{n-1} n!$

"evolutionary trees when oriented = associahedron vertices when tip labels are permuted."

`I have bumped into the Catalan numbers several times. The ways of counting trees are related like $3.5.7 ... (2n-3) 2^{n-1} = C_{n-1} n!$ "evolutionary trees when oriented = associahedron vertices when tip labels are permuted."`

That's very helpful. I'm not sure what "oriented" means here, though.

`That's very helpful. I'm not sure what "oriented" means here, though.`

An evolutionary tree like A,(B,(C,D)) is the same thing when reflected about each comma, so

A,(B,(C,D)) = A,(B,(D,C)) = (B,(D,C)),A = ...

'Orienting' means specifying a particular choice for each reflection. n species means n-1 commas so $2^{n-1}$ choices.

Maybe it is useful to think of evolutionary trees as trees in the sense of graph theory: an evolutionary tree is a connected acyclic graph in which every vertex has degree one or three, the vertices of degree one are labelled, and one of them is distinguished as the root.

Are you going to say 'forgetful functor' soon?

`An evolutionary tree like A,(B,(C,D)) is the same thing when reflected about each comma, so A,(B,(C,D)) = A,(B,(D,C)) = (B,(D,C)),A = ... 'Orienting' means specifying a particular choice for each reflection. n species means n-1 commas so $2^{n-1}$ choices. Maybe it is useful to think of evolutionary trees as trees in the sense of graph theory: an evolutionary tree is a connected acyclic graph in which every vertex has degree one or three, the vertices of degree one are labelled, and one of them is distinguished as the root. Are you going to say 'forgetful functor' soon?`

I will try to restrain myself.

Thanks, I think I almost get it... but now I can't think out loud without breaking my pledge.

`I will try to restrain myself. Thanks, I think I almost get it... but now I can't think out loud without breaking my pledge.`

There is a Sage graph tutorial which shows the breath of Sage support for both graphs and what you can do with them

I intended my next Sage tutorial to be about graphs and I've started it, but if you want me to address some particular topic more in the tutorial ill listen

`There is a [Sage graph tutorial](http://www-sop.inria.fr/members/Nathann.Cohen/tut/Graphs/) which shows the breath of Sage support for both graphs and what you can do with them I intended my next Sage tutorial to be about graphs and I've started it, but if you want me to address some particular topic more in the tutorial ill listen`

Staffan,

The R code below makes a 15 by 15 matrix which is the adjacency matrix of the graph of evolutionary trees. Perhaps you can make it into a graph in Sage, and do something with it, eg use g.automorphism_group.

`Staffan, The R code below makes a 15 by 15 matrix which is the adjacency matrix of the graph of evolutionary trees. Perhaps you can make it into a graph in Sage, and do something with it, eg use g.automorphism_group. ~~~~ z <- matrix(0, nrow=5, ncol=5) a <- z b <- z for (r in 1:5) { for ( c in 1:5) { d <- (r-c) %% 5 if (d == 1) a[r,c] <- 1 if (d == 2) b[r,c] <- 1 if (d == 3) b[r,c] <- 1 if (d == 4) a[r,c] <- 1 } } abz <- cbind(a,b,z) baz <- cbind(b,z,a) zab <- cbind(z,a,b) M <- rbind(abz, baz, zab) M ~~~~`

Ive added the code to the currently public worksheet, which Ill use as a base when I do the tutorial during this week. There are Desargues graphs,n-cubes, random graphs and After your code I added a KneserGraph(5,2)

`Ive added the [code to the currently public worksheet](), which Ill use as a base when I do the tutorial during this week. There are Desargues graphs,n-cubes, random graphs and After your code I added a KneserGraph(5,2)`