This document should contain all you need to get started measuring tree distances with ‘Quartet’. If you get stuck, please let me know so I can improve this documentation.
Instructions for loading phylogenetic trees into R can be found in a separate vignette. For these examples, we’ll enter two simple trees by hand:
tree1 <- ape::read.tree(text = '(A, ((B, (C, (D, E))), ((F, G), (H, I))));')
tree2 <- ape::read.tree(text = '(A, ((B, (C, (D, (H, I)))), ((F, G), E)));')
We can calculate distances between pairs of trees using the ‘Quartet’ package.
First we’ll install the package. We can either install the stable version from the CRAN repository:
install.packages('Quartet')
or the development version, from GitHub – which will contain the latest features but may not be as extensively tested:
devtools::install_github('ms609/Quartet')
Then we’ll load the package into R’s working environment:
library('Quartet')
Now the package’s functions are available within R. Let’s proceed to calculate some tree distances.
Calculating the distance between two trees is a two-stage process. For a quartet distance, we first have to calculate the status of each quartet:
statuses <- QuartetStatus(tree1, tree2)
Then we convert these counts into a distance metric (or similarity measure) that suits our needs – perhaps the Quartet Divergence:
QuartetDivergence(statuses, similarity = FALSE)
## [1] 0.6031746
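To make the second stage concrete, here is a minimal base-R sketch of how quartet counts might be turned into a divergence by hand. It assumes the divergence is (2d + r1 + r2) / 2Q, which is my reading of the package documentation rather than something stated here. The helper `divergence_from_counts` is hypothetical, and the counts are inferred rather than printed by the package: both trees are fully resolved, so r1 = r2 = u = 0, and the distance above then implies that 76 of the 126 possible quartets differ.

```r
# Hypothetical helper, not part of 'Quartet': distance from a named vector
# of counts, assuming divergence = (2d + r1 + r2) / 2Q
divergence_from_counts <- function(counts) {
  differing <- 2 * counts[["d"]] + counts[["r1"]] + counts[["r2"]]
  differing / (2 * counts[["Q"]])
}

# Assumed counts for tree1 vs tree2 (nine leaves, so choose(9, 4) quartets);
# s and d are inferred from the printed distance, not read from the package
counts <- c(Q = choose(9, 4), s = 50, d = 76, r1 = 0, r2 = 0, u = 0)

# Every quartet must fall into exactly one category
stopifnot(sum(counts[c("s", "d", "r1", "r2", "u")]) == counts[["Q"]])

distance <- divergence_from_counts(counts)
similarity <- 1 - distance
round(c(distance = distance, similarity = similarity), 7)
```

Under these assumptions the distance rounds to 0.6031746, matching the value printed above.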
We can calculate all similarity metrics at once using:
SimilarityMetrics(statuses, similarity = TRUE)
## DoNotConflict ExplicitlyAgree StrictJointAssertions
## [1,] 0.3968254 0.3968254 0.3968254
## SemiStrictJointAssertions SymmetricDifference MarczewskiSteinhaus
## [1,] 0.3968254 0.3968254 0.2475248
## SteelPenny QuartetDivergence SimilarityToReference
## [1,] 0.3968254 0.3968254 0.3968254
It can be instructive to visualize how each split in the tree is contributing to the quartet similarity:
VisualizeQuartets(tree1, tree2)
Rather than using quartets, we might want to use partitions (splits) as the basis of our comparison:
SimilarityMetrics(SplitStatus(tree1, tree2))
## DoNotConflict ExplicitlyAgree StrictJointAssertions
## [1,] 0.3333333 0.3333333 0.3333333
## SemiStrictJointAssertions SymmetricDifference MarczewskiSteinhaus
## [1,] 0.3333333 0.3333333 0.2
## SteelPenny QuartetDivergence SimilarityToReference
## [1,] 0.3333333 0.3333333 0.3333333
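The figure of 0.3333333 can be checked by hand. The sketch below uses base R only and does not call 'Quartet'. The splits are my own reading of the two Newick strings, with each non-trivial split written as the side that excludes leaf A, and the helper `key` is hypothetical.

```r
# Non-trivial splits of tree1, each written as the side excluding leaf A
splits1 <- list(c("D", "E"), c("C", "D", "E"), c("B", "C", "D", "E"),
                c("F", "G"), c("H", "I"), c("F", "G", "H", "I"))

# Non-trivial splits of tree2, written the same way
splits2 <- list(c("H", "I"), c("D", "H", "I"), c("C", "D", "H", "I"),
                c("B", "C", "D", "H", "I"), c("F", "G"), c("E", "F", "G"))

# Hypothetical helper: a canonical text label for a split
key <- function(split) paste(sort(split), collapse = "")

# Splits present in both trees
shared <- intersect(vapply(splits1, key, ""), vapply(splits2, key, ""))
shared

# Proportion of tree1's splits that also occur in tree2
length(shared) / length(splits1)
```

Only the splits FG and HI occur in both trees, so two of the six splits in each tree are shared, which agrees with the 0.3333333 reported for most of the metrics above.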
If you have more than two trees to compare, you can send a list of trees (class: list or multiPhylo) to the distance comparison function.
You can calculate the similarity between one tree and a forest of other trees:
library('TreeTools', quietly = TRUE, warn.conflicts = FALSE)
oneTree <- CollapseNode(as.phylo(0, 11), 14)
twoTrees <- structure(list(bal = BalancedTree(11), pec = PectinateTree(11)),
class = 'multiPhylo')
status <- SharedQuartetStatus(twoTrees, cf = oneTree)
QuartetDivergence(status)
## bal pec
## 0.4939394 0.6272727
Or between the first tree in a forest and every tree in that forest (itself included):
forest <- as.phylo(0:5, 11)
names(forest) <- letters[1:6]
status <- SharedQuartetStatus(forest)
QuartetDivergence(status)
## a b c d e f
## 1.0000000 0.9757576 0.9757576 0.9333333 0.9121212 0.9333333
Or between each pair of trees in a forest:
status <- ManyToManyQuartetAgreement(forest)
QuartetDivergence(status, similarity = FALSE)
## a b c d e f
## a 0.00000000 0.02424242 0.02424242 0.06666667 0.08787879 0.06666667
## b 0.02424242 0.00000000 0.02424242 0.08787879 0.06666667 0.06666667
## c 0.02424242 0.02424242 0.00000000 0.08484848 0.08484848 0.04242424
## d 0.06666667 0.08787879 0.08484848 0.00000000 0.04242424 0.04242424
## e 0.08787879 0.06666667 0.08484848 0.04242424 0.00000000 0.04242424
## f 0.06666667 0.06666667 0.04242424 0.04242424 0.04242424 0.00000000
Or between one list of trees and a second:
status <- TwoListQuartetAgreement(forest[1:4], forest[5:6])
QuartetDivergence(status, similarity = FALSE)
## e f
## a 0.08787879 0.06666667
## b 0.06666667 0.06666667
## c 0.08484848 0.04242424
## d 0.04242424 0.04242424
‘Quartet’ can compare trees of different sizes or with non-identical sets of taxa. Quartets pertaining to a leaf that does not occur in one tree are treated as unresolved:
treeAG <- PectinateTree(letters[1:7])
treeBI <- PectinateTree(letters[2:9])
treeEJ <- PectinateTree(letters[5:10])
par(mfrow = c(1, 3), mar = rep(0.3, 4), cex = 1)
plot(treeAG); plot(treeBI); plot(treeEJ)
QuartetState(letters[1:4], treeAG) # 3: C is closest to D
## [1] 3
QuartetState(letters[1:4], treeBI) # 0: unresolved in this tree
## [1] 0
# Calculate status for all leaves observed in trees: here, A..I
QuartetStatus(treeAG, treeBI, nTip = TRUE)
## N Q s d r1 r2 u
## [1,] 252 126 15 0 20 55 36
# Calculate status for specified number of leaves
# Here, we have ten taxa A..J, but J does not occur in either of these trees
QuartetStatus(treeAG, treeBI, nTip = 10)
## N Q s d r1 r2 u
## [1,] 420 210 15 0 20 55 120
# Compare a list of trees with different numbers of leaves to a reference
QuartetStatus(c(treeAG, treeBI, treeEJ), cf = treeAG, nTip = TRUE)
## N Q s d r1 r2 u
## [1,] 420 210 35 0 0 0 175
## [2,] 420 210 15 0 55 20 120
## [3,] 420 210 0 0 15 35 160
# Compare all pairs of trees in a list.
# "u" shows how many possible quartets are unresolved in both trees
ManyToManyQuartetAgreement(c(treeAG, treeBI, treeEJ), nTip = TRUE)[, , "u"]
## [,1] [,2] [,3]
## [1,] 175 120 160
## [2,] 120 140 130
## [3,] 160 130 195
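The counts reported for treeAG and treeBI can be reproduced with a little combinatorics, which may help in reading the N, Q, s, d, r1, r2 and u columns. This is a sketch in base R, not how the package computes them. The helper `expected_counts` is hypothetical, and it relies on the assumption that both trees are pectinate and list their shared leaves in the same order, so no quartet can be resolved differently (d = 0).

```r
leaves_AG <- letters[1:7]   # leaves of treeAG
leaves_BI <- letters[2:9]   # leaves of treeBI

# Hypothetical helper: expected counts for two fully resolved trees that
# agree on every quartet of their shared leaves
expected_counts <- function(leaves1, leaves2, n_tip) {
  Q  <- choose(n_tip, 4)                                # all possible quartets
  s  <- choose(length(intersect(leaves1, leaves2)), 4)  # resolved in both
  r1 <- choose(length(leaves1), 4) - s                  # resolved in tree 1 only
  r2 <- choose(length(leaves2), 4) - s                  # resolved in tree 2 only
  c(N = 2 * Q, Q = Q, s = s, d = 0, r1 = r1, r2 = r2,
    u = Q - s - r1 - r2)                                # resolved in neither
}

# All leaves observed in either tree (a..i), as with nTip = TRUE
n_observed <- length(union(leaves_AG, leaves_BI))
expected_counts(leaves_AG, leaves_BI, n_observed)

# Ten leaves, as with nTip = 10
expected_counts(leaves_AG, leaves_BI, 10)
```

Adding a tenth leaf changes only Q, N and u: the extra quartets all involve a leaf that appears in neither tree, so they count as unresolved.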
To calculate how many quartets are unique to a certain tree (akin to the partitionwise equivalent ape::prop.clades), use:
interestingTree <- as.phylo(42, 7)
referenceTrees <- list(BalancedTree(7), PectinateTree(7))
status <- CompareQuartetsMulti(interestingTree, referenceTrees)
status['x_only'] = 23 quartets are resolved in a certain way in interestingTree, but not resolved that way in any referenceTrees.
You may wish to:
Read more about Quartet distances
Review alternative distance measures and corresponding functions
Interpret or contextualize tree distance metrics