Here we’ll use the ex_counts feature table included with
ecodive. It contains the number of observations of each bacterial genus
in each sample. In the text below, you can replace the word ‘genera’
with the feature of interest in your own data.

Beta diversity is a measure of how different two samples are.
Looking at the counts matrix above, you can easily see
that saliva and gums are similar, while saliva and stool are different.
The metrics described below quantify that difference, referred
to as the “distance” or “dissimilarity” between a pair of samples. For
bounded metrics such as Bray-Curtis, the distance is 0 for identical
samples and 1 for completely different samples.
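
As a concrete illustration of a distance that runs from 0 to 1, here is a minimal base R sketch of the textbook Bray-Curtis formula. The counts are made up (they are not taken from ex_counts), and bray_curtis_pair() is a hypothetical helper written for this example. It is not ecodive's implementation, which may preprocess its input differently.

```r
# Hypothetical counts for two samples across four genera (not from ex_counts).
x <- c(10, 0, 5, 5)
y <- c(0, 10, 5, 5)

# Textbook Bray-Curtis dissimilarity for one pair of samples:
# sum of absolute count differences divided by the sum of all counts.
bray_curtis_pair <- function(a, b) sum(abs(a - b)) / sum(a + b)

bray_curtis_pair(x, x)                # identical samples
#> [1] 0
bray_curtis_pair(x, y)                # half of the counts are shared
#> [1] 0.5
bray_curtis_pair(c(10, 0), c(0, 10))  # no genera in common
#> [1] 1
```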
The classic algorithms all run in weighted mode by default.
Specifying weighted = FALSE,
e.g. canberra(counts, weighted = FALSE), switches them to
unweighted mode.
Classic algorithms: bray_curtis(), canberra(),
euclidean(), gower(), jaccard(),
kulczynski(), manhattan()

For the UniFrac algorithms, unweighted_unifrac() is
unweighted and all the others are weighted.
Unweighted: unweighted_unifrac()
Weighted: weighted_unifrac(),
weighted_normalized_unifrac(),
generalized_unifrac(),
variance_adjusted_unifrac()
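
To show why the mode matters, the base R sketch below compares a distance computed on raw counts with the same distance computed on presence/absence data. It assumes that "unweighted" means reducing each count to present or absent before measuring the distance, which is the usual convention but is not confirmed here for ecodive. The counts and the manhattan_pair() helper are hypothetical.

```r
# Hypothetical counts for two samples across four genera (not from ex_counts).
x <- c(100, 1, 0, 3)
y <- c(1, 100, 2, 0)

# Plain Manhattan distance for one pair of samples.
manhattan_pair <- function(a, b) sum(abs(a - b))

# Weighted: abundances drive the result (99 + 99 + 2 + 3).
manhattan_pair(x, y)
#> [1] 203

# Unweighted (assumed to mean presence/absence): only the two genera
# that are present in one sample and absent from the other contribute.
manhattan_pair(x > 0, y > 0)
#> [1] 2
```

In weighted mode the two dominant genera account for almost all of the distance. In the presence/absence version they contribute nothing, because both samples contain them.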
The default value of pairs=NULL in ecodive’s beta
diversity functions results in the returned all-vs-all distance matrix
being completely filled in.

bray_curtis(counts)
#>          Saliva      Gums      Nose
#> Gums  0.4260870
#> Nose  0.9797101 0.9826087
#> Stool 0.9884058 0.9884058 0.9913043

If you are doing a reference-vs-all comparison, you can use the
pairs parameter to skip unwanted calculations and save some
CPU time. The larger the dataset, the more noticeable the improvement
will be.
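
To put rough numbers on the savings, the base R sketch below counts how many pairwise distances each approach needs. The sample sizes are arbitrary illustrations, and the counts say nothing about actual run time, which also depends on the metric and the number of features.

```r
n <- c(4, 100, 1000)         # hypothetical numbers of samples

all_vs_all <- choose(n, 2)   # every unordered pair: n * (n - 1) / 2
ref_vs_all <- n - 1          # one reference against each other sample

all_vs_all                   # 6, 4950, 499500
ref_vs_all                   # 3, 99, 999
```

With the four samples used here, a reference-vs-all comparison keeps 3 of the 6 pairings.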

bray_curtis(counts, pairs = 1:3)
#>          Saliva      Gums      Nose
#> Gums  0.4260870
#> Nose  0.9797101        NA
#> Stool 0.9884058        NA        NA

The pairs argument can be:
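a numeric vector of pairing indices, e.g. 1:3;
a logical vector with one element per pairing; or
a function(i, j) that returns whether columns
i and j should be compared.

Therefore, all of the following are equivalent: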
bray_curtis(counts, pairs = 1:3)
bray_curtis(counts, pairs = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE))
bray_curtis(counts, pairs = function(i, j) i == 1)

The ordering of pairs follows the pairings produced by
combn().

# Column index pairings
combn(ncol(counts), 2)
#>      [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,]    1    1    1    2    2    3
#> [2,]    2    3    4    3    4    4

# Sample name pairings
combn(colnames(counts), 2)
#>      [,1]     [,2]     [,3]     [,4]   [,5]    [,6]
#> [1,] "Saliva" "Saliva" "Saliva" "Gums" "Gums"  "Nose"
#> [2,] "Gums"   "Nose"   "Stool"  "Nose" "Stool" "Stool"

So, for instance, to use gums as the reference sample: