The hardware and bandwidth for this mirror is donated by METANET, the Webhosting and Full Service-Cloud Provider.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]metanet.ch.

Importing Genetic Data into Coala

While coala primiary focuses on simulation of data, it also offers to calculate summary statsitcs from real data. This is particularly useful when comparing the statistics of real and simulated data.

Rather than offering functions to import data directly, coala can convert the internal formats of other R packages into its own format. Currently, the PopGenome package is supported, but we plan to support ape and pegas in the future.

Importing Data from PopGenome

PopGenome provides functions for reading various data formats, including vcf and fasta. Please refer to its documentation for detailed instructions. As an example, we will read sequence data from a short fasta file that is included in coala:

suppressPackageStartupMessages(library(PopGenome))
fasta <- system.file("example_fasta_files", package = "coala")
data_pg <- readData(fasta, progress_bar_switch = FALSE)
data_pg <- set.outgroup(data_pg, c("Individual_Out-1", "Individual_Out-2"))
individuals <- list(paste0("Individual_1-", 1:5), paste0("Individual_2-", 1:5))
data_pg <- set.populations(data_pg, individuals)

Here the sequences originate from two population and an outgroup. The outgroup is required for most summary statistics.

We can now convert data_pg using the as.segsites function:

library(coala)
segsites <- as.segsites(data_pg)

Next, we calculate summary statistics using calc_sumstats_from_data:

model <- coal_model(c(5, 5, 2), 1, 25) + 
  feat_mutation(5) +
  feat_outgroup(3) +
  sumstat_sfs(population = 1)
stats <- calc_sumstats_from_data(model, segsites)
stats$sfs

Alternatively, it is also possible to pass the data_pg object directly to calc_sumstats_from_data:

stats <- calc_sumstats_from_data(model, data_pg)
stats$sfs

Please refer to the help pages for as.segsites and calc_sumstats_from_data for additional information.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.