Circular genome clustering

Tathagata Debnath and Joe Song

Updated: 2021-07-27; 2020-11-29; 2020-09-05. Created: 2020-08-07

Optimal versus heuristic cluster borders on CpG sites of a circular bacterial genome

The fast optimal circular clustering (FOCC) algorithm (Debnath and Song 2021) and the heuristic repeated \(K\)-means circular clustering (HEUC) algorithm are applied to the CpG sites of the Candidatus Carsonella ruddii genome (GenBank accession number CP019943.1). Both algorithms group the CpG sites into 14 clusters, as shown in the figure below.
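The input to both algorithms is the set of CpG positions on the circular genome. The following is a minimal Python sketch of how such positions could be collected, assuming the genome is available as a plain string of bases. The function name `cpg_sites` is hypothetical and is not part of the OptCirClust package. Because the genome is circular, a C at the last base followed by a G at the first base also counts as a site.

```python
def cpg_sites(genome: str) -> list[int]:
    """Return 0-based positions of the C in every CG dinucleotide
    of a circular sequence (hypothetical helper for illustration)."""
    seq = genome.upper()
    n = len(seq)
    # CG dinucleotides that lie entirely inside the linear string
    sites = [i for i in range(n - 1) if seq[i] == "C" and seq[i + 1] == "G"]
    # On a circle, the last base is adjacent to the first base
    if n > 1 and seq[-1] == "C" and seq[0] == "G":
        sites.append(n - 1)
    return sites


# Toy sequence, not real genome data
print(cpg_sites("GACGTTCGAC"))  # [2, 6, 9]
```

The third site in the toy example exists only because of the wrap-around, which is exactly the case a linear scan would miss.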

The clusters obtained by the FOCC algorithm are more compact and easier to justify than those obtained by HEUC. The border between clusters C8 and C9 in the optimal clustering is subjectively more justifiable than the border between clusters C4 and C8 in the heuristic clustering. Orange arrows inside the circular genome point to these cluster borders. A fixed seed for random number generation is used so that \(K\)-means always returns the same result.
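To make the notion of an optimal circular clustering concrete, the sketch below exhaustively tries every way of cutting a circle into \(K\) contiguous arcs and keeps the one with the smallest total within-cluster sum of squares. It is a brute-force illustration in Python, not the FOCC algorithm and not the OptCirClust API. The function names and the toy data are hypothetical, and the objective (sum of squared deviations from the arc mean on unrolled coordinates) is an assumption made for this example.

```python
from itertools import combinations


def arc_ssq(values):
    """Sum of squared deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def brute_force_circular_clusters(points, circumference, k):
    """Exhaustively search all splits of the circle into k non-empty arcs.

    Returns (best_ssq, clusters). Coordinates of points that wrap past
    the origin are reported shifted by one circumference.
    Requires 1 <= k <= len(points). Feasible for small inputs only.
    """
    pts = sorted(points)
    n = len(pts)
    # Unroll the circle twice so an arc crossing the origin is one slice
    unrolled = pts + [p + circumference for p in pts]
    best_ssq, best_clusters = float("inf"), None
    # Each border is the index of the first point of a cluster
    for borders in combinations(range(n), k):
        ends = borders[1:] + (borders[0] + n,)  # last arc wraps around
        clusters = [unrolled[s:e] for s, e in zip(borders, ends)]
        total = sum(arc_ssq(c) for c in clusters)
        if total < best_ssq:
            best_ssq, best_clusters = total, clusters
    return best_ssq, best_clusters


# Toy positions on a circle of circumference 100 (not real CpG data)
ssq, clusters = brute_force_circular_clusters([1, 2, 3, 50, 51, 98, 99], 100, 2)
print(clusters)  # [[50, 51], [98, 99, 101, 102, 103]]
```

In the toy data, the points near 98 to 99 and 1 to 3 end up in one cluster because they are neighbors on the circle, even though they sit at opposite ends of the linear coordinate range. The search is deterministic, so it needs no random seed, unlike a \(K\)-means based heuristic whose result depends on random initialization.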

This example, drawn from a practical application, thus illustrates the advantage of optimal circular clustering over the heuristic algorithm.

References

Debnath, Tathagata, and Mingzhou Song. 2021. “Fast Optimal Circular Clustering and Applications on Round Genomes.” IEEE/ACM Transactions on Computational Biology and Bioinformatics. https://doi.org/10.1109/TCBB.2021.3077573.
