Code
The programs developed during the project are hosted in this project. Each program file contains instruction for users and/or developers. Use cases of individual programs are introduced in Protocols.
No installation is required. Each script or notebook should work out-of-the-box, with or without a few common dependencies. To simplify usage, we recommend using conda to create a virtual Python environment to host dependencies:
conda create -n wol -c conda-forge python=3 scikit-bio scikit-learn seaborn biom-format
source activate wol
-
scripts: Python scripts.
-
notebooks: Jupyter Notebooks.
-
utils: Python modules used by scripts and notebooks.
-
prototypeSelection: Algorithms for “prototype selection”, a statistical process that is central to the genome sampling process of this project.
Some utilities developed during this work have been merged into the latest releases of scikit-bio, an integrated Python package for bioinformatics.