Cetacean Genomes Data and Analysis Resources
July 18, 2022
Raw data, project information, genome assemblies, and some genome analysis resources.
Genomic data, assemblies, and some analysis resources are freely available but can be difficult to find. Below are links to the primary data and metadata from the three main organizations generating reference-quality cetacean genomes, the Vertebrate Genomes Project (VGP), Darwin Tree of Life (DToL), and DNAzoo, along with the associated projects from these organizations in public genome databases (NCBI, Ensembl, and the UCSC Genome Browser).
The VGP has also developed a cloud-based genome assembly pipeline built on Galaxy, with tutorials on how to assemble reference genomes from the types of data used by VGP and DToL.
Genome Centers Generating Cetacean Genome Assemblies (access to raw data):
- VGP GenomeArk: https://genomeark.s3.amazonaws.com/index.html?prefix=species
- Darwin Tree of Life data portal: https://portal.darwintreeoflife.org/data
- Darwin Tree of Life Genome Notes: https://wellcomeopenresearch.org/gateways/treeoflife/darwintreeoflife
- DNAzoo (search by taxonomic group ‘cetacea’): https://www.dnazoo.org/assemblies
- DNAzoo NCBI bioproject (SRA archives): https://www.ncbi.nlm.nih.gov/bioproject?LinkName=biosample_bioproject&from_uid=16895766
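The GenomeArk link above is a browsable index over an Amazon S3 bucket, so the same species listing can in principle be fetched programmatically. The following is a minimal sketch, not an official VGP tool. It assumes the bucket permits anonymous listing through the standard S3 ListObjectsV2 REST call (which the public index page suggests but this page does not confirm), and the function names and sample XML are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# Bucket host taken from the GenomeArk link above.
BUCKET_URL = "https://genomeark.s3.amazonaws.com/"
# XML namespace used by S3 list responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def build_list_url(prefix="species/", delimiter="/"):
    """Build an S3 ListObjectsV2 URL that lists one 'folder' level."""
    query = urlencode({"list-type": "2", "prefix": prefix, "delimiter": delimiter})
    return f"{BUCKET_URL}?{query}"


def parse_common_prefixes(xml_text):
    """Return the sub-prefixes ('folders') named in a ListObjectsV2 response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(f"{S3_NS}CommonPrefixes/{S3_NS}Prefix")]


def list_species(prefix="species/"):
    """Fetch one page of species prefixes (network call; assumes anonymous access)."""
    with urlopen(build_list_url(prefix)) as resp:
        return parse_common_prefixes(resp.read())


# Made-up response fragment for offline illustration; not real GenomeArk output.
SAMPLE = (
    '<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    "<Prefix>species/</Prefix>"
    "<CommonPrefixes><Prefix>species/Example_one/</Prefix></CommonPrefixes>"
    "<CommonPrefixes><Prefix>species/Example_two/</Prefix></CommonPrefixes>"
    "</ListBucketResult>"
)

print(build_list_url())
print(parse_common_prefixes(SAMPLE))
```

Calling `list_species()` would return at most one page of results. S3 caps each response at 1000 entries, and this sketch does not follow continuation tokens, so a complete listing may need further requests. Filtering the returned names for cetacean species is left to the reader, since this page does not describe how GenomeArk names its species folders.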
Genome Databases and Multi-genome Projects:
- VGP NCBI Project: https://www.ncbi.nlm.nih.gov/bioproject/489243
- VGP Ensembl: http://projects.ensembl.org/vgp/
- VGP UCSC Genome Browser: https://hgdownload.soe.ucsc.edu/hubs/VGP/
Genome Assembly and Analysis Web-based Resources (free):
- VGP assembly pipeline (Galaxy cloud): https://assembly.usegalaxy.eu
- Genome completeness (BUSCO) web interface: https://gvolante.riken.jp/index.html