About Ensembl Rapid Release

Genome assembly is cheaper, more accurate and more automated than it has ever been. This comes from a combination of more cost-efficient chemistries, new sequencing technologies and better algorithms. With the huge increase in sequencing planned by large biodiversity initiatives such as Darwin Tree of Life, the Vertebrate Genomes Project and the Earth BioGenome Project, in addition to the work of smaller groups and communities, high-quality genome assemblies are being produced at an ever-increasing rate.

To provide functional annotation on these genomes as quickly as possible, Ensembl Rapid Release runs on a two-week data release cycle. This differs from the release cycle on www.ensembl.org, where releases happen quarterly. Instead of a potential 3-6 month wait between completion of the gene set and its public release, the window is now 2-4 weeks (depending on when gene set completion occurs relative to the next Rapid Release cycle).

The key difference between a traditional Ensembl release and Ensembl Rapid Release lies in data integration and overall functionality. A traditional Ensembl release combines the gene set with comparative data (gene trees, gene names and genome alignments) and, where available, variation and regulation data. Ensembl Rapid Release currently focuses only on the gene set. Similarly, in terms of functionality, features such as programmatic access and data archiving are not present. These are important limitations to consider when using Ensembl Rapid Release.

Over the coming months we will be adding some of this missing functionality to the site. In particular, key features such as basic homology data, gene names and programmatic access are targets for early inclusion. The goal is to keep the site lightweight, allowing for fast data release while gradually increasing the feature set. Below is a list of the features currently available and those in development.

Features currently provided

  • Gene annotation
  • Protein feature annotation
  • BLAST functionality
  • File dumps including transcript and protein sequences and the softmasked genome sequence
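
As an illustration of working with the softmasked genome dump listed above, the following is a minimal sketch in Python. It assumes the dump is a standard FASTA file in which softmasked (repeat) regions are written in lowercase, which is the usual softmasking convention. The function names read_fasta and softmasked_fraction are hypothetical helpers written for this example and are not part of any Ensembl tool.

```python
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (sequence_id, sequence) pairs from an open FASTA file handle.

    Hypothetical helper for illustration; the id is the first
    whitespace-delimited token of the header line.
    """
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def softmasked_fraction(sequence: str) -> float:
    """Return the fraction of bases written in lowercase (softmasked)."""
    if not sequence:
        return 0.0
    masked = sum(1 for base in sequence if base.islower())
    return masked / len(sequence)
```

A typical use would be to open a downloaded (and decompressed) genome file, iterate over read_fasta, and report softmasked_fraction for each sequence to get a rough sense of repeat content. The file name and location depend on the species and are not assumed here.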

Features in development

  • Homology data
  • Gene symbol assignment
  • Supplementary gene tracks (e.g. RefSeq annotation where available)
  • REST functionality