Seqcol: Sequence Collections

Unique identifiers and lookup service for sequence collections.

Learn more

What is Seqcol?

Seqcol, or Sequence Collections, is a GA4GH-sponsored community effort to standardize unique identifiers for collections of sequences. Seqcol identifiers can be used to identify genomes, transcriptomes, or proteomes -- anything that can be represented as a collection of sequences. The seqcol protocol provides:

  1. implementations of an algorithm for computing sequence identifiers;
  2. a lookup service to retrieve sequences given a seqcol identifier;
  3. a programmatic approach to assessing compatibility among sequence collections.
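To make the first item concrete, here is a minimal sketch of how a content-derived identifier for a collection might be computed. It uses the GA4GH-style `sha512t24u` digest (SHA-512, truncated to 24 bytes, base64url-encoded). The layered scheme in `collection_digest`, the attribute names `names`, `lengths`, and `sequences`, and the use of `json.dumps` with sorted keys as a stand-in for a canonical JSON serializer are all assumptions made for illustration; consult the specification for the normative algorithm.

```python
import base64
import hashlib
import json


def sha512t24u(data: bytes) -> str:
    """SHA-512 digest, truncated to 24 bytes, then base64url-encoded."""
    truncated = hashlib.sha512(data).digest()[:24]
    return base64.urlsafe_b64encode(truncated).decode("ascii")


def canonical_json(value) -> bytes:
    # Assumption: sorted keys and compact separators stand in for a real
    # canonical JSON serializer; adequate for ASCII strings and integers.
    return json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")


def collection_digest(collection: dict) -> str:
    """Digest each attribute array, then digest the object of those digests."""
    attribute_digests = {
        name: sha512t24u(canonical_json(values))
        for name, values in collection.items()
    }
    return sha512t24u(canonical_json(attribute_digests))


# Hypothetical two-sequence collection; the sequence entries are placeholders,
# not real sequence digests.
example = {
    "names": ["chr1", "chr2"],
    "lengths": [8, 4],
    "sequences": ["placeholder-digest-1", "placeholder-digest-2"],
}
print(collection_digest(example))
```

Because the identifier is derived only from the content, two parties who hold the same collection compute the same string independently, and any change to a name, a length, or the order of the sequences yields a different identifier.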

Read the complete specification



Data analysts

Uniquely identify the sequences you use with persistent identifiers.

Software developers

Use Seqcol identifiers to embed persistent information in your tools about what genome was used in an analysis.

Workflow systems

Use our APIs to retrieve metadata for sequences you use.
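As a rough sketch of what a metadata lookup could look like from a workflow, the snippet below only builds the request URL and makes no network call. The base URL `https://seqcol.example.org`, the `/collection/{digest}` path, and the `level` query parameter are assumptions for illustration; check the specification and your server's documentation for the actual endpoints.

```python
from urllib.parse import quote, urlencode

# Hypothetical server address; replace with a real seqcol service.
BASE_URL = "https://seqcol.example.org"


def collection_url(digest: str, level: int = 2) -> str:
    """Build a lookup URL for a collection identifier (assumed endpoint layout)."""
    path = "/collection/" + quote(digest, safe="")
    return f"{BASE_URL}{path}?{urlencode({'level': level})}"


# Placeholder identifier, not a real seqcol digest.
print(collection_url("EXAMPLE_DIGEST"))
```

The resulting URL can be passed to any HTTP client to fetch the collection's metadata as JSON.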