ExoLocator is a collection of protein-coding exons from vertebrate genomes. It contains exons annotated by the Ensembl pipeline and exons found by our own extended search. We look for ostensibly missing exons by similarity to known exon sequences from taxonomically close species.
In our search we rely on USEARCH, an established CPU-based tool, as well as on SW# (pronounced "SW sharp" and available on GitHub), a program that implements a hardware-accelerated Smith-Waterman algorithm. SW# enables us to run a dynamic programming search on genomic sequences of lengths that are typically tractable only by heuristic methods.
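For readers unfamiliar with the Smith-Waterman algorithm mentioned above, the following is a minimal CPU-only sketch of its scoring recurrence. It is not the SW# implementation: the function name, the linear gap penalty and the scoring values are assumptions chosen for illustration only, and a real search would use substitution matrices and affine gaps.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Return (best local alignment score, 1-based end position in target).

    Illustrative only: linear gap penalty and a simple match/mismatch
    score, both hypothetical parameter choices.
    """
    prev = [0] * (len(target) + 1)  # previous row of the DP matrix
    best, best_end = 0, 0
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            same = query[i - 1] == target[j - 1]
            diag = prev[j - 1] + (match if same else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            if curr[j] > best:
                best, best_end = curr[j], j
        prev = curr
    return best, best_end


score, end = smith_waterman_score("ACGT", "TTACGTTT")
```

The sketch keeps only two rows of the matrix in memory, which is enough to obtain the score and end position. Even so, the running time grows with the product of the two sequence lengths, which is why searches at genomic scale usually fall back on heuristics unless the computation is accelerated.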
The collected sets of exons can be used, within the ExoLocator server, to estimate the conservation of residues within orthologous groups, as well as their specialization across paralogous families of proteins.
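As a rough illustration of what estimating residue conservation can mean, the sketch below scores each column of a protein alignment by its Shannon entropy. This is one common textbook measure and is not necessarily the scoring method used by the ExoLocator server; the function name, the gap character "-" and the 20-letter normalization are assumptions made for the example.

```python
import math
from collections import Counter


def column_conservation(alignment):
    """Score each alignment column from 0 (variable) to 1 (fully conserved).

    `alignment` is a list of equal-length aligned sequences; "-" marks a gap.
    Hypothetical helper for illustration only.
    """
    max_entropy = math.log2(20)  # assumes a 20-letter amino acid alphabet
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)  # all-gap column carries no information
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


scores = column_conservation(["MKV", "MRV", "MKL"])
```

In this toy alignment the first column is identical in all three sequences and scores 1, while the other two columns each contain a substitution and score lower.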
Citing ExoLocator. If you find ExoLocator useful in your work, please cite its companion publication: Khoo, Aik Aun, et al. "ExoLocator—an online view into genetic makeup of vertebrate proteins." Nucl. Acids Res. (1 January 2014) 42 (D1): D879-D881. doi: 10.1093/nar/gkt1164
Last modified Feb 2014: update to Ensembl v 74, including three new species (sheep, spotted gar and blind cavefish).
A note about sequence visualization in ExoLocator: ExoLocator uses JalView, a Java-based browser plug-in for sequence alignment visualization. The applet will not run without a recent version of Java, irrespective of the browser.
A note about alternative splicing information in ExoLocator: ExoLocator currently makes no attempt to predict alternative splicing events. Instead, where such information is available (for the human and mouse genomes), it is adopted from the CCDS project.