What is SynPlot?

SynPlot is an application, written in Perl, for viewing global alignments of genomic DNA sequences. It takes as input aligned sequences in FASTA format, in which the gaps introduced by the alignment are represented by "-" characters. All sequences in the input must therefore be the same length.
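The input requirement can be illustrated with a minimal sketch in Python (not SynPlot's own Perl code). The function names read_aligned_fasta and check_alignment are hypothetical, and taking the first word of each header line as the sequence name is an assumption.

```python
from io import StringIO


def read_aligned_fasta(handle):
    """Parse FASTA text into {name: sequence}, joining wrapped lines."""
    sequences = {}
    name = None
    for raw in handle:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumption: the first word of the header is the sequence name.
            words = line[1:].split()
            name = words[0] if words else ""
            sequences[name] = []
        elif name is not None:
            sequences[name].append(line)
    return {n: "".join(parts) for n, parts in sequences.items()}


def check_alignment(sequences):
    """Return the alignment length, or raise if the lengths differ."""
    lengths = {name: len(seq) for name, seq in sequences.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"aligned sequences differ in length: {lengths}")
    return next(iter(lengths.values()), 0)


# Three gapped sequences of equal length, as an aligner would emit them.
example = StringIO(
    ">seqA\nact-acta-tatc\n>seqB\nacctagt---anc\n>seqC\nacc-agt---ann\n"
)
alignment = read_aligned_fasta(example)
alignment_length = check_alignment(alignment)
```

A file whose sequences differ in length is rejected with a ValueError, which mirrors the same-length requirement stated above.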

The alignment is used to calculate the percentage identity within a sliding window that moves along the alignment; the window width can be specified by the user. This information is used to draw a representation of the alignment in PostScript format (also available as a PDF). The sequences are rendered as lines interrupted by spaces corresponding to the gaps introduced by the alignment, with a plot of the percentage identity underneath.

Scores are calculated as follows:

             seqA  act-acta-tatc
             seqB  acctagt---anc
             seqC  acc-agt---ann

         identity  3310313000301

This is an example window 13 bp long from a global alignment of three sequences. For each pair of residues in a column, 1 is added to the column score if the two residues are identical and neither is a "-" or an "n". Scoring every pair gives more emphasis to highly conserved regions, and the maximum attainable column score grows faster than the number of sequences: for n sequences it is n(n-1)/2, so for 4 sequences the maximum score is 6 and for 10 sequences it is 45.

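The score for the window is the sum of the column scores divided by the maximum possible score (window length x maximum column score), which for this example is:

         18 / (13 x 3) = 0.462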
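The scoring scheme can be reproduced with a short Python sketch (again, not the SynPlot Perl source). The function names, the step parameter and the case-insensitive comparison are assumptions made for illustration; how SynPlot positions or steps its windows is not stated here.

```python
from itertools import combinations

IGNORED = {"-", "n"}  # gaps and unknown bases never count as matches


def column_scores(seqs):
    """Score each column: +1 for every identical pair of real residues."""
    scores = []
    for column in zip(*seqs):
        score = 0
        for a, b in combinations(column, 2):
            # Assumption: comparison is case-insensitive.
            a, b = a.lower(), b.lower()
            if a == b and a not in IGNORED:
                score += 1
        scores.append(score)
    return scores


def max_column_score(n_seqs):
    """Number of distinct sequence pairs, n(n-1)/2."""
    return n_seqs * (n_seqs - 1) // 2


def window_identity(scores, n_seqs):
    """Sum of column scores divided by the maximum possible score."""
    if n_seqs < 2 or not scores:
        raise ValueError("need at least two sequences and one column")
    return sum(scores) / (len(scores) * max_column_score(n_seqs))


def sliding_identity(seqs, width, step=1):
    """Return (start_column, identity) for each full window."""
    scores = column_scores(seqs)
    return [
        (start, window_identity(scores[start:start + width], len(seqs)))
        for start in range(0, len(scores) - width + 1, step)
    ]


seqs = ["act-acta-tatc", "acctagt---anc", "acc-agt---ann"]
scores = column_scores(seqs)
print("".join(map(str, scores)))                 # 3310313000301
print(round(window_identity(scores, len(seqs)), 3))  # 0.462
```

Calling sliding_identity(seqs, 5) would score every 5-column window in turn, which is the kind of series the identity plot is drawn from.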

Features can also be drawn on the sequence lines. This uses a GFF format file representing the annotated genomic sequence, and a configuration file which specifies the color, height and order in which the rectangles representing the features are drawn.
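As an illustration of the annotation input, the following Python sketch reads the standard tab-separated GFF columns (seqname, source, feature, start, end, score, strand, frame, attributes) into simple records. The Feature type, the parse_gff function and the sample lines are hypothetical, and the layout of SynPlot's configuration file is not shown because it is not described here.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int  # 1-based, inclusive (GFF convention)
    end: int    # 1-based, inclusive
    strand: str


def parse_gff(lines):
    """Collect features from GFF lines, skipping comments and blanks."""
    features = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 8:
            raise ValueError(f"expected at least 8 GFF columns: {line!r}")
        features.append(
            Feature(
                seqname=fields[0],
                source=fields[1],
                feature=fields[2],
                start=int(fields[3]),
                end=int(fields[4]),
                strand=fields[6],
            )
        )
    return features


# Hypothetical annotation lines for one of the aligned sequences.
sample = [
    "##gff-version 2\n",
    "seqA\tannotation\texon\t120\t340\t.\t+\t.\tgene1\n",
    "seqA\tannotation\trepeat\t400\t455\t.\t-\t.\trep1\n",
]
features = parse_gff(sample)
```

Each record carries the feature type, which is what a configuration file would key on when assigning a color, height and drawing order.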

The core SynPlot script was designed by Dr. James Gilbert, a former member of this laboratory who is now located at the Sanger Institute. SynPlot is also available as a command-line-driven script.

The PostScript figure is converted to PDF format using ps2pdf, the UNIX Ghostscript utility.
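For readers running the conversion themselves, the step can be sketched in Python as a call to ps2pdf with an input and an output path. The helper names and the file name plot.ps are hypothetical, and the exact options SynPlot passes to ps2pdf are not known from this description.

```python
import shutil
import subprocess
from pathlib import Path


def ps2pdf_command(ps_path, pdf_path=None):
    """Build the ps2pdf argument list; default output swaps .ps for .pdf."""
    ps = Path(ps_path)
    pdf = Path(pdf_path) if pdf_path else ps.with_suffix(".pdf")
    return ["ps2pdf", str(ps), str(pdf)]


def convert_to_pdf(ps_path, pdf_path=None):
    """Run ps2pdf and return the path of the PDF it was asked to write."""
    if shutil.which("ps2pdf") is None:
        raise RuntimeError("ps2pdf (Ghostscript) was not found on PATH")
    command = ps2pdf_command(ps_path, pdf_path)
    # check=True raises CalledProcessError if ps2pdf exits with an error.
    subprocess.run(command, check=True)
    return Path(command[2])


command = ps2pdf_command("plot.ps")  # hypothetical file name
```

Building the command separately from running it makes it easy to inspect what would be executed on a machine where Ghostscript is not installed.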

