First screen: Masking your sequences?
TFBSsearch searches for transcription factor binding sites in areas of complete identity between two or more aligned sequences, or in single sequences. Highly conserved features, such as coding exons or repeats, should be masked to reduce the number of false-positive hits. A GFF file containing the features to be masked should be available for each sequence.
If you wish exons to be masked, select the 'Mask features' box and select the number of sequences present in your multiple sequence alignment.
Second screen: Set options for TFBSsearch.
Search using IUPAC strings.
NOTE there should be no spaces in the list. Only the following IUPAC characters are allowed:
A  Adenine
C  Cytosine
G  Guanine
T  Thymine
M  AMino (A or C)
R  PuRine (A or G)
W  Weak (A or T)
S  Strong (C or G)
Y  PYrimidine (C or T)
K  Keto (G or T)
V  Not T (A or C or G)
H  Not G (A or C or T)
D  Not C (A or G or T)
B  Not A (C or G or T)
N  ANy (A or C or G or T)
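As a rough illustration of how an IUPAC string relates to the sequences it matches, the codes above can be expanded into a regular expression. This is a minimal Python sketch, not TFBSsearch's own implementation; the helper names `iupac_to_regex` and `find_sites` are hypothetical, and only the forward strand is scanned.

```python
import re

# IUPAC nucleotide codes mapped to the bases they stand for (taken from the list above).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

def iupac_to_regex(pattern):
    """Translate an IUPAC string into a regular expression (hypothetical helper)."""
    parts = []
    for code in pattern.upper():
        if code not in IUPAC:
            raise ValueError(f"not an allowed IUPAC character: {code!r}")
        bases = IUPAC[code]
        # Single bases stay literal; ambiguity codes become character classes.
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)

def find_sites(pattern, sequence):
    """Return 1-based start positions of all matches, including overlapping ones."""
    # The zero-width lookahead lets overlapping matches be reported.
    regex = re.compile(f"(?={iupac_to_regex(pattern)})")
    return [m.start() + 1 for m in regex.finditer(sequence.upper())]

print(iupac_to_regex("NGGAW"))
print(find_sites("NGGAW", "TTAGGATCCAGGAAG"))
```

Here `NGGAW` expands to `[ACGT]GGA[AT]`, so both AGGAT and AGGAA count as hits.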
Alternatively, searches may be defined by a file containing a
carriage-return-separated list of IUPAC strings. This option is useful
if you periodically search for the same patterns.
NOTE the last IUPAC string should be on the last line!
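Before uploading such a list file, it can help to check it locally. The following Python sketch reads one IUPAC string per line and rejects spaces, disallowed characters and empty lines. The function name `parse_pattern_list` is hypothetical, and treating an empty line as an error is an assumption based on the notes above, not documented TFBSsearch behaviour.

```python
ALLOWED = set("ACGTMRWSYKVHDBN")

def parse_pattern_list(text):
    """Return the IUPAC strings in `text`, one per line (hypothetical helper)."""
    patterns = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if line == "":
            # An empty line usually means a stray line break in the file.
            raise ValueError(f"line {line_no}: empty line")
        bad = set(line.upper()) - ALLOWED
        if bad:
            # Spaces and any non-IUPAC characters end up here.
            raise ValueError(f"line {line_no}: disallowed characters {sorted(bad)}")
        patterns.append(line.upper())
    return patterns

print(parse_pattern_list("NGGAW\nCANNTG"))
```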
Select NWM accession number(s).
TRANSFAC accession numbers may also be entered as a carriage-return-separated list file.
A list of TRANSFAC accession numbers and their names can be viewed via this [LINK].
Select format of output.
Unaligned numbering is used by default. NOTE that SynPlot converts the unaligned numbering from a GFF file to plot features; unaligned numbering should therefore be selected if the GFF files are going to be used with SynPlot.
The 'aligned' output reports the feature in the global alignment position, i.e. gaps '-' are respected.
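The difference between the two numbering schemes can be sketched in Python as a mapping between alignment columns (gaps counted) and positions in the ungapped sequence (gaps skipped). The function names are hypothetical and positions are assumed to be 1-based, as is usual in GFF; this is an illustration, not the tool's own code.

```python
def aligned_to_unaligned(aligned_seq, column):
    """Map a 1-based alignment column to the 1-based position in the
    ungapped sequence. Returns None if the column holds a gap."""
    if aligned_seq[column - 1] == "-":
        return None
    # Subtract the gaps that occur up to and including this column.
    return column - aligned_seq[:column].count("-")

def unaligned_to_aligned(aligned_seq, position):
    """Map a 1-based ungapped position back to its 1-based alignment column."""
    seen = 0
    for column, char in enumerate(aligned_seq, start=1):
        if char != "-":
            seen += 1
            if seen == position:
                return column
    raise ValueError("position lies beyond the end of the sequence")

row = "AC--GGAT"
print(aligned_to_unaligned(row, 5))   # the first G: column 5, base 3
print(unaligned_to_aligned(row, 3))
```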
Selecting a reference sequence.
Defaults to the first sequence in the multiple FASTA file.
Advanced output options:
Select a name for GFF 'feature' column.
Select to ensure the same motif is found in all sequences of the alignment.
For example:
IUPAC string = NGGAW
Alignment = AGGAT   Pattern found        AGGAA   Pattern NOT found!
            AGGAT                        AGGAT
Default is not set.
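One possible reading of this option, sketched in Python: without it, every aligned row only has to match the IUPAC string; with it, every row must also carry the identical motif. The function names are hypothetical, and the two 2-row alignments are an interpretation of the example above rather than confirmed tool behaviour.

```python
import re

NGGAW = "[ACGT]GGA[AT]"  # regular expression equivalent of the IUPAC string NGGAW

def pattern_in_all(regex, rows):
    """True if every aligned row matches the pattern."""
    return all(re.fullmatch(regex, row) is not None for row in rows)

def same_motif_in_all(regex, rows):
    """True if every row matches AND all rows are the identical string."""
    return pattern_in_all(regex, rows) and len(set(rows)) == 1

identical = ["AGGAT", "AGGAT"]
differing = ["AGGAA", "AGGAT"]

print(same_motif_in_all(NGGAW, identical))
print(pattern_in_all(NGGAW, differing), same_motif_in_all(NGGAW, differing))
```

Under this reading, AGGAA and AGGAT both satisfy NGGAW, but the pair is rejected once the option is set because the two rows differ at the last base.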
Select conserved range (deviation from exact alignment).
Unless you are searching for a long and not very degenerate motif (i.e. one that will not occur often by chance), it does not make much sense to set the range x to more than a few bases (or even to use this option at all). However, if set to 1 or 2, it will allow small misalignments to be ignored.
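The effect of the range can be sketched as follows, assuming that hits are compared by their start column in the alignment and that a hit counts as conserved when every other sequence has a hit within x columns of it. Both the function name `conserved_hits` and this exact definition are assumptions for illustration.

```python
def conserved_hits(ref_hits, other_hits, max_shift=0):
    """Keep reference hits (aligned start columns) that have a partner hit
    within `max_shift` columns in every other sequence."""
    return [
        ref for ref in ref_hits
        if all(any(abs(ref - other) <= max_shift for other in hits)
               for hits in other_hits)
    ]

ref = [10, 40]                   # hits in the reference sequence
others = [[11, 42], [9, 43]]     # hits in two other sequences

for x in (0, 1, 3):
    print(x, conserved_hits(ref, others, max_shift=x))
```

With x = 0 nothing is reported, x = 1 recovers the hit at column 10, and x = 3 also accepts the hit at column 40. The wider the range, the more likely short degenerate motifs are to pair up by chance.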
Select sequences to exclude from input file, OR leave blank to select all.
For example, if your input file contains
> human .....
> mouse .....
> dog .....
> rat .....
and you use the option with 'mouse,dog', then TFBSsearch will only look for motifs that are conserved between human and rat. Note, however, that the gaps generated by the original 4-way alignment will be preserved, and that this will likely give you different output from a TFBSsearch search of a straight 2-way human-rat alignment.
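The following Python sketch shows why the result can differ from a true 2-way alignment: excluding sequences only drops rows, so gap columns introduced by the excluded sequences remain in the rows that are kept. The helper names and the tiny alignment are made up for illustration.

```python
def parse_aligned_fasta(text):
    """Read an aligned multi-FASTA string into {name: gapped sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            name = header.split()[0] if header else ""
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records

def exclude_sequences(records, exclude):
    """Drop the comma-separated names in `exclude`; gaps are left untouched."""
    skip = {n.strip() for n in exclude.split(",") if n.strip()}
    return {name: seq for name, seq in records.items() if name not in skip}

alignment = """
> human
AC-GT
> mouse
ACAGT
> dog
ACAGT
> rat
AC-GT
"""

kept = exclude_sequences(parse_aligned_fasta(alignment), "mouse,dog")
print(kept)
```

Human and rat still carry the gap at column 3 that only exists because of mouse and dog; a fresh human-rat alignment would not contain it.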
Search using an IUPAC pattern.
This will search for an ETS or GATA site, then 8-12 bases, followed by a second ETS or GATA site, then either 8-12 or 18-22 bases, then an EBOX site.
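The structure of such a compound search can be sketched with an ordinary regular expression in Python. This is not TFBSsearch's pattern syntax. The consensus strings used here for ETS (GGAW), GATA (GATA) and EBOX (CANNTG) are common textbook forms chosen as assumptions, and only the forward strand is scanned.

```python
import re

# Assumed consensus motifs, written as regular expressions.
ETS = "GGA[AT]"          # GGAW
GATA = "GATA"
EBOX = "CA[ACGT]{2}TG"   # CANNTG

SITE = f"(?:{ETS}|{GATA})"   # an ETS or a GATA site

# site, 8-12 bases, site, then 8-12 or 18-22 bases, then an EBOX
COMPOUND = re.compile(
    f"{SITE}[ACGT]{{8,12}}{SITE}(?:[ACGT]{{8,12}}|[ACGT]{{18,22}}){EBOX}"
)

hit = "GGAA" + "C" * 10 + "GATA" + "C" * 20 + "CAGCTG"
miss = "GGAA" + "C" * 10 + "GATA" + "C" * 15 + "CAGCTG"  # second spacer of 15 bases

print(COMPOUND.search(hit) is not None)
print(COMPOUND.search(miss) is not None)
```

The second sequence fails because its spacer of 15 bases falls between the two permitted ranges.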
Search using an NWM pattern.
Select a threshold for an NWM search.