First screen: Selecting the number of TFBS to analyse.
Option 1: You will need to decide the number of DIFFERENT transcription factor
binding site (TFBS) patterns you want to be present in the clusters produced
by the program. For example:
ETS-GATA-EBOX equals 3
ETS-ETS-GATA equals 2
GATA-GATA-GATA equals 1
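The counting rule in these examples can be sketched in a few lines of Python. The function name count_distinct_tfbs and the hyphen-separated notation are assumptions made for illustration; they are not part of TFBScluster itself.

```python
def count_distinct_tfbs(cluster: str) -> int:
    """Count the DIFFERENT TFBS types in a hyphen-separated cluster description."""
    return len(set(cluster.split("-")))


for cluster in ("ETS-GATA-EBOX", "ETS-ETS-GATA", "GATA-GATA-GATA"):
    # Repeated names collapse to one entry, so only distinct types are counted.
    print(cluster, "equals", count_distinct_tfbs(cluster))
```

The value to choose on the first screen is the number of distinct names, not the total number of sites in the cluster.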
Option 2: We have incorporated libraries of TFBS positions from three different sources.
You can select which set of TFBSs to use in your analysis. But only one set per
Second screen: Set options for TFBScluster.
Select TFBS parameters.
In this section you need to:
- Select TFBS from list.
These must all be selected!
Only unique TFBSs should be selected. Unique TFBSs are, for example,
ETS, GATA and EBF. ETS and NNETSNN are the same TFBS. Choosing
the same TFBS merely creates duplicates that are removed in the
- Specify the degree of conservation beyond the 'core' motif.
If you want to start with a larger number of less specific TFBSs
in the analysis choose the 'core' TFBS pattern. For increased
specificity, choose those TFBSs with extended conservation.
'Non-exact' patterns allow degenerate IUPAC code positions, for
example W (A or T), to be different in the aligned sequences.
'Exact' patterns must be the same in all aligned sequences.
- Specify the minimum number of occurrences for this motif.
This allows you to specify the minimum number of your chosen TFBSs
in the final clusters.
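The difference between 'Exact' and 'Non-exact' patterns can be illustrated with a minimal Python sketch. It assumes that each aligned sequence contributes one site of the same length as the pattern; the helper names matches and is_conserved are hypothetical and do not describe how TFBScluster is implemented.

```python
# Standard IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches(pattern: str, site: str) -> bool:
    """True if every base of the site is allowed by the IUPAC code at that position."""
    return len(pattern) == len(site) and all(
        base in IUPAC[code] for code, base in zip(pattern.upper(), site.upper())
    )


def is_conserved(pattern: str, aligned_sites: list, exact: bool) -> bool:
    """Apply the 'Exact' or 'Non-exact' rule to one site per aligned sequence."""
    if not all(matches(pattern, site) for site in aligned_sites):
        return False
    if exact:
        # 'Exact': the site must be identical in all aligned sequences.
        return len({site.upper() for site in aligned_sites}) == 1
    # 'Non-exact': degenerate positions may differ between the sequences.
    return True


print(is_conserved("GGAW", ["GGAA", "GGAT"], exact=False))  # True
print(is_conserved("GGAW", ["GGAA", "GGAT"], exact=True))   # False
```

In this sketch, GGAA in one species aligned to GGAT in another counts as a conserved GGAW site only under the 'Non-exact' setting.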
Only consider TFBSs that are ALSO conserved in the following
TFBScluster, by default, will form clusters from all TFBSs conserved between the
mouse and human genomes. However, you can also choose to use only those
TFBSs that are also conserved between mouse and dog, or mouse and opossum.
Select whether to include or exclude clusters containing overlapping
For example, if you want to find clusters containing at least two "typeA" and
one "typeB" TFBS, choosing 'exclude' means TFBScluster will not report clusters
where one of the "typeA" sites is overlapped by a "typeB" site. We
have included this option so that the minimum number of TFBSs in a cluster
represents 'free' sites that may be bound by their corresponding
transcription factors at the same time.
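The effect of the 'exclude' setting can be sketched as follows. This is an illustrative Python sketch, not the TFBScluster implementation: sites are assumed to be (name, start, end) tuples with inclusive coordinates, only overlaps between sites of different types are checked, and the names overlaps and keep_cluster are hypothetical.

```python
from collections import Counter
from itertools import combinations


def overlaps(a, b) -> bool:
    """True if two (name, start, end) sites share at least one position."""
    return a[1] <= b[2] and b[1] <= a[2]


def keep_cluster(sites, minimums, exclude_overlapping: bool) -> bool:
    """Check the minimum site counts, then optionally reject overlapping sites."""
    counts = Counter(name for name, _, _ in sites)
    if any(counts[name] < needed for name, needed in minimums.items()):
        return False
    if exclude_overlapping:
        for a, b in combinations(sites, 2):
            # Assumption: only overlaps between different TFBS types matter.
            if a[0] != b[0] and overlaps(a, b):
                return False
    return True


minimums = {"typeA": 2, "typeB": 1}
cluster = [("typeA", 1, 6), ("typeA", 20, 25), ("typeB", 4, 9)]
print(keep_cluster(cluster, minimums, exclude_overlapping=True))   # False
print(keep_cluster(cluster, minimums, exclude_overlapping=False))  # True
```

With 'exclude', the example cluster is dropped because the first "typeA" site shares positions 4 to 6 with the "typeB" site, so the two factors could not bind at the same time.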
Specify a minimum cluster size.
This is the minimum range, in nucleotides, between the start of the most
5' TFBS and the end of the most 3' TFBS. When selecting this value
consider the loosest arrangement of your individual TFBSs, for example
the centre of each TFBS separated from the next by 1 helical turn
At present the maximum initial cluster size is 220bp. This is a
theoretical maximum for short range looping in DNA, representing the
distance between nucleosomal linkers (Ringrose and coworkers, 1999, EMBO 18: 6630-6641).
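As a rough guide to choosing this value, the arithmetic for the loosest arrangement can be sketched in Python. The figure of 10.5 bp per helical turn is a commonly quoted approximation for B-form DNA and is an assumption here, as are the function name loosest_span and the example site lengths; only the 220 bp limit comes from the text above.

```python
import math

HELICAL_TURN_BP = 10.5   # assumed approximate length of one helical turn
MAX_CLUSTER_BP = 220     # maximum initial cluster size stated above


def loosest_span(site_lengths, turns_between_centres: float = 1.0) -> int:
    """Span from the start of the first site to the end of the last site when
    neighbouring site centres are a fixed number of helical turns apart."""
    if not site_lengths:
        raise ValueError("at least one TFBS length is required")
    centre_to_centre = (len(site_lengths) - 1) * turns_between_centres * HELICAL_TURN_BP
    span = site_lengths[0] / 2 + centre_to_centre + site_lengths[-1] / 2
    return math.ceil(span)


size = loosest_span([6, 6, 6])  # three 6 bp sites, one turn apart
print(size, size <= MAX_CLUSTER_BP)  # 27 True
```

Three 6 bp sites with centres one turn apart span 3 + 21 + 3 = 27 bp, so a cluster size of at least 27 would be needed to capture that arrangement.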
Run the short or long analysis?
- Short analysis.
Works out all possible clusters of the specified size, containing
the minimum number of TFBSs. Overlapping clusters are merged and the
final list (in GFF format) is reported via a link sent by email.
- Long analysis.
The short analysis is completed, then genes are localised to each of
the clusters. A gene is localised to a cluster if the cluster is present
in one of its introns. Clusters that are within an exon or overlap
an exon are not reported. Otherwise the closest 5' and 3' genes to the
cluster (within 100kb) are reported. As this analysis can take a long
time, the links to the results are returned by email. A set of example
output files has been annotated to describe the contents. These examples
come from the human version of TFBScluster, but are for all intents and
purposes the same.
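Two steps of the analyses described above, merging overlapping clusters and finding the closest flanking genes within 100 kb, can be sketched in Python. This is an illustration under simplifying assumptions: all features lie on one chromosome, coordinates are inclusive, "5'" and "3'" are taken as lower and higher coordinates on the forward strand, and the intron and exon rules are left out. The function names are hypothetical.

```python
def merge_overlapping(clusters):
    """Merge overlapping (start, end) clusters into non-overlapping ranges."""
    merged = []
    for start, end in sorted(clusters):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend it if necessary.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def flanking_genes(cluster, genes, max_distance=100_000):
    """Return the names of the closest genes on either side of a cluster,
    or None where no gene lies within max_distance."""
    c_start, c_end = cluster
    lower = [(c_start - g_end, name) for name, g_start, g_end in genes
             if g_end < c_start and c_start - g_end <= max_distance]
    higher = [(g_start - c_end, name) for name, g_start, g_end in genes
              if g_start > c_end and g_start - c_end <= max_distance]
    return (min(lower)[1] if lower else None,
            min(higher)[1] if higher else None)


print(merge_overlapping([(100, 200), (150, 260), (400, 450)]))
# [(100, 260), (400, 450)]
genes = [("geneA", 100, 500), ("geneB", 1500, 2500), ("geneC", 200000, 210000)]
print(flanking_genes((1000, 1100), genes))
# ('geneA', 'geneB')
```

Sorting the clusters first means each one only has to be compared with the most recently merged range.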
OPTIONAL CLUSTER CONSTRAINTS.
A) Choose to only search for clusters on a single chromosome or 'all'
B) Choose to retain or reject clusters containing user specified
TFBS represented by IUPAC consensus sequences or patterns of IUPAC
IUPAC consensus sequence, for example, GGAW = GGA[A or T].
Patterns of consensus sequences, for example,
GGAW*1-10*GGAW*1-10*GATA = GGA[A/T], then 1 to 10 spaces (nucleotides/gaps),
then GGA[A/T], then 1 to 10 spaces, then GATA.
Further information can be found in the instruction page of
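To make the pattern syntax concrete, the following Python sketch translates a pattern such as GGAW*1-10*GGAW*1-10*GATA into a regular expression. It assumes that motifs and spacers alternate, separated by '*', and that a spacer position may be any nucleotide, N, or an alignment gap ('-'). The function name pattern_to_regex is hypothetical, and this is not how TFBScluster itself parses patterns.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pattern_to_regex(pattern: str):
    """Compile a motif*min-max*motif... pattern into a regular expression."""
    pieces = []
    for index, part in enumerate(pattern.split("*")):
        if index % 2 == 0:
            # Motif: expand each degenerate IUPAC code into a character class.
            pieces.append("".join(
                code if len(IUPAC[code]) == 1 else "[" + IUPAC[code] + "]"
                for code in part.upper()
            ))
        else:
            # Spacer: "1-10" means 1 to 10 positions (nucleotides or gaps).
            low, high = part.split("-")
            pieces.append("[ACGTN-]{%d,%d}" % (int(low), int(high)))
    return re.compile("".join(pieces))


print(pattern_to_regex("GGAW").pattern)  # GGA[AT]
regex = pattern_to_regex("GGAW*1-10*GGAW*1-10*GATA")
print(bool(regex.search("TTGGAACCGGATCCCGATATT")))  # True
```

The example sequence matches because GGAA, GGAT and GATA are separated by spacers of 2 and 3 nucleotides, both within the 1 to 10 range.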
Filter candidate genes to only show those expressed in a given tissue.
Select a tissue of interest and fold over median expression using the pop-up menus.
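A fold-over-median filter of this kind can be sketched in Python. The sketch assumes that the median is taken across all tissues measured for each gene and that expression values are held in a simple dictionary; the function name expressed_in_tissue, the gene names and the numbers are all made up for illustration and do not describe the expression data used by TFBScluster.

```python
from statistics import median


def expressed_in_tissue(expression, tissue: str, min_fold: float):
    """Keep genes whose expression in the tissue is at least min_fold times
    their median expression across all tissues."""
    kept = []
    for gene, by_tissue in expression.items():
        gene_median = median(by_tissue.values())
        if gene_median > 0 and by_tissue.get(tissue, 0) / gene_median >= min_fold:
            kept.append(gene)
    return kept


# Hypothetical expression values per gene and tissue.
expression = {
    "geneA": {"liver": 900, "brain": 100, "heart": 300},
    "geneB": {"liver": 100, "brain": 110, "heart": 90},
}
print(expressed_in_tissue(expression, "liver", 3))  # ['geneA']
```

Here geneA passes because its liver value is three times its median of 300, while geneB is expressed at about its median level in liver and is filtered out.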
Please choose how you would like to get your results (select 1 method).
We recommend you provide an email address as you will receive your results
automatically. Email addresses will be used for notifying users of completed
analyses and updates to the tool etc. However, if you require anonymity you
can opt to retrieve your own results. This method was only really intended for
the review procedure.