Supplementary Data.

Donaldson and Gottgens (2006).

Download CoMoDis Scripts

'CoMoDis scripts and files'


CoMoDis Examples

STUDY #1: GATA-1-mediated proliferation arrest during erythroid maturation. [PMID: 12832487]
Reference: Rylski M. et al. (2003) Mol. Cell. Biol. 23:5031-5042.
Gene list(s): GATA-1 targets [ENSMUSG IDs]
GATA-1 repressed [ENSMUSG IDs]
CoMoDis parameters: GATA n0 seed motifs within 20kb of the coregulated genes. 50bp extracted either side of each seed motif.
Results (Target): See detailed results below for 'target' lists.
Ebox motifs detected by DME, GAME, PhyloGibbs and PhyME.
V$SREBP1_02 motif detected by BioProspector, DME and PhyloGibbs.
Homeodomain motifs detected by YMF and PhyME.
SP1 motifs detected by DME, nMICA, PhyloGibbs and PhyME.
Results (Repressed): See detailed results below for 'repressed' lists.
Ets motifs detected by GAME, nMICA and PhyloCon.
REL family motifs detected by DME and PhyloGibbs.
MEF2 motifs detected by PhyloCon and PhyloGibbs.
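
The "50bp extracted either side of each seed motif" step in the CoMoDis parameters can be illustrated with a minimal sketch. This is not the CoMoDis implementation: the function name `extract_flanks` is hypothetical, the input is assumed to be a region already restricted to within 20kb of a gene, only the forward strand is scanned, and the n0/n2/n3 seed setting is not modelled.

```python
import re

def extract_flanks(sequence, seed="GATA", flank=50):
    """Return (seed_start, window) for every occurrence of `seed`.

    Each window holds the seed plus up to `flank` bases on either side;
    windows are clipped at the ends of the sequence.
    """
    windows = []
    # A lookahead is used so that overlapping seed occurrences are all reported.
    for match in re.finditer(f"(?={re.escape(seed)})", sequence):
        start = match.start()
        end = start + len(seed)
        left = max(0, start - flank)
        right = min(len(sequence), end + flank)
        windows.append((start, sequence[left:right]))
    return windows

# Illustrative input only: a made-up sequence with one GATA seed.
example = "A" * 60 + "GATA" + "C" * 60
for position, window in extract_flanks(example):
    print(position, len(window))
```

Windows produced this way would then be passed to the motif discovery tools; how CoMoDis handles overlapping windows is not stated here and is left out of the sketch.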


STUDY #2: Gene expression analysis of Gata3-/- mice by using cDNA microarray technology. [PMID: 15769480]
Reference: Airik R. et al. (2005) Life Sci. 76:2559-6258.
Gene list(s): Genes downregulated in E9.5 Gata3 mutant embryos [ENSMUSG IDs]
CoMoDis parameters: GATA n2 seed motifs within 20kb of the coregulated genes. 50bp extracted either side of each seed motif.
Results: Motifs identified by >1 discovery tool were V$NFAT_Q6, V$IRF2_01 and homeodomain family.
V$NFAT_Q6 motif detected by nMICA and PhyloCon.
V$IRF2_01 motif detected by nMICA, PhyloCon, PhyloGibbs and PhyME.
Homeodomain motifs detected by Weeder and YMF.


STUDY #3: In vivo filtering of in vitro expression data reveals MyoD targets. [PMID: 14744113]
Reference: Zhao P. et al. (2003) C. R. Biol. 326:1049-1065.
Gene list(s): Expression of in vitro upregulated MyoD targets in muscle regeneration.
Late - 3 day peak [ENSMUSG IDs]
Early - 12h-2d peak [ENSMUSG IDs]
CoMoDis parameters: EBOX n3 seed motifs within 20kb of the coregulated genes. 50bp extracted either side of each seed motif.
Results (Late): Motifs identified by >1 discovery tool were Hen-1 (EBOX) and V$ETS2_B.
V$MEF2_01 was found by YMF only.
Hen-1 motif detected by DME, nMICA, PhyloCon and PhyloGibbs.
V$ETS2_B motif detected by PhyloGibbs and PhyME.
Results (Early): The only motif identified by >1 discovery tool was Hen-1 (EBOX).
V$MEF2_01 was found by DME only.
Hen-1 motif detected by PhyloCon and PhyME.
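
The "identified by >1 discovery tool" filter used in the study summaries above can be sketched as a simple tally. The detections listed are the Study #3 'Late' results as reported on this page; the function name `motifs_supported_by_multiple_tools` is hypothetical, and the sketch assumes motif names have already been matched across tools, which is the harder part in practice.

```python
from collections import defaultdict

def motifs_supported_by_multiple_tools(detections, min_tools=2):
    """Group (motif, tool) pairs and keep motifs reported by at least `min_tools` tools."""
    support = defaultdict(set)
    for motif, tool in detections:
        support[motif].add(tool)
    return {motif: tools for motif, tools in support.items() if len(tools) >= min_tools}

# Study #3 (Late) detections, taken from the results listed above.
late_detections = [
    ("Hen-1", "DME"), ("Hen-1", "nMICA"), ("Hen-1", "PhyloCon"), ("Hen-1", "PhyloGibbs"),
    ("V$ETS2_B", "PhyloGibbs"), ("V$ETS2_B", "PhyME"),
    ("V$MEF2_01", "YMF"),
]

for motif, tools in motifs_supported_by_multiple_tools(late_detections).items():
    print(motif, sorted(tools))
```

V$MEF2_01 is dropped by the filter because only YMF reported it, matching the summary for the 'Late' list.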


CoMoDis Detailed Example (Rylski et al. 2003)

CoMoDis input files
'Target' group
'Repressed' group

CoMoDis summary output files
'Target' group
'Repressed' group

CoMoDis sequence and position output files
'Target' group
'Repressed' group

Motif discovery summary files
'Target' group
'Repressed' group

Motif discovery tool raw output files
Below are the output files generated by the tools used in the motif discovery stage of this study. There are two tables: the first for single sequence analyses and the second for orthologous sequence analyses. The HTML links allow the files to be viewed or saved to your computer. PLEASE NOTE: Unless otherwise stated, the tools were instructed to look for motifs on both strands of the input sequence.
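
As a reminder of what searching both strands means in practice, here is a minimal sketch: a motif on the reverse strand appears as its reverse complement when reading the forward strand. The helper names are hypothetical and the exact-match search is for illustration only; none of the tools listed below work by exact string matching.

```python
# Translation table for complementing bases; 'N' (masked) is left as 'N'.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_on_both_strands(sequence, motif):
    """Return sorted (position, strand) pairs for exact matches on either strand.

    Positions are 0-based offsets on the forward strand.
    """
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        pos = sequence.find(pattern)
        while pos != -1:
            hits.append((pos, strand))
            pos = sequence.find(pattern, pos + 1)
    return sorted(hits)

# Made-up sequence containing GATA on the forward strand and TATC (its reverse complement).
print(find_on_both_strands("AAGATAAATATCAA", "GATA"))
```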

TABLE 1. Output files from single sequence tools
'Target' motifs 'Repressed' motifs
BioProspector
NOTES:
The tool was run 3 times to ensure all possible motifs were found, as BioProspector uses a Gibbs sampling strategy. The background model was generated from the input sequences.
Motif length 6: Output1 Output2 Output3 Motif length 6: Output1 Output2 Output3
Motif length 8: Output1 Output2 Output3 Motif length 8: Output1 Output2 Output3
DME v1.44
NOTES:
A background model was not selected for this example.
Motif length 6: Output Motif length 6: Output
Motif length 8: Output Motif length 8: Output
GAME
NOTES:
Files contain 5 motifs of all lengths between 4 and 10; 6 was set as the expected length. GAME was given 10 independent runs to ensure the motif with the highest fitness score was found. Only the results with positive fitness values were tested.
Motif length 4-10: Output Motif length 4-10: Output
nMICA v0.7.2
NOTES:
The background model was constructed using 'makemosaicbg', where mosaicClasses=4 and mosaicOrder=2. 'motiffinder' was run with the following flags: numMotifs=10 ensembleSize=200; the targetLength was set to either 6 or 8. 'motifviewer' was used to export a STUBB formatted file containing matrices representing each motif.
Motif length 6: XMS_file STUBB_file Motif length 6: XMS_file STUBB_file
Motif length 8: XMS_file STUBB_file Motif length 8: XMS_file STUBB_file
Weeder
NOTES:
Weeder does not require an expected motif length. The setting was chosen to allow motifs to be present more than once on the same sequence. A 'normal' scan was performed.
Variable motif length: Output Variable motif length: Output
YMF v3 using FindExplanators v1.1.2
NOTES:
'Mus musculus' was chosen as the background model. A maximum of two degenerate positions was allowed when searching for motifs of length 6 and 8. The final motifs were taken from the output of 'FindExplanators', selecting 5 output motifs.
Motif length 6: Output Motif length 6: Output
Motif length 8: Output Motif length 8: Output
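
The "maximum of two degenerate positions" setting in the YMF notes above can be made concrete with a small check. This sketch assumes motifs are written with the standard IUPAC nucleotide codes; the alphabet YMF itself accepts may be narrower, and the function names are hypothetical.

```python
# IUPAC codes that stand for more than one base (assumption: full IUPAC set).
DEGENERATE_CODES = set("RYSWKMBDHVN")

def count_degenerate(motif):
    """Count positions in a motif that are not a plain A, C, G or T."""
    return sum(1 for base in motif.upper() if base in DEGENERATE_CODES)

def within_limit(motif, max_degenerate=2):
    """True if the motif has no more than `max_degenerate` degenerate positions."""
    return count_degenerate(motif) <= max_degenerate

# CANNTG is the generic E-box consensus, with two degenerate positions.
print(count_degenerate("CANNTG"), within_limit("CANNTG"))
```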


TABLE 2. Output files from orthologous sequence tools
'Target' motifs 'Repressed' motifs
PhyloCon v3b
NOTES:
We used our own publicly available web version of this tool, which uses a restricted set of parameters. The flag '-s2' was used to select for shorter, tighter alignments. '-u2' was selected to ignore 'N' characters in masked sequences. An expected motif length is not required. However, the tool was run twice: first to search for motifs on both strands of the input sequences separately, then again considering both strands as the same sequence.
Both strands of sequence separately: None Both strands of sequence separately: Output
Both strands of sequence together: Output Both strands of sequence together: Output
PhyloGibbs v1
NOTES:
We used our own publicly available web version of this tool, which uses a restricted set of parameters. A more comprehensive web version of this tool is available from the authors (see the link page of CoMoDis). The tool was run 3 times to ensure all possible motifs were found, as PhyloGibbs uses a Gibbs sampling strategy. The 'phylohistory' (-g) flag was used to provide a proximity value of 0.5 between human and mouse to a common ancestor. '-D 1' was used to retain the UCSC alignment.
Motif length 6: Output1 Output2 Output3 Motif length 6: Output1 Output2 Output3
Motif length 8: Output1 Output2 Output3 Motif length 8: Output1 Output2 Output3
PhyME v1.2.1
NOTES:
We used our own publicly available web version of this tool, which uses a restricted set of parameters. The phylogen flat file was as follows: '0 0.5'. The flag '-ot' (the threshold above which motifs are reported) was set to 0.3. 100 randomly selected seeds were used to find each motif ('-niter'). For each seed, 25 iterations of the EM algorithm were performed ('-nseediter').
Motif length 6: Output Motif length 6: Output
Motif length 8: Output Motif length 8: Output

Last modified: Wednesday 4 October 2006.