Genomic analysis of LexA binding reveals the permissive nature of the Escherichia coli genome and identifies unconventional target sites.;Wade JT, Reppas NB, Church GM, Struhl K;Genes & development 2005 Nov 1;19(21):2619-30 [16264194]
ChIP-chip peak sequences were passed to MEME. The resulting motif was used with ScanACE to search for binding sites.
ChIP assay conditions
E. coli strains MG1655 and MG1655 lexA1 were used for ChIP experiments. Cells were grown to mid-exponential phase (OD650 = 0.3-0.6) in LB.
ChIP notes
Cells were grown in appropriate media, and formaldehyde was added to a final concentration of 1%. After 20 min of incubation, glycine was added to a final concentration of 0.5 M, and cells were harvested by centrifugation and washed once with Tris-buffered saline (pH 7.5). Cells were resuspended in 500 μL of lysis buffer (10 mM Tris at pH 8.0, 20% sucrose, 50 mM NaCl, 10 mM EDTA, 4 mg/mL lysozyme) and incubated at 37°C for 30 min. Five-hundred microliters of immunoprecipitation (IP) buffer (50 mM HEPES-KOH at pH 7.5, 150 mM NaCl, 1 mM EDTA, 1% Triton X-100, 0.1% sodium deoxycholate, 0.1% SDS) and PMSF (final concentration 1 mM) were added to the cell extract, and DNA was sheared by sonication to an average size of ∼500 bp. Insoluble cellular material was removed by microcentrifugation for 10 min, and the supernatant was transferred to a fresh tube. Proteins were immunoprecipitated by diluting a fraction of the cross-linked cell extract with IP buffer to a final volume of 800 μL. This was then incubated with 20 μL of Protein A-Sepharose beads (Amersham-Pharmacia) and either no antibody, RNAP β subunit mouse monoclonal (NeoClone), LexA rabbit polyclonal antibody (Upstate), or Gal4 DNA-binding domain rabbit polyclonal antibody (Upstate) for 90 min at room temperature with gentle mixing. Samples were then washed twice with IP buffer, once with IP buffer + 500 mM NaCl, once with wash buffer (10 mM Tris-HCl at pH 8.0, 250 mM LiCl, 1 mM EDTA, 0.5% Nonidet-P40, 0.5% sodium deoxycholate), and once with TE (pH 7.5). Immunoprecipitated complexes were eluted by incubation of beads with elution buffer (50 mM Tris-HCl at pH 7.5, 10 mM EDTA, 1% SDS) at 65°C for 10 min. Immunoprecipitated samples and the corresponding “input” sample were decross-linked by incubation for 2 h at 42°C and for 6 h at 65°C in 0.5× elution buffer + 0.8 mg/mL Pronase. DNA was purified using a PCR purification kit (QIAGEN). All ChIPs were performed at least twice.
Regulated genes for each binding site are displayed below. Gene regulation diagrams show binding sites, positively regulated genes, negatively regulated genes, genes that are both positively and negatively regulated, and genes with an unspecified type of regulation.
For each individual site, the experimental techniques used to determine the site are also given.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex using a fixing agent such as formaldehyde; the cross-links can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA is sheared by sonication, and the chromatin (TF-DNA complexes) is pulled down with an antibody (i.e. immunoprecipitated). If an antibody against the TF is available, it is used; otherwise, the TF is tagged with an epitope recognized by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is amplified, labeled with a fluorophore and hybridized to a DNA microarray. The scanned array reveals the genomic regions bound by the TF. The resolution is about 500 bp, a consequence of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter regions of a set of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this; the standard approaches use expectation maximization (EM), Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler and CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic footprinting (the idea that TF-binding sites will be conserved in the promoter sequences of the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
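The overrepresentation idea can be illustrated with a minimal greedy sketch. It is a deliberately simplified stand-in, not the EM procedure of MEME nor a Gibbs sampler: it seeds on the k-mer shared by the most sequences, then takes the closest window in each sequence as that sequence's site. The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter

def most_shared_kmer(seqs, k):
    """Return the k-mer that occurs in the largest number of sequences."""
    support = Counter()
    for s in seqs:
        # Use a set so each k-mer counts once per sequence.
        support.update({s[i:i + k] for i in range(len(s) - k + 1)})
    # Break ties alphabetically so the result is deterministic.
    return min(support, key=lambda w: (-support[w], w))

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def best_site(seq, seed):
    """Return the window of seq with the fewest mismatches to the seed."""
    k = len(seed)
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return min(windows, key=lambda w: hamming(w, seed))

def count_matrix(sites):
    """Per-position base counts over the aligned sites."""
    return [{b: sum(site[i] == b for site in sites) for b in "ACGT"}
            for i in range(len(sites[0]))]

def discover(seqs, k):
    """Greedy one-pass motif discovery: seed, collect sites, count bases."""
    seed = most_shared_kmer(seqs, k)
    sites = [best_site(s, seed) for s in seqs]
    return seed, sites, count_matrix(sites)

# Toy input (hypothetical): three exact copies of a 6-mer, one variant.
toy_seqs = ["AACTGTATGC", "GGCTGTATTA", "TCTGTCTGAA", "CCGACTGTAT"]
seed, sites, matrix = discover(toy_seqs, 6)
print(seed, sites)
```

Real tools refine such an initial alignment iteratively and assess its statistical significance; this sketch stops after one pass and assumes exactly one site per sequence.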
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding sites. This is useful, for instance, when trying to identify TF-binding sites in ChIP-chip data. Searching for TF-binding sites can be done in numerous ways. The most basic method is consensus search, in which sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which make it possible to search for more loosely defined motifs [e.g. C(C/G)AT]. Common tools for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite; some word processors also support it. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
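The three search strategies described above can be sketched side by side. This is an illustrative sketch only, not the scoring scheme of any of the tools named: the function names, the pseudocount of 1, the uniform background of 0.25 and the toy sequence are all assumptions. The pattern C(C/G)AT from the text is written as `C[CG]AT` in Python regular-expression syntax.

```python
import math
import re

def consensus_hits(seq, consensus, max_mismatches):
    """Windows that differ from the consensus at <= max_mismatches bases."""
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        mismatches = sum(a != b for a, b in zip(window, consensus))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits

def regex_hits(seq, pattern):
    """Start position and text of every match, overlapping ones included."""
    # The lookahead lets the regex engine report overlapping matches.
    return [(m.start(), m.group(1))
            for m in re.finditer(f"(?=({pattern}))", seq)]

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Log-odds matrix from aligned sites (assumed uniform background)."""
    n = len(sites)
    pssm = []
    for i in range(len(sites[0])):
        column = {}
        for base in "ACGT":
            count = sum(site[i] == base for site in sites)
            freq = (count + pseudocount) / (n + 4 * pseudocount)
            column[base] = math.log2(freq / background)
        pssm.append(column)
    return pssm

def pssm_scan(seq, pssm):
    """Score every window of seq; seq must contain only A, C, G, T."""
    k = len(pssm)
    return [(i, sum(pssm[j][seq[i + j]] for j in range(k)))
            for i in range(len(seq) - k + 1)]

# Toy sequence and known sites (hypothetical values).
toy_seq = "TTCCATGCGATAA"
pssm = build_pssm(["CCAT", "CGAT", "CCAT"])
ranked = sorted(pssm_scan(toy_seq, pssm), key=lambda t: t[1], reverse=True)

print(consensus_hits(toy_seq, "CCAT", 1))
print(regex_hits(toy_seq, "C[CG]AT"))
print(ranked[:2])
```

The consensus search treats every mismatch equally, whereas the matrix penalizes a mismatch less when the observed base has already been seen at that position in known sites, which is why the matrix approach ranks candidate sites more finely.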
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter regions of a set of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approaches use expectation maximization (EM), Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler and CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic footprinting (the idea that TF-binding sites will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
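The overrepresentation idea can be made concrete with a toy greedy search in Python. This sketch is only loosely in the spirit of greedy tools such as CONSENSUS and does not reproduce the algorithm of any program named above. It assumes a fixed motif width, exactly one site per sequence and forward-strand sites only; the function names and the three toy sequences (each carrying a planted copy of TACTGTAT) are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """All overlapping windows of width k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def column_counts(sites):
    """Per-position base counts for a list of aligned, equal-length sites."""
    return [Counter(column) for column in zip(*sites)]


def match_score(site, counts):
    """Number of already-aligned bases that agree with `site`, summed over positions."""
    return sum(c[base] for base, c in zip(site, counts))


def alignment_score(sites):
    """Sum over positions of the count of the most common base."""
    return sum(max(c.values()) for c in column_counts(sites))


def greedy_motif(seqs, k):
    """Seed with each k-mer of the first sequence, then greedily add the
    best-matching k-mer from every other sequence; keep the best alignment."""
    best_sites, best_score = None, -1
    for seed in kmers(seqs[0], k):
        sites = [seed]
        for seq in seqs[1:]:
            counts = column_counts(sites)
            sites.append(max(kmers(seq, k), key=lambda s: match_score(s, counts)))
        score = alignment_score(sites)
        if score > best_score:
            best_sites, best_score = sites, score
    return best_sites, best_score


def consensus(sites):
    """Most common base at each position of the alignment."""
    return "".join(c.most_common(1)[0][0] for c in column_counts(sites))


# Hypothetical promoter fragments, each with one planted copy of TACTGTAT.
toy_seqs = [
    "GGCATACTGTATCCGA",
    "ATTCTACTGTATGAGC",
    "CGTTACTGTATTTGCA",
]

sites, score = greedy_motif(toy_seqs, k=8)
print(consensus(sites), score)
```

Real data rarely contain identical copies of a site, which is why production tools replace the simple agreement count used here with statistical scores and explore many more candidate alignments.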
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter region of a bunch of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approach uses expectation maximization (EM) and in particular Gibbs sampling or greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler or CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic foot-printing (the idea that TF-binding site will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
Once the binding motif for a TF is known, this motif (which essentially defines a pattern) can be used to scan sequences in order to search for putative TF-binding site. This is useful, for instance, when trying to identify TF-binding site in ChIP-chip data. Searching for TF-binding site can be done in numerous ways. The most basic method is consensus search, in sequences are scored according to how many mismatches they have with the consensus sequence for the motif. A more elaborate way of searching involves using regular expressions, which allow to search for more loosely defined motifs [e.g. C(C/G)AT]. Common algorithms for this type of search include Pattern Locator and the DNA Pattern Find method of the SMS2 suite, but also some word processors. Finally, the mainstream way of conducting TF-binding site search is through the use of position-specific scoring matrices, which basically count the occurrences of each base at each position of the motif and use the inferred frequencies to score candidate sites. Algorithms in this last category include TFSEARCH, FITOM, CONSITE, TESS and MatInspector.
The principle of ChIP-chip is simple. The first step is to cross-link the protein-DNA complex. This is done using a fixating agent, such as formaldehyde. The cross-linking can later be reversed with heat. Cross-linking kills the cell, giving a snapshot of the bound TF at a given time. The cell is then lysed, the DNA sheared by sonication and the chromatin[2] (TF-DNA complexes) is pulled down using an antibody (i.e. immunoprecipitated). If an antibody for the TF is available, then it is used; otherwise, the TF is tagged with an epitope targeted by commercially available antibodies (the latter option is cheaper, but runs the risk of altering the TF's functionality). Cross-linking is then reversed to free the bound DNA, which is then amplified, labeled with a fluorophore and dumped onto a DNA-array. The scanned array reveals the genomic regions bound by the TF. The resolution is around ~500 bp as a result of the sonication step.
In motif discovery, we are given a set of sequences that we suspect harbor binding sites for a given transcription factor. A typical scenario is data coming from expression experiments, in which we wish to analyze the promoter regions of a set of genes that are up- or down-regulated under some condition. The goal of motif discovery is to detect the transcription factor binding motif (i.e. the sequence “pattern” bound by the TF), by assuming that it will be overrepresented in our sample of sequences. There are different strategies to accomplish this, but the standard approaches are expectation maximization (EM), Gibbs sampling and greedy search. Popular algorithms for motif discovery are MEME, Gibbs Motif Sampler and CONSENSUS. More recently, motif discovery algorithms that make use of phylogenetic footprinting (the idea that TF-binding sites will be conserved in the promoter sequences for the same gene in different species) have become available. These are not usually applied to complement experimental work, but can be used to provide a starting point for it. Popular algorithms include FootPrinter and PhyloGibbs.
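To make the overrepresentation idea concrete, the following Python sketch shows a simplified greedy search. It is a textbook-style illustration only, not the algorithm implemented in CONSENSUS, MEME or Gibbs Motif Sampler. It assumes one site of fixed length k per sequence, and the function names, the pseudocount and the toy sequences are hypothetical.

```python
from collections import Counter


def profile(sites, pseudocount=1.0):
    """Per-position base frequencies (with pseudocounts) for aligned sites."""
    total = len(sites) + 4 * pseudocount
    return [
        {base: (column.count(base) + pseudocount) / total for base in "ACGT"}
        for column in zip(*sites)
    ]


def best_kmer(seq, prof):
    """Return the k-mer of seq with the highest probability under the profile."""
    k = len(prof)

    def probability(kmer):
        p = 1.0
        for base, column in zip(kmer, prof):
            p *= column[base]
        return p

    return max((seq[i:i + k] for i in range(len(seq) - k + 1)), key=probability)


def conservation(sites):
    """Sum over columns of the count of the most common base (higher is better)."""
    return sum(Counter(column).most_common(1)[0][1] for column in zip(*sites))


def greedy_motif_search(seqs, k):
    """Seed with each k-mer of the first sequence, then greedily add one site
    per remaining sequence; keep the most conserved set of sites found."""
    best = None
    for i in range(len(seqs[0]) - k + 1):
        sites = [seqs[0][i:i + k]]
        for seq in seqs[1:]:
            sites.append(best_kmer(seq, profile(sites)))
        if best is None or conservation(sites) > conservation(best):
            best = sites
    return best


# Toy promoter fragments (hypothetical) that share the 6-mer CTGTAT.
seqs = ["AAGCTGTATGA", "CTGTATGGAAG", "GGACTGTATAA"]
print(greedy_motif_search(seqs, 6))  # → ['CTGTAT', 'CTGTAT', 'CTGTAT']
```

The greedy strategy is deterministic and fast, but it depends on the order of the sequences and can be misled by an unlucky first sequence, which is one reason EM and Gibbs sampling are often preferred on real data.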