Finally, you can unleash the full potential of your ChIP-Seq data quickly and easily. With BxChIPSeq 2.0, you can focus on the biology without worrying about hardware, software, or algorithms. Best of all, the analysis results are easy to understand, and there is no steep learning curve.
BxChIPSeq 2.0 helps you to extract the rich biological information in your sequencing run. You can expand your spreadsheet to include much more:
With BxChIPSeq 2.0, all you need to do is to send us your raw sequence data, and within a week you will have access to all the analysis mentioned above for your data in a secure website. You can log into your webpage anytime from anywhere, and you can share the results with your team members and collaborators by giving them access to your webpage.
Simple Pricing: Each ChIP-Seq analysis is only $199.00, about 10% of what you have already spent on running ChIP experiments and generating sequencing data. With this small investment, you can delve into deeper layers of your data and get 5x or 10x more biological insight from your ChIP-Seq experiments.
With BxChIPSeq 2.0 service, you can access all the data analysis outputs from a secure webpage built from your raw ChIP-Seq data. You can display your data in UCSC genome browser, view DNA motifs, and identify enriched biological functions and pathways. Click each of the tabs below to learn more about the data outputs.
It's very useful to visualize your ChIP-Seq data in a genome browser. You can confirm binding events by examining sequence tag coverage from IP and control data, and you can review peaks in the genomic context of genes, conservation, expression, and other biological information.
With the BxChIPSeq 2.0 service, you can open the UCSC browser and view your data with a single click. The UCSC browser loads the appropriate file directly from your data webpage, so you don't need to upload a large file as a custom track. You can also share the link with your collaborators so they can view the data directly.
Review ChIP-Seq data in UCSC Browser
In this example, we are showing the sequence coverage (normalized tag counts) from ChIP sample and the peak calls as two custom tracks. Here you can browse the data along with other useful tracks provided by UCSC, like genes, conservation with other species, repeats, SNPs, and many more.
You can add all your data to UCSC genome browser by clicking one link at a time from the webpage for your own data.
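For readers curious how such a one-click link can work, here is a minimal sketch. It assumes the link uses the UCSC `hgt.customText` URL parameter, which tells the browser to fetch a custom track from a remote URL; whether BxChIPSeq builds its links this way is an assumption. The `example.org` file URL, the `ucsc_link` helper, and the genomic position are hypothetical.

```python
from urllib.parse import urlencode

UCSC_HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def ucsc_link(genome_build, track_url, position=None):
    """Build a UCSC Genome Browser URL that loads a remotely hosted custom track.

    genome_build: UCSC database name, e.g. "hg19" or "mm9".
    track_url:    public URL of the custom track file (hypothetical here).
    position:     optional region to open, e.g. "chr1:1000000-2000000".
    """
    params = {"db": genome_build, "hgt.customText": track_url}
    if position:
        params["position"] = position
    # urlencode percent-escapes the track URL so it survives as a query value.
    return UCSC_HGTRACKS + "?" + urlencode(params)


# Hypothetical file location and region, for illustration only.
link = ucsc_link(
    "hg19",
    "https://example.org/mydata/TCF7L2A_peaks.bed",
    position="chr1:1000000-2000000",
)
print(link)
```

Because the link carries only a URL and not the data itself, it stays short enough to paste into an email to a collaborator.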
Another popular genome browser is the Integrative Genomics Viewer (IGV) from the Broad Institute. Many researchers like to use IGV because it is faster for large data sets, and researchers can view raw sequence reads with it.
We provide output files (TDF file for sequence coverage, bed file for peaks) that can be loaded into IGV.
One great use of ChIP-Seq data is motif search: identifying the DNA sequences that might be responsible for the factor occupying a region. These DNA sequences are commonly called motifs or cis-elements. Motif search also sheds light on other factors that may work together with the transcription factor you used in ChIP, because their binding sites will also be enriched in the peaks due to co-regulation.
The BxChIPSeq 2.0 service generates motif search results and lists them on a webpage where you can easily browse and search for more information.
Motif Report
In this case, we searched for motifs in a ChIP-Seq experiment for the TCF7L2 transcription factor in the HepG2 cell line. The top motif matches the known motif for tcf3, another factor in the same family. A few other motifs also rank high, suggesting possible co-regulation by those factors (FOXA1, HNF1A, GATA1, etc.) together with TCF7L2.
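To give a feel for what "enriched in the peaks" means, the sketch below counts short DNA words (k-mers) in peak sequences and compares their frequency with a background set. This is a deliberately simplified stand-in for a real motif search, not the method BxChIPSeq uses; the function names, the pseudocount, and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of DNA sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enriched_kmers(peak_seqs, background_seqs, k=6, pseudocount=1.0):
    """Rank k-mers by (frequency in peaks) / (frequency in background).

    The pseudocount avoids division by zero for k-mers absent from the
    background. Returns (kmer, ratio) pairs, highest ratio first.
    """
    fg = kmer_counts(peak_seqs, k)
    bg = kmer_counts(background_seqs, k)
    fg_total = sum(fg.values()) or 1
    bg_total = sum(bg.values()) or 1
    scores = {}
    for kmer, n in fg.items():
        fg_freq = (n + pseudocount) / (fg_total + pseudocount)
        bg_freq = (bg.get(kmer, 0) + pseudocount) / (bg_total + pseudocount)
        scores[kmer] = fg_freq / bg_freq
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy sequences: the word CATG is planted in every "peak".
peaks = ["AACATGAA", "CCCATGCC", "GGCATGGG"]
background = ["AAAACCCC", "GGGGTTTT", "ACACACAC"]
ranked = enriched_kmers(peaks, background, k=4)
print(ranked[0])
```

Real motif finders go further, for example by scoring degenerate position weight matrices and both DNA strands, but the underlying idea of comparing peaks against a background is the same.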
For each ChIP-Seq experiment, researchers can identify many genes that are regulated by the factor (or histone modification) due to the fact that peaks occur within the promoters of these genes. A natural next step is to see if there are common themes for these genes like biological function, cellular location, or protein domains. This kind of information can shed light on the biological role of the factor in the tissue tested.
BxChIPSeq 2.0 searches for enrichment across multiple commonly used functional categories, including Gene Ontology, KEGG Pathways, InterPro, WikiPathways, etc. The reports can be accessed from the webpage or downloaded for viewing in Excel.
Gene Ontology Report
Enriched Biological Processes (from Gene Ontology) for target genes from a ChIP-Seq experiment for the TCF7L2 transcription factor in the HepG2 cell line.
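Enrichment of a functional category among target genes is commonly assessed with a hypergeometric (one-sided Fisher) test. The sketch below shows that calculation under the assumption that such a test is appropriate here; the source does not state which statistic BxChIPSeq reports, and the function name and example numbers are hypothetical.

```python
from math import comb


def hypergeom_pvalue(population, annotated, drawn, hits):
    """Probability of seeing at least `hits` annotated genes by chance.

    population: total number of genes considered
    annotated:  genes in the population that belong to the category
    drawn:      number of target genes from the ChIP-Seq experiment
    hits:       target genes that belong to the category
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes, 4 in the category, 3 targets, 2 of them in the category.
p = hypergeom_pvalue(population=10, annotated=4, drawn=3, hits=2)
print(round(p, 4))
```

A small p-value suggests the category appears among the target genes more often than random sampling would explain. When many categories are tested at once, a multiple-testing correction is normally applied on top of this.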
A typical ChIP-Seq experiment reports many peaks around the genome. Depending on the transcription factor, the peaks may occur mostly at promoters, or they may be located in exons, introns, UTRs, CpG islands, or intergenic regions.
BxChIPSeq 2.0 searches for enriched genomic features and creates both a webpage and a text report.
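As a rough illustration of how peaks can be tallied by genomic feature, the sketch below assigns each peak to the first feature interval that contains the peak midpoint. The midpoint rule, the priority order, and all coordinates are assumptions made for this example, not a description of the BxChIPSeq pipeline.

```python
from collections import Counter


def classify_peak(chrom, start, end, features):
    """Return the label of the first feature containing the peak midpoint.

    `features` is a list of (chrom, start, end, label) tuples, listed in
    priority order; peaks matching nothing are called "intergenic".
    """
    midpoint = (start + end) // 2
    for f_chrom, f_start, f_end, label in features:
        if f_chrom == chrom and f_start <= midpoint < f_end:
            return label
    return "intergenic"


def feature_summary(peaks, features):
    """Count how many peaks fall into each feature class."""
    return Counter(classify_peak(c, s, e, features) for c, s, e in peaks)


# Toy annotation and peaks (hypothetical coordinates).
features = [
    ("chr1", 1000, 2000, "promoter"),
    ("chr1", 2000, 5000, "intron"),
    ("chr1", 1500, 2500, "CpG island"),
]
peaks = [
    ("chr1", 1400, 1600),
    ("chr1", 3000, 3200),
    ("chr1", 9000, 9200),
    ("chr2", 1400, 1600),
]
summary = feature_summary(peaks, features)
print(dict(summary))
```

Comparing these counts with the fraction of the genome each feature class covers is what turns a tally into an enrichment estimate.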
Despite all the graphic reports, many researchers still need a comprehensive list of the peaks from a ChIP-Seq experiment to view in Excel. BxChIPSeq 2.0 creates a detailed peak report that contains a wealth of information to help researchers analyze the data. As a friendly reminder, it's always useful to use the list in combination with a genome browser and the other reports provided by BxChIPSeq to get the most out of your data.
To demonstrate the capabilities of BxChIPSeq 2.0, we downloaded two sets of ChIP-Seq experiment data from the NIH SRA database: one uses Illumina sequencing for transcription factor binding, and the other uses SOLiD sequencing for histone modification.
TCF7L2 transcription factor in HepG2 cell line. The experiment contains two technical replicates for ChIP experiments, and one input control.
Histone H3K4me3 modification in mouse brain. Brain tissue from a 10-week-old male BALB/c mouse was used in the study. The experiment contains a single run for the ChIP experiment and no input control.
Experiment | Control | ChIPSeq Name | SRA Accession | # of Spots | # of bases |
H3K4me3_Brain | N/A | H3K4me3_Brain | SRX119340 | 52,457,979 | 2.6G |
1. Review Summary
Example of BxChIPSeq 2.0 Output
BxChIPSeq 2.0 generates a webpage for each ChIP-Seq experiment, plus an extra page for the control sample so you can view control tracks in genome browsers as well. Let's use the TCF7L2A page as an example.
We have put notes in red to help you get started.
2. Tutorial: Review data in UCSC Genome Browser
Review ChIP-Seq data in UCSC Browser
If you want to see the sequence tag coverage and peak calls for TCF7L2A ChIP experiments in UCSC genome browser, just click the "Review in UCSC Genome browser" links. Be patient, as it may take some time for UCSC browser to load the file from your data page, especially for the large sequence coverage file. Once the files are loaded, you will have two custom tracks, one for sequence tag coverage, the other for peaks. Now you can go to any genomic region, or search for your favorite gene within UCSC genome browser. You can also add other annotation tracks hosted at UCSC.
Review ChIP-Seq data in UCSC Browser with Default Settings (Input Track too high due to auto-scaling)
If you also want to display the input channel in the browser, go back to the home page, open the input webpage, and click the "Review in UCSC Genome browser" link under sequence coverage files. Wait for the browser to load the file, and you will have both tracks.
But wait, it looks like there are many peaks in the input channel! On careful examination, you will see that this is because UCSC auto-scaled the sequence tag coverage tracks, which makes the noise from the input channel look artificially tall in the display.
Open Configure Display Settings Window for Custom Track
What you need is the same display range for the two tracks. To set it, move the mouse over the ChIP track, right-click (or Control-click on a Mac), and select configure. A new window will pop up; enter 35 as the maximum of the vertical viewing range and click OK.
Review ChIP-Seq data in UCSC Browser with Correct Settings (Same Display Range for ChIP and Input Tracks)
Next do the same for the input channel to set the vertical viewing range to 0-35. Now you have set the vertical display range to be the same for ChIP and Input tracks, and the data will be comparable between the two tracks. You can clearly see the very strong peak in ChIP track, and almost no signal from input control.
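If you prepare your own custom track files, the same fixed range can be set in the track definition line instead of through the configure window. The sketch below assumes a bedGraph coverage track and uses the UCSC track line attributes `autoScale` and `viewLimits`; the helper name and track names are hypothetical, and the file format BxChIPSeq actually serves is not stated in the source.

```python
def fixed_scale_track_line(name, description, y_max=35, y_min=0):
    """Return a UCSC bedGraph track line with auto-scaling turned off.

    Using the same y_min and y_max for the ChIP and input tracks keeps
    their signal heights directly comparable in the browser.
    """
    return (
        f'track type=bedGraph name="{name}" description="{description}" '
        f"visibility=full autoScale=off viewLimits={y_min}:{y_max}"
    )


# Hypothetical track names; both tracks share the 0-35 range.
chip_line = fixed_scale_track_line("TCF7L2A", "ChIP tag coverage")
input_line = fixed_scale_track_line("Input", "Input tag coverage")
print(chip_line)
print(input_line)
```

The returned line goes at the top of the bedGraph file, ahead of the data rows.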
Now you can add more tracks or change the view, and enjoy exploring your ChIP-Seq data in UCSC genome browser.
3. Tutorial: Review data in IGV
Integrative Genomics Viewer (IGV) is adopted by many researchers working with next-gen sequencing data due to its speed and capability to handle large data sets. BxChIPSeq provides output files (TDF file for sequence coverage, bed file for peaks) that can be loaded into IGV.
Step 1. Install IGV. Go to the IGV website, register for a free account with your email address, then download and install IGV on your computer. Launching IGV with 750 MB of memory is enough for most users, but if your computer can handle it, launch with more memory to make it run faster.
Step 2. Download the TDF files for sequence coverage files and the bed files for peaks. Save them to a folder you can easily locate. You may need to move the files from the default place where your browser saves files to your destination folder.
Step 3. Launch IGV, select the appropriate genome build (e.g. hg19 or mm9) that matches your data. You can find the genome build information in the webpage for your ChIP-Seq data.
Open Set Data Range Option in IGV
Step 5. Adjust display range. We need to make the display range the same for all ChIP and Input tracks.
Set Data Range for Sequence Tag Count Tracks
In the Data Range window, enter 35 as the maximum. The sequence tag count file has been normalized to 10 million tags for all experiments, and we have found that 35 is a good starting point. However, you may want to increase this to 100-200 for very tall peaks, or decrease this number to ~10 to view weak peaks better.
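The normalization mentioned above is simple arithmetic: each raw tag count is rescaled as if the experiment had exactly 10 million tags. A minimal sketch, assuming plain linear scaling (the exact procedure used by BxChIPSeq is not described in the source) and using hypothetical numbers:

```python
def normalize_tag_count(raw_count, total_tags, target_tags=10_000_000):
    """Scale a raw tag count to a library of `target_tags` total tags."""
    if total_tags <= 0:
        raise ValueError("total_tags must be positive")
    return raw_count * target_tags / total_tags


# Hypothetical example: 70 tags at a position in a 20-million-tag library
# corresponds to 35 tags after normalization to 10 million.
normalized = normalize_tag_count(70, 20_000_000)
print(normalized)
```

Because every experiment is scaled to the same library size, one display maximum (such as 35) can be reused across ChIP and input tracks.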
View ChIP-Seq Data in IGV with Same Data Range for All Tracks
Step 6. Now you can enter a genomic range or a gene name in the search box in IGV. For example, for the demo ChIP-Seq experiment data for the TCF7L2 transcription factor in the HepG2 cell line, you can enter the gene name VAV3 to see three strong binding sites at or near this gene.
With IGV, you have many options to display your data. Please see the IGV User Guide for more information.
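If you repeat these steps often, IGV can also run them from a batch script. The sketch below writes such a script using the IGV batch commands `new`, `genome`, `load`, and `goto`; the helper name and file names are hypothetical, and the display range still needs to be set as described in Step 5.

```python
def igv_batch_script(genome, files, locus):
    """Return IGV batch commands that load files and jump to a locus.

    genome: genome build ID, e.g. "hg19" or "mm9"
    files:  paths to the downloaded TDF and BED files
    locus:  a genomic range or gene name, e.g. "VAV3"
    """
    lines = ["new", f"genome {genome}"]
    lines += [f"load {path}" for path in files]
    lines.append(f"goto {locus}")
    return "\n".join(lines) + "\n"


# Hypothetical file names for the demo experiment.
script = igv_batch_script(
    "hg19",
    ["TCF7L2A_coverage.tdf", "TCF7L2A_peaks.bed"],
    "VAV3",
)
print(script)
```

Save the output to a text file and run it from the IGV Tools menu (Run Batch Script).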
4. View the annotated peaks file in Excel and find all target genes
This tutorial can also be applied to other text files (e.g. gene ontology output files) from the BxChIPSeq output.
To learn more about the columns for the annotated peak file, see the Peak Report for details.
Import Peak Report Text File to Excel
Step 1. Save the text file to your local drive. You can do this by right-clicking (or Control-clicking on a Mac) the Annotated Peak File link and choosing "save link as" or "save target as". Remember where you saved the file.
Turn on AutoFilter in Excel
Step 3. Turn on AutoFilter in Excel.
Now that you have the annotated peak file open in Excel, let's make it easier to use. Turning on AutoFilter gives you quick access to many useful sorting and filtering features.
To turn on AutoFilter, select all the cells and press Ctrl-Shift-L.
Create Custom AutoFilter to Select Promoters
Step 4. Find all target genes whose promoters are occupied by the factor
With AutoFilter, you can do a lot of sorting and filtering in Excel. Here we will show you how to quickly identify all the target genes for a factor.
In the peak report file, if a peak falls near a promoter of a gene, it is listed in the annotation field. To filter for peaks that fall into promoters, do the following for column H.
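If you prefer a script to Excel, the same promoter filter can be sketched in Python. The example assumes a tab-delimited file in which column H (index 7) holds the annotation, as described above, that promoter annotations begin with the word "promoter", and that gene symbols sit under a header called "Gene Name". The last two points, along with the toy rows, are assumptions to check against your own file.

```python
import csv
import io


def promoter_target_genes(lines, annotation_col=7, gene_header="Gene Name"):
    """Return sorted gene names for peaks annotated as promoter peaks.

    lines:          an open text file or any iterable of tab-delimited lines
    annotation_col: zero-based index of the annotation column (H is 7)
    gene_header:    header text of the gene name column (an assumption)
    """
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    gene_col = header.index(gene_header)
    genes = set()
    for row in reader:
        if len(row) <= max(annotation_col, gene_col):
            continue  # skip short or blank rows
        if row[annotation_col].lower().startswith("promoter") and row[gene_col]:
            genes.add(row[gene_col])
    return sorted(genes)


# Toy peak report with hypothetical values, for illustration only.
toy_report = io.StringIO(
    "PeakID\tChr\tStart\tEnd\tStrand\tScore\tSize\tAnnotation\tGene Name\n"
    "peak1\tchr1\t100\t300\t+\t50\t200\tpromoter-TSS (toy1)\tVAV3\n"
    "peak2\tchr1\t900\t1100\t+\t20\t200\tintron (toy2)\tFOXA1\n"
    "peak3\tchrX\t500\t700\t+\t35\t200\tpromoter-TSS (toy3)\tGATA1\n"
)
targets = promoter_target_genes(toy_report)
print(targets)
```

For a real report, pass an open file instead of the toy text, for example `open("annotated_peaks.txt", newline="")`, where the file name is whatever you chose in Step 1.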
A. To learn more about some of the tools and technologies related to BxChIPSeq 2.0, please check these references.
A. It's very simple. After we receive your order, we will give you instructions to ftp your raw sequence data to us. Once we receive the data, we will build a secure website containing all analysis results from your data within a week. You will receive a link and password to view your data.
A. Typically researchers send us the raw FASTQ files. We can also take data in the sequence read archive format (.sra), or aligned SAM files from programs like Bowtie.
A. One year. After one year, you can pay a nominal fee to keep the data on the website, which is convenient if you often use the link to load custom track to UCSC genome browser.
A. Absolutely. You can download all the data to your local drive. The advantage of the website is that you can access it from any computer, anywhere with an internet connection.
A. Yes, you can simply create a new analysis in the Sample Submission Form, with the drug-treated sample as the ChIP run and the untreated ChIP data as the control run.
A. Only you have the URL and passwords to your data webpage. You can share the user name and password with your team members or collaborators. If you really want to delete the data on the website, please inform us after you've downloaded a local copy.
1. Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10(3):R25. Epub 2009 Mar 4. PubMed PMID: 19261174; PubMed Central PMCID: PMC2690996.
Bowtie is a fast tool to align sequence reads to the genome.
2. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D. The human genome browser at UCSC. Genome Res. 2002 Jun;12(6):996-1006. PubMed PMID: 12045153; PubMed Central PMCID: PMC186604.
UCSC Genome Browser is the most popular tool to view most genomes.
3. Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, Cheng JX, Murre C, Singh H, Glass CK. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 2010 May 28;38(4):576-89. PubMed PMID: 20513432; PubMed Central PMCID: PMC2898526.
There are many tools for peak finding in ChIP-Seq data. HOMER stands out because it provides one of the most comprehensive feature sets, along with detailed annotations of the peaks and genes.
4. Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. Integrative genomics viewer. Nat Biotechnol. 2011 Jan;29(1):24-6. PubMed PMID: 21221095.
Integrative Genomics Viewer (IGV) is adopted by many researchers working with next-gen sequencing data due to its speed and capability to handle large data sets.