Process next-generation sequencing data online

Finally, you can unleash the full potential of your ChIP-Seq data quickly and easily. With the biologist-friendly web interface powered by BxChIPSeq 2.0, you can focus on the biology without worrying about hardware, software, or algorithms. Best of all, the analysis results are simple to use, with no steep learning curve.

 

Introducing BxChIPSeq 2.0: Analyzing ChIP-Seq Results Has Never Been This Easy!

 

What is BxChIPSeq 2.0?

You have spent a lot of time working out the right conditions and antibody for chromatin immunoprecipitation, and you have paid precious grant money for the sequencing run. What's next? Your ChIP-Seq data contain rich biological information, but you are only looking at the tip of the iceberg if all you have is an Excel spreadsheet listing peaks with some annotations. You can do a lot more, including viewing your data in genome browsers, finding DNA binding motifs, identifying enriched biological functions and genomic features, and reviewing a fully annotated peak report.

 
   

With BxChIPSeq 2.0, all you need to do is send us your raw sequence data, and within a week you will have access to all the analyses mentioned above on a secure website. You can log into your webpage anytime from anywhere, and you can share the results with your team members and collaborators by giving them access to your webpage.

And this convenient service is very affordable. The price for each ChIP-Seq analysis is only $199, about 10% of what you have already spent on running ChIP experiments and generating sequencing data. With this small investment, you can delve into deeper layers of your data and get 5x or 10x more biological insight from your ChIP-Seq experiments.

   

Unleash the power of your ChIP-Seq data today!

With the BxChIPSeq 2.0 service, you can access all the data analysis outputs from a secure webpage built from your raw ChIP-Seq data. You can display your data in the UCSC Genome Browser, view DNA motifs, and identify enriched biological functions and pathways. Click each of the tabs below to learn more about the data outputs.

Genome Browsers
DNA Binding Motif
Gene Ontology
Genomic Features
Peak Report

Review your ChIP-Seq Data in Genome Browsers

 

It's very useful to visualize your ChIP-Seq data in a genome browser. You can confirm binding events by examining sequence tag coverage from IP and control data, and you can review peaks in the genomic context of genes, conservation, expression, and other biological information.

Compatible with the UCSC Browser

With the BxChIPSeq 2.0 service, you can view your data in the UCSC browser with a single click. The UCSC browser loads the appropriate file directly from your data webpage, so you don't need to upload a large file as a custom track. You can also share the link with your collaborators so they can view the data directly.
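
As a rough sketch of what such a one-click link can look like: the UCSC Genome Browser accepts an `hgt.customText` URL parameter that points to a remotely hosted custom track file. The file URL below is hypothetical, and whether BxChIPSeq builds its links in exactly this way is an assumption.

```python
from urllib.parse import urlencode

UCSC_HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def ucsc_custom_track_link(genome_build: str, track_url: str) -> str:
    """Build a UCSC Genome Browser link that loads a remotely hosted custom track."""
    query = urlencode({"db": genome_build, "hgt.customText": track_url})
    return f"{UCSC_HGTRACKS}?{query}"


# Hypothetical location of a peak file on a results webpage.
link = ucsc_custom_track_link("hg19", "https://example.org/results/TCF7L2A_peaks.bed")
print(link)
```

Anyone who opens the printed link sees the track without uploading the file themselves, which is what makes sharing with collaborators straightforward.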

Review ChIP-Seq data in UCSC Browser

In this example, we are showing the sequence coverage (normalized tag counts) from ChIP sample and the peak calls as two custom tracks. Here you can browse the data along with other useful tracks provided by UCSC, like genes, conservation with other species, repeats, SNPs, and many more.

You can add all your data to UCSC genome browser by clicking one link at a time from the webpage for your own data.

 

Compatible with the Integrative Genomics Viewer (IGV)

Another popular genome browser is the Integrative Genomics Viewer (IGV) from the Broad Institute. Many researchers like to use IGV because it is faster for large data sets, and researchers can view raw sequence reads with it.

We provide output files (a TDF file for sequence coverage and a BED file for peaks) that can be loaded into IGV.

Review ChIP-Seq data in IGV

In this example, we are showing the sequence coverage (normalized tag counts) from two ChIP samples and the input sample, as well as the peak calls in IGV.

Identify DNA Binding Motifs for Transcription Factors

 

One great use of ChIP-Seq data is motif search: identifying the DNA sequences that might be responsible for the factor occupying a region. These DNA sequences are commonly called motifs or cis-elements. Motif search can also shed light on other factors that may work together with the transcription factor you used in ChIP, because their binding sites will also be enriched in the peaks due to co-regulation.

The BxChIPSeq 2.0 service generates motif search results and lists them on a webpage where you can easily browse and search for more information.

Motif Report

In this case, we searched for motifs in a ChIP-Seq experiment for the TCF7L2 transcription factor in the HepG2 cell line. The top motif matches the known motif for tcf3, another factor in the same family. A few other motifs also rank high, suggesting possible co-regulation with TCF7L2 by those other factors (FOXA1, HNF1A, GATA1, etc.).

Functional Implication of Genes Regulated by the Factor

 

For each ChIP-Seq experiment, researchers can identify many genes that are regulated by the factor (or histone modification) due to the fact that peaks occur within the promoters of these genes. A natural next step is to see if there are common themes for these genes like biological function, cellular location, or protein domains. This kind of information can shed light on the biological role of the factor in the tissue tested.

BxChIPSeq 2.0 searches for enrichment across multiple commonly used functional categories, including Gene Ontology, KEGG Pathways, InterPro, WikiPathways, etc. The reports can be accessed from the webpage or downloaded for viewing in Excel.
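
Enrichment of a functional category is commonly assessed with a hypergeometric (one-sided Fisher) test. The sketch below shows that general idea only: this page does not state which statistical test BxChIPSeq 2.0 uses, and the gene counts are made-up illustration values.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, genes_in_category, target_genes, targets_in_category):
    """Probability of seeing at least `targets_in_category` category members
    when `target_genes` genes are drawn at random from `total_genes`."""
    max_overlap = min(genes_in_category, target_genes)
    all_draws = comb(total_genes, target_genes)
    tail = sum(
        comb(genes_in_category, k) * comb(total_genes - genes_in_category, target_genes - k)
        for k in range(targets_in_category, max_overlap + 1)
    )
    return tail / all_draws


# Made-up numbers: 20,000 genes in the genome, 300 in the category,
# 500 target genes from the ChIP-Seq experiment, 25 of them in the category.
p_value = hypergeom_enrichment_p(20000, 300, 500, 25)
print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value means the category appears among the target genes more often than random sampling would explain. When many categories are tested, a multiple-testing correction is normally applied as well.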

Gene Ontology Report

Enriched Biological Processes (from Gene Ontology) for target genes from a ChIP-Seq experiment for TCF7L2 transcription factor in HepG2 cell line.

Protein Domain Report

Enriched protein domains (from InterPro) for target genes from a ChIP-Seq experiment for the TCF7L2 transcription factor in the HepG2 cell line.

Where in the Genome Do the Peaks Occur More Often?

 

A typical ChIP-Seq experiment will report many peaks around the genome. Depending on the transcription factor, the peaks may occur mostly at promoters, or they may be located in exons, introns, UTRs, CpG islands, or intergenic regions.

BxChIPSeq 2.0 searches for enriched genomic features and creates a webpage and a text report.

Genomic Feature Report

Enriched genomic features for peaks from a ChIP-Seq experiment for the TCF7L2 transcription factor in the HepG2 cell line. Here the peaks tend to occur more often in gene-rich regions and at promoters, consistent with TCF7L2's role as a transcription factor.

The List of All Peaks with Detailed Annotations

Despite all the graphic reports, many researchers still need a comprehensive list of the peaks from a ChIP-Seq experiment to view in Excel. BxChIPSeq 2.0 creates a detailed peak report that contains a wealth of information to help researchers analyze the data. As a friendly reminder, it is always useful to use the list in combination with the genome browser and other reports provided by BxChIPSeq to get the most out of your data.

 

Peak Report with Annotations

Here the peak annotation file is shown open in Excel. The columns can be divided into four major categories, covering peak, annotations, nearest gene and sequence tag count.


Welcome to Demo Data Page for BxChIPSeq 2.0

 

To demonstrate the capabilities of BxChIPSeq 2.0, we downloaded two ChIP-Seq data sets from the NIH SRA database: one uses Illumina sequencing for transcription factor binding, the other uses SOLiD sequencing for histone modification.

Illumina Sequencing For Transcription Factor Binding

TCF7L2 transcription factor in HepG2 cell line. The experiment contains two technical replicates for ChIP experiments, and one input control.

Study summary: GSE31477: ENCODE Transcription Factor Binding Sites by ChIP-Seq from Stanford/Yale/USC/Harvard (SRP007993)

Instrument model: Illumina Genome Analyzer IIx

Processing pipeline: Base Caller v

Species: Human (hg19)

Notes: ChIP-Seq experiment data for TCF7L2 transcription factor in HepG2 cell line from NIH SRA Database

Experiment | Control | ChIPSeq Name | SRA Accession | # of Spots | # of Bases
TCF7L2A    | Input   | TCF7L2A      | SRR340077     | 26,710,376 | 961.6M
TCF7L2B    | Input   | TCF7L2B      | SRR340078     | 25,005,577 | 900.2M
Input      | N/A     | Input        | SRR353506     | 28,007,793 | 896.2M

SOLiD Sequencing For Histone Modification

Histone H3K4me3 modification in mouse brain. Brain tissue from a 10-week-old male BALB/c mouse was used in the study. The experiment contains a single ChIP run and no input control.

Instrument model: ABI SOLiD System 3.0

Spot Descriptor: Forward

Species: Mouse (mm9)

Notes: ChIP-Seq experiment data H3K4me3 modification of mouse brain from NIH SRA Database

Experiment    | Control | ChIPSeq Name  | SRA Accession | # of Spots | # of Bases
H3K4me3_Brain | N/A     | H3K4me3_Brain | SRX119340     | 52,457,979 | 2.6G

Understanding the BxChIPSeq Report

 
Table of Contents
  1. Review Summary
  2. Review data in UCSC Genome Browser
  3. Review data in IGV
  4. Review the annotated peaks in Excel and find all target genes

Tutorial: Review Summary

BxChIPSeq 2.0 generates a webpage for each ChIP-Seq experiment, plus an extra page for the control sample so you can view control tracks in genome browsers as well. Let's use the TCF7L2A page as an example.

We have put notes in red to help you get started.

Example of BxChIPSeq 2.0 Output

 
 

Tutorial: Review data in UCSC Genome Browser

If you want to see the sequence tag coverage and peak calls for the TCF7L2A ChIP experiment in the UCSC Genome Browser, just click the "Review in UCSC Genome browser" links. Be patient, as it may take some time for the UCSC browser to load the files from your data page, especially the large sequence coverage file. Once the files are loaded, you will have two custom tracks, one for sequence tag coverage and the other for peaks. Now you can go to any genomic region, or search for your favorite gene within the UCSC Genome Browser. You can also add other annotation tracks hosted at UCSC.

Review ChIP-Seq data in UCSC Browser

If you also want to display the input channel in the browser, go back to the home page, open the input webpage, and click the "Review in UCSC Genome browser" link under sequence coverage files. Wait for the browser to load the file, and you will have both tracks.

But wait, it looks like there are many peaks in the input channel! On closer examination, you will see that this is because UCSC auto-scaled the sequence tag coverage tracks, which makes the noise in the input channel look artificially tall in the display.

Review ChIP-Seq data in UCSC Browser with Default Settings (Input Track too high due to auto-scaling)

What you need is the same display range for the two tracks. To set it, move the mouse over the ChIP track, right-click (or Control+click on a Mac), and select configure. In the window that pops up, enter 35 as the maximum of the vertical viewing range and click OK.

Open Configure Display Settings Window for Custom Track

Configure Display Settings for Custom Track in UCSC Browser

Next do the same for the input channel to set the vertical viewing range to 0-35. Now you have set the vertical display range to be the same for ChIP and Input tracks, and the data will be comparable between the two tracks. You can clearly see the very strong peak in ChIP track, and almost no signal from input control.

Review ChIP-Seq data in UCSC Browser with Correct Settings (Same Display Range for ChIP and Input Tracks)
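
A fixed range can also be requested in a custom track's header line instead of through the configure window. As a sketch, UCSC bedGraph and wiggle track lines accept `autoScale` and `viewLimits` settings. The track names here are made up, and this page does not say whether the files produced by BxChIPSeq include such a line.

```python
def bedgraph_track_line(name: str, description: str, view_max: int = 35) -> str:
    """Return a UCSC bedGraph track line with auto-scaling off and a fixed vertical range."""
    return (
        f'track type=bedGraph name="{name}" description="{description}" '
        f"autoScale=off viewLimits=0:{view_max} visibility=full"
    )


# Use the same 0-35 range for the ChIP and input tracks so they are comparable.
for sample in ("TCF7L2A", "Input"):
    print(bedgraph_track_line(sample, f"{sample} tag coverage"))
```

Giving both tracks identical `viewLimits` has the same effect as the manual steps above: the noise in the input track is no longer stretched to fill the display.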

Now you can add more tracks or change the view, and enjoy exploring your ChIP-Seq data in UCSC genome browser.

 

Tutorial: Review data in IGV

Integrative Genomics Viewer (IGV) is adopted by many researchers working with next-gen sequencing data due to its speed and capability to handle large data sets. BxChIPSeq provides output files (TDF file for sequence coverage, bed file for peaks) that can be loaded into IGV.

Step 1. Install IGV. Go to the IGV website, register for a free account with your email address, then download and install IGV on your computer. Launching IGV with 750 MB of memory is enough for most users, but if your computer can handle it, launch with more memory to make it run faster.

Step 2. Download the TDF files for sequence coverage and the BED files for peaks. Save them to a folder you can easily locate. You may need to move the files from your browser's default download location to your destination folder.

Step 3. Launch IGV, select the appropriate genome build (e.g. hg19 or mm9) that matches your data. You can find the genome build information in the webpage for your ChIP-Seq data.

Step 4. Load the TDF and bed files.

Load the TDF and BED Files into IGV

Step 5. Adjust display range. We need to make the display range the same for all ChIP and Input tracks.

Open Set Data Range Option in IGV

In the Data Range window, enter 35 as the maximum. The sequence tag count file has been normalized to 10 million tags for all experiments, and we have found that 35 is a good starting point. However, you may want to increase this to 100-200 for very tall peaks, or decrease this number to ~10 to view weak peaks better.

Set Data Range for Sequence Tag Count Tracks
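
Normalizing to 10 million tags amounts to scaling each raw coverage value by 10,000,000 divided by the sample's total tag count, which is why a single display range can be reused across samples. A minimal sketch, assuming simple linear scaling (the exact normalization used by BxChIPSeq is not described here) and made-up tag totals:

```python
NORMALIZED_TOTAL = 10_000_000  # all samples are scaled to 10 million tags


def normalize_coverage(raw_count: float, total_tags: int) -> float:
    """Scale a raw tag count to what it would be in a 10-million-tag sample."""
    if total_tags <= 0:
        raise ValueError("total_tags must be positive")
    return raw_count * NORMALIZED_TOTAL / total_tags


# Made-up example: the same relative signal in two samples of different depth.
shallow = normalize_coverage(70, 20_000_000)   # 70 tags in a 20M-tag sample
deep = normalize_coverage(140, 40_000_000)     # 140 tags in a 40M-tag sample
print(shallow, deep)  # both are 35.0
```

After scaling, a peak that is twice as tall in raw counts only because the sample was sequenced twice as deeply ends up at the same height.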

Step 6. Enter a genomic range or a gene name in the search box in IGV. For example, for the demo ChIP-Seq data for the TCF7L2 transcription factor in the HepG2 cell line, you can enter the gene name VAV3 to see three strong binding sites at or near this gene.

View ChIP-Seq Data in IGV with Same Data Range for All Tracks

With IGV, you have many options to display your data. Please see the IGV User Guide for more information.

 

Tutorial: Review the annotated peaks in Excel and find all target genes

This tutorial can also be applied to other text files (e.g. gene ontology output files) from the BxChIPSeq output.

To learn more about the columns for the annotated peak file, see the Peak Report for details.

Step 1. Save the text file to your local drive. Right-click (or Control+click on a Mac) the Annotated Peak File link and choose "save link as" or "save target as". Remember where you saved the file.

Step 2. Start Excel. From the Open menu, go to the folder where you saved the peak file, choose All Files (*.*) as the file type, and open the text file (e.g. TCF7L2A_peaks.annotated.txt).

Import Peak Report Text File to Excel

Step 3. Turn on Auto filter in Excel.

Now that you have the annotated peak file open in Excel, let's make it easier to use. Turn on AutoFilter to access many useful features quickly.

To turn on AutoFilter, select all the cells and press Ctrl+Shift+L.

Turn on AutoFilter in Excel

Step 4. Find all target genes whose promoters are occupied by the factor

With auto filter, you can do a lot of sorting and filtering in Excel. Here we will show you how to quickly identify all the target genes for a factor.

In the peak report file, if a peak falls near a promoter of a gene, it is listed in the annotation field. To filter for peaks that fall into promoters, do the following for column H.

Create Custom AutoFilter to Select Promoters

Enter "promoter" in the filter for annotations, and Excel will display only the genes whose promoters are occupied by the factor.

All Genes with Peaks at Promoters
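
The same promoter filter can also be scripted. This is a sketch using Python's `csv` module, assuming the annotated peak file is tab-delimited with a header row. The column names `Annotation` and `Gene Name` and the sample rows are assumptions for illustration, so replace them with the headers in your own file.

```python
import csv
import io


def promoter_target_genes(handle, annotation_col="Annotation", gene_col="Gene Name"):
    """Return the sorted, unique gene names whose peak annotation mentions 'promoter'."""
    reader = csv.DictReader(handle, delimiter="\t")
    genes = set()
    for row in reader:
        annotation = (row.get(annotation_col) or "").lower()
        gene = (row.get(gene_col) or "").strip()
        if "promoter" in annotation and gene:
            genes.add(gene)
    return sorted(genes)


# Tiny made-up table standing in for an annotated peak file.
demo = io.StringIO(
    "PeakID\tAnnotation\tGene Name\n"
    "peak1\tpromoter-TSS\tGENE_A\n"
    "peak2\tintron\tGENE_B\n"
    "peak3\tPromoter-TSS\tGENE_C\n"
)
print(promoter_target_genes(demo))  # ['GENE_A', 'GENE_C']
```

To run it on a real report, pass an open file handle instead of the in-memory table, for example one opened on TCF7L2A_peaks.annotated.txt.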

Demo Data

 

In order to access the demo data, please register for a free account.

If you already have a BioInfoRx account, please click here to sign in.

 

TCF7L2 Transcription Factor ChIP (Illumina Sequencing)

 

Conditions

Species: Human (hg19)
Notes: ChIP-Seq experiment data for TCF7L2 transcription factor in HepG2 cell line from NIH SRA Database

Results

Experiment | Results | Control | ChIPSeq Name
TCF7L2A    | View    | Input   | TCF7L2A
TCF7L2B    | View    | Input   | TCF7L2B
Input      | View    | N/A     | Input
 

Histone H3K4me3 modification in mouse brain (SOLiD Sequencing)

 

Conditions

Species: Mouse (mm9)
Notes: ChIP-Seq experiment data H3K4me3 modification of mouse brain from NIH SRA Database

Results

Experiment    | Results | Control | ChIPSeq Name
H3K4me3_Brain | View    | N/A     | H3K4me3_Brain

Frequently Asked Questions

 

Q. How does the service work?

A. It's very simple. After we receive your order, we will send you instructions for transferring your raw sequence data to us by FTP. Within a week of receiving the data, we will build a secure website containing all the analysis results. You will receive a link and password to view your data.

Q. What data format do you need for the raw data?

A. Researchers typically send us raw FASTQ files. We can also take Sequence Read Archive (.sra) files, or aligned SAM files from programs like Bowtie.

Q. How long do you host my data on the website?

A. One year. After that, you can pay a nominal fee to keep the data on the website, which is convenient if you often use the links to load custom tracks into the UCSC Genome Browser.

Q. Can I download all the data from the website?

A. Absolutely. You can download all the data to your local drive. The advantage of the website is that you can access it from any computer, anywhere with an internet connection.

Q. Can I compare between two ChIP experiments? I have drug treated and untreated ChIP samples, and I want to see what peaks are induced by the drug.

A. Yes, you can simply create a new analysis in the Sample Submission Form, with drug treated as ChIP run, and untreated ChIP data as control run.

Q. Is the website secure? Can I share the data with my collaborators?

A. Only you have the URL and passwords to your data webpage. You can share the user name and password with your team members or collaborators. If you really want to delete the data on the website, please inform us after you've downloaded a local copy.

Q. Where can I learn more about the technology and methods behind the BxChIPSeq 2.0 service?

A. To learn more about some of the tools and technologies related to BxChIPSeq 2.0, please check these references.

1. Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10(3):R25. Epub 2009 Mar 4. PubMed PMID: 19261174; PubMed Central PMCID: PMC2690996.

Bowtie is a fast tool to align sequence reads to the genome.

2. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D. The human genome browser at UCSC. Genome Res. 2002 Jun;12(6):996-1006. PubMed PMID: 12045153; PubMed Central PMCID: PMC186604.

The UCSC Genome Browser is one of the most widely used tools for viewing genomes.

3. Heinz S, Benner C, Spann N, Bertolino E, Lin YC, Laslo P, Cheng JX, Murre C, Singh H, Glass CK. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell. 2010 May 28;38(4):576-89. PubMed PMID: 20513432; PubMed Central PMCID: PMC2898526.

There are many tools for peak finding in ChIP-Seq data. Homer stands out because it provides one of the most comprehensive feature sets and detailed annotations of the peaks and genes.

4. Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP. Integrative genomics viewer. Nat Biotechnol. 2011 Jan;29(1):24-6. PubMed PMID: 21221095.

Integrative Genomics Viewer (IGV) is adopted by many researchers working with next-gen sequencing data due to its speed and capability to handle large data sets.

Request Service

Thank you for your interest in BxChIPSeq 2.0 service. You can use the form below to order BxChIPSeq 2.0 service or ask questions. We will get back to you shortly.
About Your ChIP-Seq Experiment

Brief description of the experiment (antibody, tissue, etc. this will appear in the webpage displaying your results):

Genome of interest (human, mouse, rat etc.) We use the latest genome build at UCSC, e.g. hg19, mm9

Sample Details

Cost: $199/analysis. Each line in the table is considered as one analysis, which can include two sequence runs (ChIP and control), or one sequence run (ChIP only).

Instructions

  • File name is the sequence raw data file name. We accept any major format, e.g., .txt, .fastq, .sra, .sam, etc. Compressed format is okay too (e.g., .gz, .gzip etc).
  • Display name is a human friendly name you want to see in the results.
  • We recommend running controls (like input) for ChIP-Seq experiments.
  • If you don't have control, leave those fields empty.
  • If you want to compare two ChIP experiments, you can put one ChIP run as control.

Example

ChIP Run File Name | ChIP Run Display Name | Control Run File Name | Control Run Display Name | Display Name for ChIP-Seq Data
SRR340077.fastq    | TCF7L2A               | SRR353506_input.fastq | Input                    | TCF7L2A
SRR340078.fastq    | TCF7L2B               | SRR353506_input.fastq | Input                    | TCF7L2B
SRR340077.fastq    | TCF7L2A               | SRR340078.fastq       | TCF7L2B                  | TCF7L2A_only_Peaks
SRR340077.fastq    | TCF7L2A               |                       |                          | TCF7L2A_only_without_control
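
To make the pricing rule concrete, the four example rows can be written as data and counted. This is only a sketch: the `Analysis` structure is a hypothetical illustration and not part of the submission form.

```python
from dataclasses import dataclass
from typing import Optional

PRICE_PER_ANALYSIS_USD = 199  # each row of the table counts as one analysis


@dataclass
class Analysis:
    chip_file: str
    display_name: str
    control_file: Optional[str] = None  # None when there is no control run


analyses = [
    Analysis("SRR340077.fastq", "TCF7L2A", "SRR353506_input.fastq"),
    Analysis("SRR340078.fastq", "TCF7L2B", "SRR353506_input.fastq"),
    # One ChIP run used as the control for another ChIP run.
    Analysis("SRR340077.fastq", "TCF7L2A_only_Peaks", "SRR340078.fastq"),
    # ChIP only, no control.
    Analysis("SRR340077.fastq", "TCF7L2A_only_without_control"),
]


def total_cost(rows) -> int:
    """Cost in USD: one flat price per row, with or without a control run."""
    return len(rows) * PRICE_PER_ANALYSIS_USD


print(total_cost(analyses))  # 4 * 199 = 796
```

Note that the same raw file (here SRR340077.fastq) can appear in several rows. Each row is still charged as a separate analysis.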

Note: Once we receive your request, we will contact you very soon. Please enter valid contact information.