Text search

Search thousands of experiments by organism, technique, biological source, author…

Search queries are case insensitive.
Enclose a group of words in double quotes to search for an exact phrase (e.g. "embryonic stem cell").
Use the plus sign (+) to combine several independent searches (e.g. Human H3K4me3 MCF-7 + Human Cebpb K-562).
Filter by quality (e.g. AAA) or by a range of quality (e.g. "A-C" to retrieve datasets having a quality between AAA and CCC).

Results

Click on a row to select/unselect the experiment. Selected experiments are highlighted in green.
Click the button to display more information.

Examples: MCF-7 ESR1, mESC H3K27me3, GM12878 Hi-C.

Extended search


Genome browser

Visualize enrichment sites, read coverage, and chromatin organization in one place with our lightweight genome browser.

Enter a gene symbol or genomic coordinates, then click the Browse button to start exploring. Right-click on a track to display more information, update settings, etc.

User data

Upload files to visually explore and compare private data alongside public data.

Supported file formats:

1) For displaying enrichment coverage

bigWig: Genome coverage (i.e. read count per fixed-size genomic window)

2) For displaying enrichment patterns in a colored-barplot format

BED: 5 column format.

3) For displaying annotated peaks

narrowPeak format

4) For long-range chromatin interactions

hic: Hi-C contact maps

For more information, please refer to the documentation.


Documentation

This manual presents NAVi, describes how to use the web application, and illustrates its features with examples. We recommend that you read it carefully and keep it for future reference.

Last updated: 31 May 2018.

Getting started

NAVi is a data portal for searching and visualizing publicly available high-throughput sequencing data sets. The raw data were collected from public repositories, uniformly processed, and annotated using controlled vocabularies.

Accessing QC Genomics

QC Genomics is available online at http://ngs-qc.org/qcgenomics. Alternatively, NAVi can be directly accessed at http://ngs-qc.org/navi.

Overall look

A tour presenting the search and visualization features can be started by clicking the "Start the tour" button on the main portal or by clicking the following link: take the tour.

Portal

Text search

Find experiments integrated in the NGS-QC and LOGIQA databases using complex search queries.

Data visualization

Explore experiments of interest with our lightweight genome browser.

User-submitted data

Upload local files and visualize unpublished private data alongside published data.

Text search

Head to the experiments tab to explore publicly available sequencing experiments. Using complex search queries, you can retrieve experiments integrated into NGS-QC (e.g. ChIP-seq, ATAC-seq) or LOGIQA (e.g. Hi-C, ChIA-PET).

The video below shows how to search for high-quality H3K27ac and FAIRE-seq data, order and filter retrieved experiments, and select experiments of interest.

H3K27ac is an epigenetic mark associated with active promoters and enhancers, and FAIRE-seq is a method for identifying open chromatin regions.
AAA is a quality stamp inferred by NGS-QC Generator, and indicates that the profile is of high quality.

Youtube video : Click here

Searchable terms

Accession: GEO identifier (e.g. GSM1239499 or GSE51142) or PubMed identifier (e.g. 23953112) if the experiment is part of a publication referenced in PubMed.
Organism: Scientific name (e.g. Homo sapiens), common name (e.g. human) or genome assembly (e.g. hg19).
See the supported organisms for more information.
Experiment: Method or protocol (e.g. DNase-seq), or a transcription factor or histone mark for ChIP-seq data. For most transcription factors, the same results are returned whether you enter an official identifier (e.g. SOX9) or an alias (e.g. CMPD1).
Biological source: Tissue, cell type, or cell line.
Quality stamp: By evaluating the effect of random sampling on a given profile, NGS-QC Generator infers a quality score represented by a triple-letter rating, from best (AAA) to worst (DDD). If too few data sets carry an AAA stamp, you can extend the search to BBB-quality data sets by searching A-B, which retains all data sets with a quality between AAA and BBB.
Publication title, citation, or abstract: Any keyword from a publication's title (e.g. cohesin), citation (e.g. "Nature 2017"), or abstract (e.g. "Breast cancer"). Please note that searching abstracts requires more time and may significantly slow down your searches.
Author: Any author from a publication referenced in PubMed (e.g. Kellis).
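
The quality range filter can be pictured with a short Python sketch. It is only an illustration: the function name `in_quality_range` is hypothetical, and it assumes that a range such as A-B keeps a stamp when each of its three letters lies within the range. The rule actually applied by the portal is not described in this manual.

```python
GRADES = "ABCD"  # letter grades from best to worst, as in AAA ... DDD


def in_quality_range(stamp, query):
    """Return True if a triple-letter stamp satisfies a quality query.

    query is either a full stamp ("AAA") or a letter range ("A-C").
    Assumption: a range keeps stamps whose three letters all lie within it.
    """
    stamp, query = stamp.upper(), query.upper()
    if "-" not in query:
        return stamp == query
    best, worst = query.split("-")
    low, high = GRADES.index(best), GRADES.index(worst)
    return len(stamp) == 3 and all(
        c in GRADES and low <= GRADES.index(c) <= high for c in stamp
    )
```

Under this assumption, `in_quality_range("ABB", "A-B")` is true while `in_quality_range("ABC", "A-B")` is false.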

Searching method

Case insensitive: Searching Human H3K4me3 or hUMaN h3k4ME3 returns the same results.
Phrase search: In order to search for an exact group of words, surround them by double quotes (e.g. "embryonic stem cell").
Combining searches: Searching Human Cebpb H3K9ac does not return any results because no single experiment matches both Cebpb and H3K9ac.
Use the plus sign (+) to combine multiple independent searches: Human Cebpb + human H3K9ac retrieves human experiments targeting the Cebpb transcription factor as well as human experiments targeting the H3K9ac mark.
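
The three rules above (case insensitivity, quoted phrases, and the plus sign) can be sketched in Python as follows. This is a simplified illustration, not the portal's implementation: `parse_query` and `matches` are hypothetical names, and matching is reduced to a substring test on a free-text description of the experiment.

```python
import re

# A token is either a double-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"([^"]+)"|(\S+)')


def parse_query(query):
    """Split a query into independent searches, each a list of lowercase terms."""
    groups = []
    for part in query.split("+"):  # "+" separates independent searches
        terms = [(phrase or word).lower() for phrase, word in TOKEN.findall(part)]
        if terms:
            groups.append(terms)
    return groups


def matches(description, groups):
    """True if the description contains every term of at least one search."""
    text = description.lower()
    return any(all(term in text for term in terms) for terms in groups)
```

For example, `parse_query('Human Cebpb + human H3K9ac')` yields two groups, so an H3K9ac experiment matches even though it does not mention Cebpb.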

Data visualization

Please note that it is not possible to visualize data from different organisms simultaneously. If you open the Genome browser tab after selecting experiments from more than one organism (e.g. human and mouse), an error message will invite you to refine your selection.

To visualize selected experiments, open the Genome browser tab, and define a genomic region to display by entering a gene symbol or genomic coordinates. The following video shows how to integrate several experiments for visual exploration, and how to change some track settings.

Youtube video : Click here

The data sets used as examples in this section are listed below.

Organism	Biological source	Experiment	Source
Homo sapiens	MCF-7	POLR2A ChIA-PET	Li GL, et al. Cell. 2012
		FOXA1 ChIP-seq	Theodorou V, et al. Genome Res. 2013
		RNAPol2 ChIP-seq	wa Maina C, et al. PLoS Comput Biol. 2014
		H3K4me3 ChIP-seq	Yamamoto S, et al. Cancer Cell. 2014
		H3K9ac ChIP-seq	Grimmer MR, et al. Nucleic Acids Res. 2014
		FAIRE-seq	Hardy K, et al. Nucleus. 2016
		H3K27ac ChIP-seq	Rhie SK, et al. Epigenetics Chromatin. 2016
		HindIII Hi-C	Barutcu AR, et al. Genome Biol. 2015
	MCF-10A	HindIII Hi-C	Barutcu AR, et al. Genome Biol. 2015
	GM12878	HindIII Capture Hi-C	Cairns JC, et al. Genome Biol. 2016

Track types

Gene annotations

Gene annotations display genes at their respective position on the genome. The gene's sense strand is represented by less-than (<, minus strand) and greater-than (>, plus strand) symbols.

When zooming out, genes that are too small relative to the displayed genomic region are not drawn. The figure below shows a zoomed-out view of the region in the previous figure: the smallest genes (e.g. HOXA7) are no longer represented.

δRCI heatmap

NGS-QC Generator infers local quality indicators by evaluating the influence of random sampling on a given profile. Briefly, the genome is divided into fixed windows of 500 bp (referred to as bins), and mapped reads are assigned to bins. Three random samplings are performed by retaining 90%, 70%, and 50% of the original reads, then the read count dispersion (δRCI) is calculated for each bin, where the dispersion represents the difference between the expected count and the observed one, expressed as a percentage. Bins with a dispersion lower than 10% for each sampling are retained.

This test is performed five times to ensure its reproducibility. Finally, bins that have been retained at least N times (1 ≤ N ≤ 5) are represented on a heatmap.

It is possible to increase or decrease N to change the stringency of the filtering (e.g. setting N = 5 shows only bins with a very strong signal).
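
The procedure can be sketched in Python as follows. This is a rough illustration, not NGS-QC Generator's code: all function names are hypothetical, each read is kept independently with the sampling probability, the dispersion is assumed to be expressed relative to the expected count, and empty bins are simply skipped.

```python
import random

FRACTIONS = (0.9, 0.7, 0.5)  # fractions of reads retained in the three samplings


def dispersion(original, sampled, fraction):
    """Difference between expected and observed counts, as a percentage."""
    expected = original * fraction
    return 100.0 * abs(sampled - expected) / expected


def subsample(bin_counts, fraction, rng):
    """Keep each read of each bin independently with probability `fraction`."""
    return [sum(rng.random() < fraction for _ in range(n)) for n in bin_counts]


def robust_bins(bin_counts, rng, threshold=10.0):
    """Indices of bins whose dispersion stays below the threshold in every sampling."""
    kept = {i for i, n in enumerate(bin_counts) if n > 0}  # skip empty bins
    for fraction in FRACTIONS:
        sampled = subsample(bin_counts, fraction, rng)
        kept = {i for i in kept
                if dispersion(bin_counts[i], sampled[i], fraction) < threshold}
    return kept


def retention_counts(bin_counts, repeats=5, seed=0):
    """Number of times (0 to `repeats`) each bin passes the test."""
    rng = random.Random(seed)
    counts = [0] * len(bin_counts)
    for _ in range(repeats):
        for i in robust_bins(bin_counts, rng):
            counts[i] += 1
    return counts


def bins_retained_at_least(counts, n):
    """Indices of the bins that would be shown on the heatmap for a given N."""
    return [i for i, c in enumerate(counts) if c >= n]
```

Bins with many reads fluctuate little under sampling and tend to pass all five repetitions, which is why raising N keeps only the strongest signal.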

Youtube video : Click here

Genome coverage

Genome coverage tracks represent signal data as read coverage, that is, the number of reads per bin, where bins are consecutive, fixed-size windows.
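
As a minimal Python sketch of this definition, the hypothetical function below counts reads per fixed-size bin over a region. It assumes that a read is assigned to the bin containing its start position, which is one convention among several.

```python
def bin_coverage(read_starts, region_start, region_end, bin_size):
    """Count reads per consecutive fixed-size bin in [region_start, region_end)."""
    n_bins = -(-(region_end - region_start) // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in read_starts:
        if region_start <= pos < region_end:
            counts[(pos - region_start) // bin_size] += 1
    return counts
```

With 500 bp bins over a 1,500 bp region, reads starting at 0, 10, 499, 500, and 1200 give the counts 3, 1, and 1.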

Youtube video : Click here

Coverage tracks are available for all 2D experiments from the NGS-QC collection. When visualizing a profile, only the δRCI heatmap is displayed by default. To load the genome coverage, right-click on the δRCI track, then select Coverage.

The scale of the Y-axis is defined by the greatest value within the displayed genomic region. If this value is particularly high, the rest of the signal might look like background. To better distinguish the profile, you can change the Y-axis maximum value by following these steps:

Right-click on the coverage track
select Configure
change the Y-axis scale mode from Automatic to Fixed
enter the desired value in the text field
click Apply or Apply to all to apply the Y-axis scale limit to the current track, or to all coverage tracks, respectively.

When analysing ChIP-seq, FAIRE-seq, and enrichment related assays, it is common to consider duplicate reads as PCR duplicates and remove them, as they could contribute to false positives during peak detection. However, comparing a profile with and without duplicates can help to evaluate the library complexity. By default, coverage tracks display the signal with duplicates, but it is possible to superimpose the signal with duplicates and the signal without by following these steps:

Right-click on the coverage track
select Configure
under Profile without PCR duplicates, check the Displayed box
click Apply to refresh the track
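
The idea of comparing a profile with and without duplicates can be illustrated with a small Python sketch. It assumes that duplicates are reads sharing the same chromosome, start position, and strand, a common convention that is not stated in this manual; the function names are hypothetical.

```python
def remove_duplicates(reads):
    """Keep the first read for each (chrom, start, strand) combination."""
    seen = set()
    unique = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            unique.append(read)
    return unique


def unique_fraction(reads):
    """Fraction of reads left after duplicate removal (a crude complexity estimate)."""
    if not reads:
        return 0.0
    return len(remove_duplicates(reads)) / len(reads)
```

A fraction close to 1 suggests a complex library, whereas a low fraction means that much of the signal comes from repeated reads.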

Peaks

Peak calling is a fundamental step in the analysis of data from ChIP-seq and related assays, and aims to identify protein-DNA binding events.

MACS is a popular peak caller and is particularly appropriate for transcription factor data. ChIP-seq and chromatin accessibility experiments from the NGS-QC collection have been processed with MACS2, and the called peaks are available for visualization. To load an experiment's peaks, right-click on the δRCI track and select Peaks in the menu (if the item is missing, peaks are not yet available for this experiment).

Youtube video : Click here

Peaks reported by MACS2 are not filtered by p-value or q-value. However, it is possible to filter them using a p-value cutoff:

Right-click on the peak track
select Configure
enter a cutoff (e.g. 1e-30)
click Apply
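
The same filtering can be reproduced offline on a narrowPeak file. The Python sketch below relies on the narrowPeak convention that the eighth column stores the p-value as -log10(p); the function name `filter_peaks` is hypothetical, and the code is not the browser's own implementation.

```python
import math


def filter_peaks(lines, pvalue_cutoff):
    """Keep narrowPeak lines whose p-value is at most `pvalue_cutoff`.

    Column 8 of a narrowPeak line holds -log10(p-value), so a cutoff of
    1e-30 corresponds to a minimum column value of 30.
    """
    min_score = -math.log10(pvalue_cutoff)
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # not a complete narrowPeak record
        if float(fields[7]) >= min_score:
            kept.append(line)
    return kept
```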

Genome interactions

GSM1631185 chr3:58050674-69203112 zMax=15

Chromosome conformation capture data are represented as a heatmap where each element of the matrix corresponds to a genomic interaction and is colored according to the contact frequency. To support zooming, four resolutions are available: 5kb, 25kb, 100kb, and 1Mb. When loading a contact map, the resolution is selected based on the length of the current genomic region, and can be displayed by hovering the pointer over the track's name.
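
The exact rule used to select the resolution is not documented here. As a purely hypothetical Python sketch, one plausible rule is to pick the finest resolution that keeps the number of bins across the region under a limit; both `pick_resolution` and the `max_bins` value of 1000 are assumptions made for illustration.

```python
RESOLUTIONS = (5_000, 25_000, 100_000, 1_000_000)  # 5kb, 25kb, 100kb, 1Mb


def pick_resolution(region_length, max_bins=1000):
    """Finest resolution for which the region spans at most `max_bins` bins.

    Falls back to the coarsest resolution for very long regions.
    `max_bins` is an arbitrary illustrative limit, not a NAVi setting.
    """
    for resolution in RESOLUTIONS:
        if region_length / resolution <= max_bins:
            return resolution
    return RESOLUTIONS[-1]
```

With these assumptions, the region chr3:58050674-69203112 (about 11 Mb) would be displayed at 25kb.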

Flipping track

When comparing two experiments, it can be useful to have the maps facing each other. You can flip a track vertically by right-clicking it, then selecting Flip.

Youtube video : Click here

Changing scale

By default, the 90th percentile is used as the maximum value for the heat map color scale. This value can be set manually: right-click on a track, select Configure, change the Scale limit to Fixed, enter the new value, then click Apply.
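
The effect of the scale limit can be sketched in Python. The example uses the nearest-rank definition of a percentile, which is an assumption (the browser may compute it differently), and the function names are hypothetical. Contact values above the limit are all drawn with the strongest color.

```python
def percentile(values, percent=90):
    """Nearest-rank percentile of a non-empty collection of numbers."""
    ordered = sorted(values)
    rank = max(1, (percent * len(ordered) + 99) // 100)  # ceil(percent * n / 100)
    return ordered[rank - 1]


def color_intensities(values, limit=None):
    """Map contact values to intensities in [0, 1], saturating at `limit`.

    When `limit` is None, the 90th percentile is used (automatic mode);
    otherwise `limit` plays the role of the fixed scale limit.
    """
    if not values:
        return []
    if limit is None:
        limit = percentile(values)
    if limit <= 0:
        return [0.0 for _ in values]
    return [min(v, limit) / limit for v in values]
```

Lowering the limit makes weak contacts more visible at the cost of saturating strong ones.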

Youtube video : Click here

Heat maps are convenient for exploring the overall structure of the chromatin, but they are less suited to local visualization because they display all data. NAVi can also display genomic interactions as arcs between two loci. Moreover, it is possible to filter loops by contact frequency or to display only those having at least one end within a given interval.

Displaying loops as arcs

Right-click on a track, select Configure, change the display mode to Loops, select a resolution, define a contact count threshold and a genomic interval (optional), then click Apply.
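
The two filters can be expressed as a short Python sketch. It is an illustration only: `filter_loops` is a hypothetical name, loops are represented as (pos1, pos2, count) tuples, and the interval is treated as inclusive.

```python
def filter_loops(loops, min_count, interval=None):
    """Keep loops with enough contacts and, optionally, one end in `interval`.

    loops: iterable of (pos1, pos2, count) tuples on a single chromosome.
    interval: optional (start, end) pair; a loop is kept if at least one
    of its two ends falls within it.
    """
    kept = []
    for pos1, pos2, count in loops:
        if count < min_count:
            continue
        if interval is not None:
            start, end = interval
            if not (start <= pos1 <= end or start <= pos2 <= end):
                continue
        kept.append((pos1, pos2, count))
    return kept
```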

Youtube video : Click here

Navigation

The two buttons in the top left corner allow you to condense and expand tracks horizontally.

You can use the buttons above the gene annotation track to navigate upstream or downstream, and to zoom in or out.

You may also navigate upstream or downstream by clicking on any track, holding, dragging to the desired genomic position, then releasing.

To zoom in on a precise location, click on the gene track or the genome coordinate axis, hold, drag to the desired position, then release.

Youtube video : Click here

Sharing

A session can be shared with other investigators by saving currently visualized experiments, settings, and genomic coordinates. Click the button to generate a shareable link: anyone with this link can load the session.

Changes made after loading a session, such as adding a track or changing the coordinates, are not saved. If you want to save these changes, click the button again to generate another session.

Upload files

Because visual inspection of biological data is important, NAVi allows investigators to explore their own data alongside publicly available experiments.

BigWig

The BigWig format is a binary, compressed, and indexed version of the wiggle format. It is designed to display dense, continuous data (e.g. genome coverage), and supports zooming by storing data at different resolutions.

Click here for more information on the bigWig format.

HiC

The hic format is a binary, compressed, and indexed file format designed to store Hi-C contact maps at multiple resolutions.

Click here for more information on the hic format.

NarrowPeaks

The narrowPeak format is used to store called peaks of signal enrichment. When calling peaks with MACS2, a narrowPeak file is produced, unless the --broad flag is on.

Click here for more information on the narrowPeak format.

Colored BED

The colored BED format is a BED5 file format used to store labelled genomic features, such as chromatin segmentation states. The five columns are:

chrom: the name of the chromosome
start: the starting position of the feature
end: the ending position of the feature
label: the label of the feature (e.g. "Active promoter"), displayed when hovering over the feature
color: the color associated with the feature, as an RGB (e.g. 22,160,133) or HEX (e.g. #D84315) color code.
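
To check a colored BED file before uploading it, a small Python sketch such as the following can be used. It assumes tab-separated columns, which is the usual BED convention but is not stated explicitly above, and the function names are hypothetical.

```python
def parse_color(text):
    """Convert an RGB ("22,160,133") or HEX ("#D84315") code to an (r, g, b) tuple."""
    text = text.strip()
    if text.startswith("#"):
        if len(text) != 7:
            raise ValueError(f"invalid HEX color: {text!r}")
        return tuple(int(text[i:i + 2], 16) for i in (1, 3, 5))
    parts = [int(p) for p in text.split(",")]
    if len(parts) != 3 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"invalid RGB color: {text!r}")
    return tuple(parts)


def parse_colored_bed_line(line):
    """Split one tab-separated colored BED line into its five typed columns."""
    chrom, start, end, label, color = line.rstrip("\n").split("\t")
    return chrom, int(start), int(end), label, parse_color(color)
```

A line that does not have exactly five columns, or whose color cannot be parsed, raises a `ValueError`.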

Pairwise interactions

Long-range genome interactions can be loaded from BED5 files. The five columns are:

chrom: the name of the chromosome
pos1: the leftmost position
pos2: the rightmost position
label: not used, can be left empty.
score: contact frequency, or confidence estimation.
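
As an illustration, the Python sketch below formats one interaction as a five-column line. The helper `interaction_line` is hypothetical, columns are assumed to be tab-separated, and the unused label column is left empty by default.

```python
def interaction_line(chrom, pos1, pos2, score, label=""):
    """Format one pairwise interaction as a tab-separated BED5 line.

    The two positions are sorted so that the leftmost one comes first.
    """
    left, right = sorted((pos1, pos2))
    return "\t".join([chrom, str(left), str(right), label, str(score)])
```

For example, `interaction_line("chr3", 61000000, 58100000, 12)` places 58100000 in the pos1 column and 61000000 in the pos2 column.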

Supported organisms

The table below contains the reference genomes used to process publicly available data.

Organism	Common name	Genome assembly
Arabidopsis thaliana	Arabidopsis	TAIR10
Caenorhabditis elegans	Worm	ce10
Danio rerio	Zebrafish	danRer7
Drosophila melanogaster	Fruitfly	dm3
Gallus gallus	Chicken	galGal4
Homo sapiens	Human	hg19
Mus musculus	Mouse	mm9
Pan troglodytes	Chimpanzee	panTro4
Rattus norvegicus	Rat	rn5
Saccharomyces cerevisiae	Baker's yeast	sacCer3

Contributors

Matthias Blum

Data processing – NAVi development

Pierre-Etienne Cholley

Data curation – Comparator/ChromStater main development

Anissa Djedid

Comparator/ChromStater initial development

Samuel Nicaise

Comparator/ChromStater additional development

Marco Antonio Mendoza-Parra, PhD

NGS-QC Generator concept and development

Hinrich Gronemeyer, PhD

Principal investigator
