plotCorrelation

Tool for the analysis and visualization of sample correlations based on the output of multiBamSummary or multiBigwigSummary. Pearson or Spearman methods are available to compute correlation coefficients. Results can be saved as multiple scatter plots depicting the pairwise correlations or as a clustered heatmap, where the colors represent the correlation coefficients and the clusters are constructed using complete linkage. Optionally, the values can be saved as tables, too.

detailed help:

plotCorrelation -h

usage: plotCorrelation -in matrix.gz -c spearman -p heatmap -o plot.png
help: plotCorrelation -h / plotCorrelation --help

Required arguments 

--corData, -in

Compressed matrix of values generated by multiBigwigSummary or multiBamSummary

--corMethod, -c

Possible choices: spearman, pearson

Correlation method.

--whatToPlot, -p

Possible choices: heatmap, scatterplot

Choose between a heatmap or pairwise scatter plots
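
To make the difference between the two --corMethod choices concrete, here is a minimal Python sketch that uses only the standard library. It follows the textbook definitions and is not deepTools' own code; the function names and the sample values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-bin values for two samples; the last bin is extreme.
sample_a = [1, 2, 3, 4, 100]
sample_b = [2, 4, 6, 8, 10]

pearson_r = pearson(sample_a, sample_b)
spearman_r = spearman(sample_a, sample_b)
```

Because Spearman only looks at ranks, the monotonic relationship in this toy data gives a coefficient of 1, whereas the single extreme bin pulls the Pearson coefficient well below 1.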

Optional arguments 

--plotFile, -o

File to save the heatmap to. The file extension determines the format, so heatmap.pdf will save the heatmap in PDF format. The available formats are: .png, .eps, .pdf and .svg.

--skipZeros

By setting this option, genomic regions that have zero or missing (nan) values in all samples are excluded.

--labels, -l

User-defined labels to use instead of the default labels derived from the file names. Multiple labels must be separated by spaces, e.g. --labels sample1 sample2 sample3

--plotTitle, -T

Title of the plot, to be printed on top of the generated image. Leave blank for no title. (Default: )

--plotFileFormat

Possible choices: png, pdf, svg, eps, plotly

Image format type. If given, this option overrides the image format based on the plotFile ending. The available options are: png, eps, pdf, svg and plotly.

--removeOutliers

If set, bins with very large counts are removed. Bins with abnormally high read counts artificially increase the Pearson correlation; for this reason, multiBamSummary tries to remove outliers using the median absolute deviation (MAD) method, applying a threshold of 200 so that only extremely large deviations from the median are considered. The ENCODE blacklist page (https://sites.google.com/site/anshulkundaje/projects/blacklists) contains useful information about regions with unusually high counts that may be worth removing.

--version

show program's version number and exit
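
The MAD filter described under --removeOutliers can be sketched in Python as follows. This is a generic median-absolute-deviation filter and not necessarily the exact deepTools implementation: the scaling constant 1.4826 (the usual consistency factor for normally distributed data), the function names and the example counts are assumptions. Only the threshold of 200 comes from the option description.

```python
from statistics import median


def mad_outlier_indices(values, max_deviation=200.0):
    """Indices of values lying more than max_deviation scaled MADs from the median."""
    med = median(values)
    # 1.4826 is an assumed scaling constant (consistency factor for normal data).
    mad = 1.4826 * median([abs(v - med) for v in values])
    if mad == 0:
        # No spread to measure against, so nothing is flagged.
        return []
    return [i for i, v in enumerate(values) if abs(v - med) / mad > max_deviation]


def remove_outlier_bins(bins, max_deviation=200.0):
    """Drop every bin (row) that is an outlier in at least one sample (column)."""
    flagged = set()
    for sample in range(len(bins[0])):
        column = [row[sample] for row in bins]
        flagged.update(mad_outlier_indices(column, max_deviation))
    return [row for i, row in enumerate(bins) if i not in flagged]


# Hypothetical read counts: rows are bins, columns are two samples.
bins = [
    [10, 20],
    [12, 22],
    [11, 21],
    [9, 19],
    [13, 23],
    [10, 20],
    [500000, 21],  # extreme count in the first sample
]
filtered = remove_outlier_bins(bins)
```

With a threshold as high as 200, only the bin with the extreme count is dropped; ordinary variation between bins is left alone.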

Output optional options 

--outFileCorMatrix: Save matrix with pairwise correlation values to a tab-separated file.

Heatmap options 

--plotHeight: Plot height in cm. (Default: 9.5)
--plotWidth: Plot width in cm. The minimum value is 1 cm. (Default: 11)
--zMin, -min: Minimum value for the heatmap intensities. If not specified, the value is set automatically.
--zMax, -max: Maximum value for the heatmap intensities. If not specified, the value is set automatically.
--colorMap: Color map to use for the heatmap. Available values can be seen here: http://matplotlib.org/examples/color/colormaps_reference.html
--plotNumbers: If set, then the correlation number is plotted on top of the heatmap. This option is only valid when plotting a heatmap.

Scatter plot options 

--xRange: The X axis range. The default scales these such that the full range of dots is displayed.
--yRange: The Y axis range. The default scales these such that the full range of dots is displayed.
--log1p: Plot the natural log of the values after adding 1, i.e. log(1 + x). Note that this affects ONLY the plot; the correlation is unaffected.
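
The --log1p transformation is the standard log(1 + x). The short Python sketch below, with hypothetical values and a hypothetical helper name, illustrates the idea that only the plotted coordinates are transformed while the values used for the correlation stay untouched.

```python
import math


def log1p_for_plot(values):
    """Return log(1 + x) for each value; the input list is not modified."""
    return [math.log1p(v) for v in values]


# Hypothetical read counts for one sample.
counts = [0, 9, 99, 999]
plotted = log1p_for_plot(counts)  # used for drawing only
```

Adding 1 before taking the logarithm keeps bins with a count of zero on the plot, since log(1 + 0) is 0.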

example usages: plotCorrelation -in results_file --whatToPlot heatmap --corMethod pearson -o heatmap.png

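The following examples perform the correlation analysis on coverage files computed by multiBamSummary or multiBigwigSummary for our test ENCODE ChIP-Seq datasets.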

Scatterplot

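Here we make pairwise scatter plots of the average scores per transcript that we calculated using multiBigwigSummary, and include the Pearson correlation coefficient for each comparison.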

$ deepTools2.0/bin/plotCorrelation \
-in scores_per_transcript.npz \
--corMethod pearson --skipZeros \
--plotTitle "Pearson Correlation of Average Scores Per Transcript" \
--whatToPlot scatterplot \
-o scatterplot_PearsonCorr_bigwigScores.png   \
--outFileCorMatrix PearsonCorr_bigwigScores.tab

../../_images/scatterplot_PearsonCorr_bigwigScores.png

$ cat PearsonCorr_bigwigScores.tab
    'H3K27me3'      'H3K4me1'       'H3K4me3'       'HeK9me3'       'input'
    'H3K27me3'      1.0000  -0.1032 -0.1269 -0.0339 -0.0395
    'H3K4me1'       -0.1032 1.0000  0.3985  -0.1863 0.3328
    'H3K4me3'       -0.1269 0.3985  1.0000  -0.0480 0.2822
    'HeK9me3'       -0.0339 -0.1863 -0.0480 1.0000  -0.0353
    'input' -0.0395 0.3328  0.2822  -0.0353 1.0000
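
The table written by --outFileCorMatrix can be post-processed with a few lines of Python. The sketch below assumes the layout shown above (a header row of quoted labels, then one row per sample starting with its quoted label, separated by whitespace) and assumes that labels contain no whitespace; the helper names are hypothetical and the values are copied from the output above.

```python
def parse_cor_matrix(text):
    """Parse the whitespace-separated correlation table into nested dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    labels = [token.strip("'") for token in lines[0].split()]
    matrix = {}
    for line in lines[1:]:
        fields = line.split()
        row_label = fields[0].strip("'")
        matrix[row_label] = {
            label: float(value) for label, value in zip(labels, fields[1:])
        }
    return labels, matrix


def most_correlated_pair(labels, matrix):
    """Return (coefficient, label_a, label_b) for the strongest off-diagonal pair."""
    pairs = [
        (matrix[a][b], a, b)
        for i, a in enumerate(labels)
        for b in labels[i + 1:]
    ]
    return max(pairs)


# Contents of PearsonCorr_bigwigScores.tab as shown above.
TAB_TEXT = "\n".join([
    "\t'H3K27me3'\t'H3K4me1'\t'H3K4me3'\t'HeK9me3'\t'input'",
    "'H3K27me3'\t1.0000\t-0.1032\t-0.1269\t-0.0339\t-0.0395",
    "'H3K4me1'\t-0.1032\t1.0000\t0.3985\t-0.1863\t0.3328",
    "'H3K4me3'\t-0.1269\t0.3985\t1.0000\t-0.0480\t0.2822",
    "'HeK9me3'\t-0.0339\t-0.1863\t-0.0480\t1.0000\t-0.0353",
    "'input'\t-0.0395\t0.3328\t0.2822\t-0.0353\t1.0000",
])

labels, matrix = parse_cor_matrix(TAB_TEXT)
best = most_correlated_pair(labels, matrix)
```

For this table, the strongest pairwise correlation is the one between H3K4me1 and H3K4me3 (0.3985).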

Heatmap

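In addition to scatter plots, heatmaps can be generated in which the pairwise correlation coefficients are depicted by varying color intensities and the samples are grouped using hierarchical clustering.

The example here calculates the Spearman correlation coefficients of read counts. The dendrogram indicates which samples' read counts are most similar to each other.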

$ deepTools2.0/bin/plotCorrelation \
    -in readCounts.npz \
    --corMethod spearman --skipZeros \
    --plotTitle "Spearman Correlation of Read Counts" \
    --whatToPlot heatmap --colorMap RdYlBu --plotNumbers \
    -o heatmap_SpearmanCorr_readCounts.png   \
    --outFileCorMatrix SpearmanCorr_readCounts.tab

../../_images/heatmap_SpearmanCorr_readCounts.png
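
The complete-linkage clustering behind the dendrogram can be illustrated with a small pure-Python sketch. It assumes a distance of 1 - r between samples, which is a common choice but an assumption here, and deepTools may derive distances differently. The function names are hypothetical, and the coefficients are taken from the Pearson table in the scatterplot example, restricted to four samples.

```python
def build_distance(labels, correlations):
    """Turn pairwise correlations into distances, assuming d = 1 - r."""
    dist = {label: {} for label in labels}
    for (a, b), r in correlations.items():
        dist[a][b] = dist[b][a] = 1.0 - r
    return dist


def complete_linkage(labels, dist):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance is the LARGEST member distance.
                d = max(dist[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


labels = ["H3K27me3", "H3K4me1", "H3K4me3", "input"]
correlations = {
    ("H3K27me3", "H3K4me1"): -0.1032,
    ("H3K27me3", "H3K4me3"): -0.1269,
    ("H3K27me3", "input"): -0.0395,
    ("H3K4me1", "H3K4me3"): 0.3985,
    ("H3K4me1", "input"): 0.3328,
    ("H3K4me3", "input"): 0.2822,
}

merges = complete_linkage(labels, build_distance(labels, correlations))
```

Under these assumptions the two most correlated samples (H3K4me1 and H3K4me3) are joined first, input joins them next, and H3K27me3 is attached last, which is the order a dendrogram would display from the leaves upward.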

deepTools Galaxy: http://deeptools.ie-freiburg.mpg.de

code @ github: https://github.com/deeptools/deepTools/
