alignmentSieve
This tool filters alignments in a BAM/CRAM file according to the specified parameters. It can optionally output to BEDPE format.
usage: alignmentSieve -b sample1.bam -o sample1.filtered.bam --minMappingQuality 10 --filterMetrics log.txt
help: alignmentSieve -h / alignmentSieve --help
Required arguments
- --bam, -b
An indexed BAM file.
- --outFile, -o
The file to write results to. These are the alignments or fragments that pass the filtering criteria.
General arguments
- --numberOfProcessors, -p
Number of processors to use. Type "max/2" to use half the maximum number of processors or "max" to use all available processors. (Default: 1)
- --filterMetrics
The total number of reads and the number remaining after filtering are saved to this file.
- --filteredOutReads
If desired, all reads NOT passing the filtering criteria can be written to this file.
- --label, -l
User defined label instead of the default label (file name).
- --smartLabels
Instead of manually specifying a label for the input file, this causes deepTools to use the file name after removing the path and extension.
- --verbose, -v
Set to see processing messages.
- --version
Show the program's version number and exit.
- --shift
Shift the left and right end of a read (for BAM files) or a fragment (for BED files). A positive value shifts an end to the right (on the + strand) and a negative value shifts an end to the left. Either 2 or 4 integers can be provided. For example, "2 -3" will shift the left-most fragment end 2 bases to the right and the right-most end 3 bases to the left. If 4 integers are provided, then the first two apply to fragments whose read 1 is on the left and the last two to fragments whose read 1 is on the right. Consequently, it is possible to take strand into consideration for strand-specific protocols. A fragment whose length falls below 1 due to shifting will not be written to the output. See the online documentation for graphical examples. Note that non-properly-paired reads will be filtered.
- --ATACshift
Shift the produced BAM file or BEDPE regions as commonly done for ATAC-seq. This is equivalent to --shift 4 -5 5 -4.
- --genomeChunkLength
Size of the genome (in bps) to be processed per thread. (Default: 1000000)
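The values accepted by --numberOfProcessors above can be illustrated with a short Python sketch. This is not deepTools code: the function name resolve_processors and its "available" parameter are hypothetical, and the rounding used for "max/2" is an assumption.

```python
import os


def resolve_processors(value, available=None):
    """Turn a -p style value ("max", "max/2" or an integer string) into a count.

    Hypothetical helper for illustration only; not part of deepTools.
    """
    if available is None:
        # os.cpu_count() may return None, so fall back to a single processor.
        available = os.cpu_count() or 1
    if value == "max":
        return available
    if value == "max/2":
        # Assumption: round down, but never go below one processor.
        return max(1, available // 2)
    return int(value)


print(resolve_processors("max/2", available=8))
```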
Output arguments
- --BED
Instead of producing BAM files, write output in BEDPE format (as defined by MACS2). Note that only reads/fragments passing the filtering criteria are written in BEDPE format.
Optional arguments
- --filterRNAstrand
Possible choices: forward, reverse
Selects RNA-seq reads (single-end or paired-end) in the given strand. (Default: None)
- --ignoreDuplicates
If set, reads that have the same orientation and start position will be considered only once. If reads are paired, the mate's position also has to coincide to ignore a read.
- --minMappingQuality
If set, only reads with a mapping quality score of at least this value are considered.
- --samFlagInclude
Include reads based on the SAM flag. For example, to get only reads that are the first mate, use a flag of 64. This is useful to count properly paired reads only once, as otherwise the second mate will also be considered for the coverage.
- --samFlagExclude
Exclude reads based on the SAM flag. For example, to get only reads that map to the forward strand, use --samFlagExclude 16, where 16 is the SAM flag for reads that map to the reverse strand.
- --blackListFileName, -bl
A BED or GTF file containing regions that should be excluded from all analyses. Currently this works by rejecting genomic chunks that happen to overlap an entry. Consequently, for BAM files, if a read partially overlaps a blacklisted region or a fragment spans over it, then the read/fragment might still be considered. Please note that you should adjust the effective genome size, if relevant.
- --minFragmentLength
The minimum fragment length needed for read/pair inclusion. This option is primarily useful in ATACseq experiments, for filtering mono- or di-nucleosome fragments. (Default: 0)
- --maxFragmentLength
The maximum fragment length needed for read/pair inclusion. A value of 0 indicates no limit. (Default: 0)
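How the flag, mapping-quality and fragment-length options above combine can be sketched in Python. This is a minimal illustration under stated assumptions, not the deepTools implementation: the function passes_filters and its parameter names are hypothetical, and the bit-mask semantics (every --samFlagInclude bit must be set, no --samFlagExclude bit may be set) are assumed to follow the usual SAM flag convention.

```python
def passes_filters(flag, mapq, fragment_length,
                   min_mapq=0, flag_include=0, flag_exclude=0,
                   min_fragment_length=0, max_fragment_length=0):
    """Return True if a read would survive the sketched filters.

    Hypothetical helper; the names mirror the command-line options
    but are not deepTools API.
    """
    if mapq < min_mapq:
        return False
    # Assumed include semantics: all requested bits must be present.
    if (flag & flag_include) != flag_include:
        return False
    # Assumed exclude semantics: any matching bit rejects the read.
    if (flag & flag_exclude) != 0:
        return False
    if fragment_length < min_fragment_length:
        return False
    # A maximum of 0 means "no limit", as stated for --maxFragmentLength.
    if max_fragment_length > 0 and fragment_length > max_fragment_length:
        return False
    return True


# Flag 83 = paired (1) + proper pair (2) + reverse strand (16) + first mate (64).
print(passes_filters(83, 30, 180, min_mapq=10, flag_include=64))
print(passes_filters(83, 30, 180, flag_exclude=16))
```

The first call keeps the read (it is a first mate with sufficient mapping quality); the second rejects it because the reverse-strand bit is excluded.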
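Background
This tool filters alignments in a BAM/CRAM file according to the specified parameters. It can optionally output to BEDPE format, possibly with the fragment ends shifted in a custom manner.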
Usage example
alignmentSieve requires a sorted and indexed BAM file and the desired filtering criteria.
$ alignmentSieve -b paired_chr2L.bam \
--minMappingQuality 5 --samFlagInclude 16 \
--samFlagExclude 256 --ignoreDuplicates \
-o filtered.bam --filterMetrics metrics.txt
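The alignments passing the filtering criteria will then be written to the file specified by -o. You can additionally save the alignments NOT passing the filtering criteria with --filteredOutReads.
If you would like to store metrics on the number of reads seen and the number remaining after filtering, use --filterMetrics. An example metrics file follows: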
#bamFilterReads --filterMetrics
#File	Reads Remaining	Total Initial Reads
paired_chr2L.bam	8440	12644
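A metrics file like the one above can be summarised with a few lines of Python. This is a sketch under assumptions: the function name parse_filter_metrics is hypothetical, comment lines are assumed to start with "#", and each data line is assumed to end with the remaining-read count followed by the initial-read count.

```python
def parse_filter_metrics(text):
    """Map each file name to (reads_remaining, total_initial_reads).

    Hypothetical helper; assumes '#' comment lines and two trailing
    integer columns on every data line.
    """
    rows = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split from the right so file names containing spaces still work.
        name, remaining, total = line.rsplit(None, 2)
        rows[name] = (int(remaining), int(total))
    return rows


example = (
    "#bamFilterReads --filterMetrics\n"
    "#File\tReads Remaining\tTotal Initial Reads\n"
    "paired_chr2L.bam\t8440\t12644\n"
)
metrics = parse_filter_metrics(example)
remaining, total = metrics["paired_chr2L.bam"]
print(f"{remaining} of {total} reads remain ({remaining / total:.1%})")
```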
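It is possible to produce a BEDPE file (suitable for input into MACS2) rather than a BAM file. As with BAM/CRAM output, BEDPE output also allows the fragment ends to be shifted, which is often desirable in ATAC-seq and related protocols: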
$ alignmentSieve -b paired_chr2L.bam \
--minFragmentLength 140 --BED \
--shift -5 3 -o fragments.bedpe
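The --shift option can take either 2 or 4 integers. If two integers are given, then the first value shifts the left-most end of a fragment and the second value shifts the right-most end. Positive values shift to the right and negative values shift to the left. See below for how the above settings would shift a single fragment: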
     ----> read 1
                 read 2 <----
     ------------------------ fragment
-------------------------------- shifted fragment
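The same result would be produced if read 1 and read 2 were swapped. If, instead, the protocol is strand-specific, then the first pair of integers is applied to fragments where read 1 precedes read 2, and the second pair is applied to fragments where read 2 precedes read 1. In this case, the first value in each pair is applied to the end of read 1 and the second value to the end of read 2. Take the command below as an example: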
$ alignmentSieve -b paired_chr2L.bam \
--minFragmentLength 140 --BED \
--shift -5 3 -1 4 -o fragments.bedpe
Given this, the -5 3 set would produce the following:
     ----> read 1
                 read 2 <----
     ------------------------ fragment
-------------------------------- shifted fragment
and the -1 4 set would produce the following:
----> read 2
read 1 <----
------------------------ fragment
--------------------- shifted fragment
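As can be seen, such fragments are considered to be on the - strand, so negative values shift to the left in the fragment's own frame of reference (and thus to the right relative to the + strand).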
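The shifting rules described above can be written out as a small Python sketch. It is an illustration under assumptions, not the deepTools implementation: the function shift_fragment and its arguments are hypothetical, coordinates are treated as a simple (start, end) pair on the + strand, and the handling of fragments whose read 1 is on the right follows the stated equivalence of --ATACshift and --shift 4 -5 5 -4 (the third value moves the read 1 end, the fourth the read 2 end, both with the sign flipped relative to the + strand).

```python
def shift_fragment(start, end, shift, read1_on_left=True):
    """Return the shifted (start, end), or None if the fragment is dropped.

    Hypothetical helper; `shift` holds 2 or 4 integers as for --shift.
    """
    if len(shift) not in (2, 4):
        raise ValueError("shift needs either 2 or 4 integers")
    if len(shift) == 2 or read1_on_left:
        # First value moves the left-most end, second the right-most end.
        new_start = start + shift[0]
        new_end = end + shift[1]
    else:
        # Assumption: values are given in the fragment's own (- strand)
        # frame, so their sign flips relative to the + strand.
        new_end = end - shift[2]      # read 1 end (right-most)
        new_start = start - shift[3]  # read 2 end (left-most)
    # Fragments whose length falls below 1 are not written out.
    if new_end - new_start < 1:
        return None
    return new_start, new_end


atac = [4, -5, 5, -4]
print(shift_fragment(100, 200, [-5, 3]))
print(shift_fragment(100, 200, atac, read1_on_left=True))
print(shift_fragment(100, 200, atac, read1_on_left=False))
```

Under these assumptions the ATAC-seq values move the left end 4 bases to the right and the right end 5 bases to the left for both fragment orientations.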
Note
If the --shift or --ATACshift options are used, then only properly-paired reads will be used.
deepTools Galaxy: http://deeptools.ie-freiburg.mpg.de
Code on GitHub: https://github.com/deeptools/deepTools/