Bin a GRanges and apply a summary method (e.g. 'mean', 'median', 'sum', 'max', 'min' ...) to chosen numeric variables of the ranges that fall in the same bin.
Usage
BinGRanges(
    gRange.gnr = NULL,
    chromSize.dtf = NULL,
    binSize.num = NULL,
    method.chr = "mean",
    variablesName.chr_vec = NULL,
    na.rm = TRUE,
    cores.num = 1,
    reduce.bln = TRUE,
    verbose.bln = FALSE
)
Arguments
- gRange.gnr
<GRanges>: A GRanges to bin.
- chromSize.dtf
<data.frame>: A data.frame where the first column contains the chromosome names and the second column contains the chromosome lengths in base pairs.
- binSize.num
<numerical>: Width of the bins.
- method.chr
<character>: Name of a summary method such as 'mean', 'median', 'sum', 'max', 'min'. (Default 'mean')
- variablesName.chr_vec
<character>: A character vector that specifies the metadata columns of the GRanges on which to apply the summary method.
- na.rm
<logical>: A logical value indicating whether 'NA' values should be stripped before the computation proceeds. (Default TRUE)
- cores.num
<numerical>: The number of cores. (Default 1)
- reduce.bln
<logical>: Whether duplicated bins must be reduced with the summary method. (Default TRUE)
- verbose.bln
<logical>: If TRUE, show the progress in the console. (Default FALSE)
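To make the method.chr and na.rm arguments more concrete, the following base R sketch summarises a set of values that share one bin. It does not call or reproduce the internals of BinGRanges; `summariseBin` is a hypothetical helper introduced only for illustration, and looking the function up with `match.fun` is an assumption about how a method name could be resolved.

```r
# Values of one metadata column for two ranges that share a bin.
scores <- c(NA, 100)

# Hypothetical helper (not part of the package): resolve a summary
# function from its name and apply it to the values of one bin.
summariseBin <- function(values, method = "mean", na.rm = TRUE) {
    match.fun(method)(values, na.rm = na.rm)
}

summariseBin(scores, "mean", na.rm = TRUE)   # 100: the NA is dropped first
summariseBin(scores, "mean", na.rm = FALSE)  # NA: the NA propagates
summariseBin(scores, "sum", na.rm = TRUE)    # 100
```

With `na.rm = TRUE`, a bin that holds at least one non-missing value still gets a numeric summary in this sketch.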
Examples
GRange.gnr <- GenomicRanges::GRanges(
    seqnames = S4Vectors::Rle(c("chr1", "chr2"), c(3, 1)),
    ranges = IRanges::IRanges(
        start = c(1, 201, 251, 1),
        end = c(200, 250, 330, 100),
        names = letters[seq_len(4)]
    ),
    strand = S4Vectors::Rle(BiocGenerics::strand(c("*")), 4),
    score = c(50, NA, 100, 30)
)
GRange.gnr
#> GRanges object with 4 ranges and 1 metadata column:
#> seqnames ranges strand | score
#> <Rle> <IRanges> <Rle> | <numeric>
#> a chr1 1-200 * | 50
#> b chr1 201-250 * | NA
#> c chr1 251-330 * | 100
#> d chr2 1-100 * | 30
#> -------
#> seqinfo: 2 sequences from an unspecified genome; no seqlengths
BinGRanges(
    gRange.gnr = GRange.gnr,
    chromSize.dtf = data.frame(c("chr1", "chr2"), c(350, 100)),
    binSize.num = 100,
    method.chr = "mean",
    variablesName.chr_vec = "score",
    na.rm = TRUE
)
#> GRanges object with 5 ranges and 2 metadata columns:
#> seqnames ranges strand | score bin
#> <Rle> <IRanges> <Rle> | <numeric> <character>
#> [1] chr1 1-100 * | 50 chr1:1
#> [2] chr1 101-200 * | 50 chr1:2
#> [3] chr1 201-300 * | 100 chr1:3
#> [4] chr1 301-350 * | 100 chr1:4
#> [5] chr2 1-100 * | 30 chr2:1
#> -------
#> seqinfo: 2 sequences from an unspecified genome
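The bin coordinates in the output above can be checked by hand. The base R sketch below assumes that bins tile each chromosome in fixed-width steps starting at base 1, with the last bin truncated at the chromosome end, which is consistent with the ranges printed above but is not taken from the package source. `binsOverlapped` is a hypothetical helper, not a package function.

```r
# Hypothetical helper (not part of the package): list the fixed-width bins
# that a single range overlaps, assuming bins start at base 1.
binsOverlapped <- function(start, end, binSize, chromLength) {
    firstBin <- (start - 1) %/% binSize + 1
    lastBin <- (end - 1) %/% binSize + 1
    idx <- seq(firstBin, lastBin)
    data.frame(
        bin = idx,
        start = (idx - 1) * binSize + 1,
        # The last bin is truncated at the chromosome end.
        end = pmin(idx * binSize, chromLength)
    )
}

# Range "c" of the example: chr1:251-330, 100 bp bins, 350 bp chromosome.
binsOverlapped(start = 251, end = 330, binSize = 100, chromLength = 350)
```

Under these assumptions, range "c" falls in bins 3 (201-300) and 4 (301-350) of chr1, the two bins that carry its score of 100 in the example output.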