Performs copy number alteration (CNA) analysis on a muscadet
object by processing allelic and coverage counts across clusters and
evaluating cell fractions and copy numbers.
Usage
cnaCalling(
x,
omics.coverage = NULL,
depthmin.a.clusters = 30,
depthmin.c.clusters = 50,
depthmin.a.allcells = 30,
depthmin.c.allcells = 50,
depthmin.nor = 5,
depthmax.nor = 1000,
het.thresh = 0.25,
snp.nbhd = 250,
hetscale = TRUE,
cval1 = 25,
cval2 = 150,
min.nhet = 5,
clonal.thresh = 0.9,
dist.breakpoints = 1e+06,
ploidy = "auto",
quiet = FALSE
)
Source
This function uses several functions from the facets package, including facets::clustersegs(), facets::emcncf(), facets::findDiploidLogR(), facets::fitcncf(), facets::procSample(), and facets::procSnps(), as well as the adapted function preProcSample2().
Seshan VE, Shen R (2021). facets: Cellular Fraction and Copy Numbers from Tumor Sequencing. R package version 0.6.2, https://github.com/mskcc/facets.
Arguments
- x
A muscadet object. Must contain:
Clustering assignments in the cnacalling$clusters slot (use assignClusters()).
Combined allelic and coverage counts per cluster in the cnacalling$combined.counts slot (use mergeCounts()).
- omics.coverage
A vector of omics names to select for coverage log R ratio data. Recommended: select "ATAC" when both ATAC and RNA omics are available, because the ATAC coverage (DNA) signal is less noisy than the RNA signal. By default, NULL selects all available data.
- depthmin.a.clusters
Minimum allelic depth per cluster in tumor cells (default: 30).
- depthmin.c.clusters
Minimum coverage depth per cluster in tumor cells (default: 50).
- depthmin.a.allcells
Minimum allelic depth for all tumor cells (default: 30).
- depthmin.c.allcells
Minimum coverage depth for all tumor cells (default: 50).
- depthmin.nor
Minimum coverage depth for normal sample (default: 5).
- depthmax.nor
Maximum coverage depth for normal sample (default: 1000).
- het.thresh
VAF (variant allele frequency) threshold to call variant positions heterozygous for preProcSample2() (default: 0.25).
- snp.nbhd
Window size for selecting SNP loci to reduce serial correlation for preProcSample2() (default: 250).
- hetscale
Logical value indicating whether the log odds ratio (logOR) should be scaled to give it more weight in the test statistics for segmentation and clustering in preProcSample2() (default: TRUE).
- cval1
Critical value for segmentation for preProcSample2() (default: 25).
- cval2
Critical value for segmentation for facets::procSample() (default: 150).
- min.nhet
Minimum number of heterozygous positions in a segment for facets::procSample() and facets::emcncf() (default: 5).
- clonal.thresh
Threshold of minimum cell proportion to label a segment as clonal (default: 0.9).
- dist.breakpoints
Minimum distance between breakpoints to define distinct segments (default: 1e6).
- ploidy
Ploidy assumption: "auto", "median", or a numeric value (default: "auto").
- quiet
Logical. If TRUE, suppresses informative messages during execution (default: FALSE).
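The depth and heterozygosity arguments above can be illustrated on a toy table. The following is a minimal sketch in base R, not the package's implementation: the data are invented, the column names are borrowed from the positions output described on this page, and the exact filtering rules (which counts each threshold is compared against, and whether bounds are inclusive) are assumptions made for illustration.

```r
# Toy positions table; values are invented for illustration only.
pos <- data.frame(
  signal  = c("allelic", "allelic", "coverage", "coverage", "allelic"),
  rCountT = c(45, 12, 80, 30, 60),
  rCountN = c(20, 40, 3, 200, 1500),
  vafN    = c(0.48, 0.10, NA, NA, 0.55)
)

# Thresholds named after the cnaCalling() arguments (defaults from Usage).
depthmin.a   <- 30
depthmin.c   <- 50
depthmin.nor <- 5
depthmax.nor <- 1000
het.thresh   <- 0.25

# Assumed rule: tumor depth is compared with the allelic or the coverage
# minimum, depending on the signal type of the position.
min_tumor <- ifelse(pos$signal == "allelic", depthmin.a, depthmin.c)
tumor_ok  <- pos$rCountT >= min_tumor

# Assumed rule: normal depth must lie within [depthmin.nor, depthmax.nor].
normal_ok <- pos$rCountN >= depthmin.nor & pos$rCountN <= depthmax.nor

pos$pass <- tumor_ok & normal_ok

# Assumed rule: a position is heterozygous when its normal VAF is more than
# het.thresh away from both 0 and 1. Positions without a VAF are not het.
pos$het <- !is.na(pos$vafN) & pmin(pos$vafN, 1 - pos$vafN) > het.thresh

print(pos)
```

With the example call at the bottom of this page, the thresholds are lowered (for instance depthmin.a.clusters = 3) because the example data are shallow; in this sketch that would correspond to lowering depthmin.a and depthmin.c so that more rows pass.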
Value
A modified muscadet object with added CNA analysis results in the cnacalling slot, including: filtered counts and positions, segmentation data for clusters and all cells, consensus segments across clusters based on breakpoints, diploid log R ratio, purity and ploidy.
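One plausible reading of how dist.breakpoints could turn per cluster breakpoints into consensus breakpoints is sketched below in base R. This is an assumption for illustration, not the package's algorithm: the breakpoint positions are invented, and the grouping rule (chain breakpoints closer than dist.breakpoints, then take the group median) is hypothetical.

```r
# Hypothetical breakpoint positions (bp) on one chromosome, pooled over clusters.
bp <- sort(c(10.2e6, 10.6e6, 35.0e6, 35.4e6, 36.1e6, 80.0e6))
dist.breakpoints <- 1e6

# Start a new group whenever the gap to the previous breakpoint is at least
# dist.breakpoints; closer breakpoints are chained into the same group.
group <- cumsum(c(TRUE, diff(bp) >= dist.breakpoints))

# Summarise each group by its median position (an arbitrary choice here).
consensus <- as.numeric(tapply(bp, group, median))
print(consensus)
```

Note that chaining is transitive: in this toy input, 35.0e6 and 36.1e6 are more than 1e6 apart but end up in the same group because 35.4e6 sits between them. A stricter rule would need an explicit maximum group width.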
Details of the cnacalling slot:
- combined.counts.filtered: Filtered counts per cluster.
- combined.counts.allcells: Counts summed for all cells (no cluster distinction).
- combined.counts.allcells.filtered: Filtered counts summed for all cells (no cluster distinction).
- positions: Data frame of positions from the per cluster analysis. Positions in rows and associated data in columns: chrom, maploc (position), rCountT (read count in tumor), rCountN (read count in normal), vafT (variant allele frequency in tumor), vafN (variant allele frequency in normal), cluster (cluster id), signal (whether the counts come from coverage or allelic data), het (heterozygous status), keep (whether to keep position), gcpct (GC percentage), gcbias (GC bias correction), cnlr (log R ratio), valor (log odds ratio), lorvar (variance of log odds ratio), seg0, seg_ori (segment original id within each cluster), seg (segment id), segclust (cluster of segments id), vafT.allcells (variant allele frequency in all tumor cells), colVAR (integer for allelic position color depending on vafT.allcells).
- segments: Data frame of segments from the per cluster analysis. Segments in rows and associated data in columns: chrom, seg (segment id), num.mark (number of positions in segment), nhet (number of heterozygous positions in segment), cnlr.median (segment log R ratio median), mafR (segment square of expected log odds ratio), vafT.median (segment variant allele frequency median), cluster (cluster id), seg_ori (segment original id within each cluster), segclust (cluster of segments id), cnlr.median.clust (segment cluster log R ratio median), mafR.clust (segment cluster square of expected log odds ratio), cf (cell fraction), tcn (total copy number), lcn (lower copy number), start, end, cf.em (cell fraction computed with EM algorithm), tcn.em (total copy number computed with EM algorithm), lcn.em (lower copy number computed with EM algorithm).
- positions.allcells: Same as positions but from the all cells analysis.
- segments.allcells: Same as segments but from the all cells analysis.
- consensus.segs: Data frame of unique consensus segments across clusters, with the cna (logical) and cna_clonal (logical) information.
- table: Data frame of consensus segments across clusters with associated information per cluster in columns: chrom, start, end, id, cluster, cf.em (cell fraction computed with EM algorithm), tcn.em (total copy number computed with EM algorithm), lcn.em (lower copy number computed with EM algorithm), ncells (number of cells in cluster), prop.cluster (proportion of cells per cluster), gnl (gain; neutral; loss: 1; 0; -1), loh (loss of heterozygosity status), state (state of segments), cna (whether the segment is a CNA), cna_state (state of CNA segments), prop.tot (proportion of cells with the same state per segment), state_clonal (state of the segment if its prop.tot is above clonal.thresh), cna_clonal (whether the segment is a clonal CNA), cna_clonal_state (state of clonal CNA segments).
- ncells: Vector of number of cells per cluster.
- dipLogR.clusters: Diploid log R ratio from the per cluster analysis.
- dipLogR.allcells: Diploid log R ratio from the all cells analysis.
- purity.clusters: Purity from the per cluster analysis.
- purity.allcells: Purity from the all cells analysis.
- ploidy.clusters: Ploidy from the per cluster analysis.
- ploidy.allcells: Ploidy from the all cells analysis.
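The relationship between prop.cluster, prop.tot and clonal.thresh described for the table element can be sketched as follows. This is a minimal base R illustration on an invented table, assuming that prop.tot is the summed prop.cluster of the clusters sharing a state within a segment; the state labels ("gain", "loss", "neu") and the is_clonal column are hypothetical names, not necessarily those produced by cnaCalling().

```r
# Invented table: two consensus segments (id) observed in three clusters.
tab <- data.frame(
  id           = c(1, 1, 1, 2, 2, 2),
  cluster      = c(1, 2, 3, 1, 2, 3),
  prop.cluster = c(0.5, 0.3, 0.2, 0.5, 0.3, 0.2),
  state        = c("gain", "gain", "gain", "loss", "neu", "loss"),
  stringsAsFactors = FALSE
)
clonal.thresh <- 0.9

# Assumed definition: proportion of cells sharing the same state in a segment.
tab$prop.tot <- ave(tab$prop.cluster, tab$id, tab$state, FUN = sum)

# A state is labelled clonal when its prop.tot is above clonal.thresh.
tab$is_clonal <- tab$prop.tot > clonal.thresh

print(tab)
```

In this toy input, segment 1 carries the same state in all clusters and is labelled clonal, whereas segment 2 is split between two states (0.7 and 0.3 of cells), so neither state passes the 0.9 threshold.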
References
Shen R, Seshan VE. FACETS: allele-specific copy number and clonal heterogeneity analysis tool for high-throughput DNA sequencing. Nucleic Acids Res. 2016 Sep 19;44(16):e131. doi: 10.1093/nar/gkw520.
Examples
library("facets")
#> Loading required package: pctGCdata
# Load example muscadet object
data(muscadet_obj)
muscadet_obj <- cnaCalling(
  muscadet_obj,
  omics.coverage = "ATAC",
  depthmin.a.clusters = 3, # set low thresholds for example data
  depthmin.c.clusters = 5,
  depthmin.a.allcells = 3,
  depthmin.c.allcells = 5,
  depthmin.nor = 0
)
#> Selecting coverage data from omic(s): ATAC
#> Filtering positions per clusters based on provided filters...
#> Performing segmentation per cluster...
#> Finding diploid log R ratio on clusters...
#> Diploid log R ratio = -0.566520511224907
#> Computing cell fractions and copy numbers on clusters...
#> Filtering positions on all cells based on provided filters...
#> Performing segmentation on all cells...
#> Computing cell fractions and copy numbers on all cells...
#> Finding consensus segments between clusters...