Create a muscomic object containing coverage and
(optionally) allelic count information for a single-cell omic dataset.
Usage
CreateMuscomicObject(
type = c("ATAC", "RNA"),
mat_counts,
features,
allele_counts = NULL,
label.omic = NULL,
label.features = NULL
)Arguments
- type
Type of single cell omic, either "ATAC" or "RNA" (
characterstring).- mat_counts
Matrix of raw counts cells x features (
matrixordgCMatrix). Rows are cells and columns are features. (they must correspond to the id column offeatures)- features
Data frame of features (peaks, genes...) coordinates on genome (
data.frame). It should contain 4 columns:CHROMChromosome names in character format, e.g. "15", "X" (
character).startStart positions (
integer).endEnd positions (
integer).idUnique identifiers, e.g. gene name "CDH1" or peak identifier CHROM_start_end "1_1600338_1600838" (
character). It should match the feature identifiers as row names ofmat_counts.
- allele_counts
Data frame of allele counts at variant positions per cell (
data.frame). Variant positions can be either common single nucleotide polymorphisms (SNPs) positions or individual-specific heterozygous positions retrieved from bulk sequencing. The data frame format is based on the Variant Calling Format (VCF), thereby it must contain the following columns :cell,CHROM,POS,REF,ALT,RD,AD. Seeexdata_allele_counts()for details.- label.omic
Label for the single cell omic (
characterstring). By default "scATAC-seq" is used for "ATAC" type and "scRNA-seq" for "RNA" type.- label.features
Label for features (
characterstring). By default "peaks" is used for "ATAC" type and "genes" for "RNA" type.
Value
A muscomic object.
Examples
# Minimal example
mat <- Matrix::sparseMatrix(
i = c(1, 1, 2),
j = c(1, 2, 1),
x = c(5, 3, 2),
dims = c(2, 2),
dimnames = list(c("cell1", "cell2"), c("f1", "f2"))
)
features <- data.frame(
CHROM = c("1", "1"),
start = c(100, 200),
end = c(150, 250),
id = c("f1", "f2")
)
allele <- data.frame(
cell = c("cell1", "cell2"),
CHROM = c("1", "1"),
POS = c(100, 200),
REF = c("A", "G"),
ALT = c("C", "T"),
RD = c(10, 5),
AD = c(3, 1)
)
muscomic <- CreateMuscomicObject(
type = "ATAC",
mat_counts = mat,
features = features,
allele_counts = allele
)
# On the example dataset
atac <- CreateMuscomicObject(
type = "ATAC",
mat_counts = exdata_mat_counts_atac_tumor,
allele_counts = exdata_allele_counts_atac_tumor,
features = exdata_peaks
)
atac
#> A muscomic object
#> type: ATAC
#> label: scATAC-seq
#> cells: 71
#> counts: 71 cells x 1200 features (peaks)
#> logratio: None
#> variant positions: 681
# without allele counts data (not required for clustering step)
atac2 <- CreateMuscomicObject(
type = "ATAC",
mat_counts = exdata_mat_counts_atac_tumor,
features = exdata_peaks
)
atac2
#> A muscomic object
#> type: ATAC
#> label: scATAC-seq
#> cells: 71
#> counts: 71 cells x 1200 features (peaks)
#> logratio: None
#> variant positions: 0
