Multi Omics Clustering using Seurat Multi Modal Graph-based Clustering

Performs graph-based clustering of cells with the Seurat package, from one or two log R ratio matrices (mat_list). A shared nearest neighbors (SNN) graph is built on the selected PCA dimensions (dims_list), and clusters of cells are identified for each specified resolution (res_range).

Usage

cluster_seurat(
  mat_list,
  res_range = seq(0.1, 0.5, 0.1),
  dims_list = rep(list(1:8), length(mat_list)),
  algorithm = 1,
  knn_seurat = 20,
  knn_range_seurat = 200,
  max_dim = 200,
  quiet = FALSE
)

Arguments

mat_list: A named list of log R ratio matrices (cells x features), one per omic layer (list).
res_range: A non-negative numeric vector of resolution values to use for Seurat::FindClusters() (numeric vector). Default is c(0.1, 0.2, 0.3, 0.4, 0.5).
dims_list: A list of vectors of PC dimensions to use for each omic (list). Must have the same length as mat_list (e.g., list(1:8) for one omic; list(1:8, 1:8) for two omics). Default is the first 8 dimensions for each provided omic.
algorithm: Integer specifying the modularity optimization algorithm used by Seurat::FindClusters() (1 = original Louvain algorithm; 2 = Louvain algorithm with multilevel refinement; 3 = SLM algorithm; 4 = Leiden algorithm) (integer). The Leiden algorithm requires the leidenalg Python package. Default is 1.
knn_seurat: Integer specifying the number of nearest neighbors used for graph construction, passed to Seurat::FindNeighbors() (k.param) for a single omic or to Seurat::FindMultiModalNeighbors() (k.nn) for two omics (integer). Default is 20.
knn_range_seurat: Integer specifying the approximate number of nearest neighbors to compute for Seurat::FindMultiModalNeighbors() (knn.range) (integer). Default is 200.
max_dim: Integer specifying the maximum number of principal components to be used for PCA computation with stats::prcomp() (integer). Default is 200.
quiet: Logical. If TRUE, suppresses informative messages during execution. Default is FALSE.
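
As an illustration of how mat_list, dims_list and max_dim fit together, the following is a minimal base R sketch, not the package's internal code. The toy matrices, the layer names (ATAC, RNA) and the way max_dim is capped through the rank. argument of stats::prcomp() are assumptions made for the example.

```r
# Toy data standing in for real log R ratio matrices (cells x features).
# Layer names and dimensions are hypothetical.
set.seed(1)
cells <- paste0("cell", 1:30)
mat_list <- list(
  ATAC = matrix(rnorm(30 * 12), nrow = 30,
                dimnames = list(cells, paste0("peak", 1:12))),
  RNA  = matrix(rnorm(30 * 10), nrow = 30,
                dimnames = list(cells, paste0("gene", 1:10)))
)

# One vector of PC indices per omic, same length as mat_list
# (this mirrors the documented default for dims_list).
dims_list <- rep(list(1:8), length(mat_list))

# Assumption: the number of PCs is capped at max_dim, and it can never
# exceed the number of cells or features of the matrix.
max_dim <- 200
pcs <- lapply(mat_list, function(m) {
  stats::prcomp(m, rank. = min(max_dim, dim(m)))
})

# Keep only the selected dimensions of each PCA embedding.
embeddings <- Map(function(pc, dims) pc$x[, dims, drop = FALSE],
                  pcs, dims_list)
```

Each element of embeddings is then a cells x 8 matrix, which is the kind of input a neighbor search would work on.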

Value

A list containing:

params: List of parameters used for clustering (list).
pcs: List of principal component summaries, one per omic (list of summary.prcomp objects; see stats::prcomp).
nn: Nearest neighbors object (Neighbor; see SeuratObject::Neighbor).
graph: Shared nearest neighbors graph (Graph; see SeuratObject::Graph).
dist: Distance matrix derived from the graph (matrix).
umap: UMAP coordinates (matrix).
clusters: A named list of clustering results (vectors of cluster labels) for each value in res_range (list).

Details

For two omics: multimodal integration is performed using Seurat::FindMultiModalNeighbors() (weighted shared nearest neighbors graph). Only common cells between omics are used.
For a single omic: Seurat::FindNeighbors() (shared nearest neighbors graph) is used.
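
The two points above can be sketched roughly as follows. This is a hedged illustration under stated assumptions, not the implementation of cluster_seurat(): the toy matrices are made up, the restriction to common cells is shown with a plain intersect() on row names, and the single-omic path calls Seurat::FindNeighbors() and Seurat::FindClusters() directly on a PCA embedding. The Seurat step only runs if the Seurat package is installed. The two-omics path through Seurat::FindMultiModalNeighbors() needs a Seurat object holding one reduction per omic and is not shown here.

```r
# Toy matrices (cells x features) with partially overlapping cells.
set.seed(42)
atac <- matrix(rnorm(50 * 10), nrow = 50,
               dimnames = list(paste0("cell", 1:50), NULL))
rna  <- matrix(rnorm(40 * 10), nrow = 40,
               dimnames = list(paste0("cell", 11:50), NULL))

# Keep only the cells present in both omics, in the same order.
common_cells <- intersect(rownames(atac), rownames(rna))
atac <- atac[common_cells, , drop = FALSE]
rna  <- rna[common_cells, , drop = FALSE]

# Single-omic path on one layer: PCA embedding, then SNN graph,
# then one clustering per resolution.
pca <- stats::prcomp(atac, rank. = 8)$x

if (requireNamespace("Seurat", quietly = TRUE)) {
  # On a matrix with row names, FindNeighbors() returns a list
  # holding the kNN graph ($nn) and the SNN graph ($snn).
  graphs <- Seurat::FindNeighbors(pca, k.param = 20, verbose = FALSE)

  # On a graph, FindClusters() returns a data frame with one column
  # of cluster labels per resolution value.
  clusters <- Seurat::FindClusters(graphs$snn,
                                   resolution = c(0.1, 0.5),
                                   algorithm = 1,
                                   verbose = FALSE)
}
```

With random toy data the cluster labels carry no meaning; the sketch is only meant to show the order of the steps and the shape of the objects involved.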

Examples

if (FALSE) { # \dontrun{
# Load example muscadet object
data(muscadet_obj)

# Format input
# transpose matrices to: cells x features matrices
mat_list <- lapply(muscadet::matLogRatio(muscadet_obj), t)

# Run integration & clustering
result <- cluster_seurat(mat_list, res_range = c(0.1, 0.3, 0.5))

# View results
lapply(result$clusters, table)
} # }