Annotate Cells Using Distance Matrix and Marker Frequencies

This function annotates cells based on a cell-gene distance matrix and marker gene frequencies. It computes the average distance between each cell and each cell type, optionally calculates a confidence level for each prediction, and optionally computes cell-type mixing proportions.

annotation_mat(
  distce,
  marker.freq,
  gene.use = NULL,
  cal.confidence = TRUE,
  cal.proportions = TRUE,
  parallel = TRUE,
  ncores = 10,
  n_fake = 1001,
  seed = 1,
  threshold = 0.95,
  unassign = "unassigned"
)

Arguments

distce: A matrix of distances between cells (or spots) and genes. Rows represent genes, and columns represent cells. Typically obtained from the output of `ProFAST::pdistance` with the CAESAR co-embedding as input.
marker.freq: A matrix where rows represent cell types, and columns represent marker genes. The values represent the frequency or weight of each marker gene for each cell type. Typically the output of `markerList2mat`.
gene.use: A character vector specifying which genes to use for the annotation. If `NULL`, all genes in `distce` will be used. Default is `NULL`.
cal.confidence: Logical, indicating whether to calculate the confidence of the predictions. Default is `TRUE`.
cal.proportions: Logical, indicating whether to calculate the mixing proportions of cell types for each spot. Default is `TRUE`.
parallel: Logical, indicating whether to run the confidence calculation in parallel. Default is `TRUE`.
ncores: The number of cores to use for parallel computation. Default is 10.
n_fake: The number of fake (randomized) distance matrices to simulate for confidence calculation. Default is 1001.
seed: The random seed for reproducibility. Default is 1.
threshold: A numeric value specifying the confidence threshold below which a cell is labeled as `unassigned`. Default is 0.95.
unassign: A character string representing the label to assign to cells below the confidence threshold. Default is `"unassigned"`.
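
How `threshold` and `unassign` interact can be illustrated with a minimal base R sketch. The vectors below are made-up values rather than output of `annotation_mat`, and the strict "less than" comparison is an assumption based on the wording "below the confidence threshold", not a statement about the package's internals.

```r
# Hypothetical predictions and confidence values (illustrative only,
# not produced by annotation_mat()).
pred <- c("CAFs", "Myeloid", "Endothelial", "CAFs")
confidence <- c(0.99, 0.80, 0.95, 0.40)

threshold <- 0.95
unassign <- "unassigned"

# Assumption: cells whose confidence is strictly below the threshold
# are relabeled with the `unassign` string; all others keep their label.
pred_unassign <- ifelse(confidence < threshold, unassign, pred)
print(pred_unassign)
```

Under this assumption, a cell whose confidence equals the threshold exactly keeps its predicted label.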

Value

A list with the following components:

ave.dist: A matrix of average distances between each cell and each cell type.
confidence: A numeric vector of confidence values for each cell (if `cal.confidence = TRUE`).
pred: A character vector of predicted cell types for each cell.
pred_unassign: A character vector of predicted cell types with cells below the confidence threshold labeled as `unassigned` (if `cal.confidence = TRUE`).
cell_mixing_proportions: A matrix of mixing proportions for each spot across the different cell types (if `cal.proportions = TRUE`).
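
The relationship between `distce`, `marker.freq`, `ave.dist`, and `pred` can be sketched on toy data. This is only an illustration under stated assumptions: it takes the average distance to be a marker-weighted mean over the shared genes and the prediction to be the cell type with the smallest average distance. The package's actual computation may differ, and the toy gene, cell, and cell type names are hypothetical.

```r
# Toy inputs (hypothetical values, not from toydata).
genes <- paste0("gene", 1:4)
cells <- paste0("cell", 1:3)

# distce: rows are genes, columns are cells.
distce <- matrix(c(1, 5, 5,
                   1, 5, 5,
                   5, 1, 5,
                   5, 1, 1),
                 nrow = 4, byrow = TRUE, dimnames = list(genes, cells))

# marker.freq: rows are cell types, columns are marker genes.
marker.freq <- matrix(c(1, 1, 0, 0,
                        0, 0, 1, 1),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(c("TypeA", "TypeB"), genes))

# Restrict to genes present in both inputs (mirrors the role of gene.use).
gene.use <- intersect(rownames(distce), colnames(marker.freq))

# Assumption: normalize marker weights so they sum to 1 per cell type.
w <- marker.freq[, gene.use, drop = FALSE]
w <- w / rowSums(w)

# Weighted mean distance: result is cells x cell types, like ave.dist.
ave.dist <- t(distce[gene.use, , drop = FALSE]) %*% t(w)

# Assumption: the predicted type is the one with the smallest average distance.
pred <- colnames(ave.dist)[apply(ave.dist, 1, which.min)]
print(ave.dist)
print(pred)
```

The resulting `ave.dist` has one row per cell and one column per cell type, which matches the layout shown for `ave.dist` in the example output of this page.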

Examples

data(toydata)

seu <- toydata$seu
markers <- toydata$markers

seu <- ProFAST::pdistance(seu, reduction = "caesar")
#> Calculate co-embedding distance...
distce <- Seurat::GetAssayData(object = seu, slot = "data", assay = "distce")

marker.freq <- markerList2mat(list(markers))

anno_res <- annotation_mat(distce, marker.freq, cal.confidence = FALSE, cal.proportions = FALSE)
str(anno_res)
#> List of 5
#>  $ ave.dist               : num [1:3000, 1:8] 28.4 24.9 27 25 34.2 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:3000] "24387" "4049" "11570" "25172" ...
#>   .. ..$ : chr [1:8] "CAFs" "Cancer Epithelial" "Endothelial" "Myeloid" ...
#>  $ confidence             : NULL
#>  $ pred                   : chr [1:3000] "Cancer Epithelial" "Cancer Epithelial" "Cancer Epithelial" "Cancer Epithelial" ...
#>  $ pred_unassign          : NULL
#>  $ cell_mixing_proportions: NULL
