R/caesar_annotation.R
annotation_mat.Rd
This function annotates cells based on a cell-gene distance matrix and marker gene frequencies. It computes the average distance between each cell and each cell type, and optionally calculates a confidence level for each prediction and the cell-type mixing proportions of each cell.
annotation_mat(
distce,
marker.freq,
gene.use = NULL,
cal.confidence = TRUE,
cal.proportions = TRUE,
parallel = TRUE,
ncores = 10,
n_fake = 1001,
seed = 1,
threshold = 0.95,
unassign = "unassigned"
)
distce
A matrix of distances between genes and cells (spots). Rows represent genes, and columns represent cells. Typically, it is derived from the output of ProFAST::pdistance with the CAESAR co-embedding as input.
marker.freq
A matrix where rows represent cell types and columns represent marker genes. The values represent the frequency or weight of each marker gene for each cell type. Typically, it is the output of markerList2mat.
gene.use
A character vector specifying which genes to use for the annotation. If `NULL`, all genes in `distce` will be used. Default is `NULL`.
cal.confidence
Logical, indicating whether to calculate the confidence of the predictions. Default is `TRUE`.
cal.proportions
Logical, indicating whether to calculate the mixing proportions of cell types for each spot. Default is `TRUE`.
parallel
Logical, indicating whether to run the confidence calculation in parallel. Default is `TRUE`.
ncores
The number of cores to use for parallel computation. Default is 10.
n_fake
The number of fake (randomized) distance matrices to simulate for the confidence calculation. Default is 1001.
seed
The random seed for reproducibility. Default is 1.
threshold
A numeric value specifying the confidence threshold below which a cell is labeled with the value of `unassign`. Default is 0.95.
unassign
A character string giving the label to assign to cells below the confidence threshold. Default is `"unassigned"`.
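To illustrate how `threshold` and `unassign` relate to the confidence values, the following is a minimal sketch in base R. The `pred` and `confidence` vectors are made-up stand-ins, not output of `annotation_mat`, and the strict `<` comparison at the boundary is an assumption based on the phrase "below the confidence threshold"; the package's internal rule may differ.

```r
# Hypothetical stand-ins for predicted labels and per-cell confidence values
pred <- c("CAFs", "Myeloid", "Endothelial", "CAFs")
confidence <- c(0.99, 0.80, 0.95, 0.40)

threshold <- 0.95          # same default as in the usage above
unassign <- "unassigned"   # same default as in the usage above

# Assumption: cells with confidence strictly below the threshold are relabeled
pred_unassign <- ifelse(confidence < threshold, unassign, pred)
print(pred_unassign)
```

Under this reading, a lower `threshold` keeps more of the original predictions, while a higher one relabels more cells.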
A list with the following components:
ave.dist
A matrix of average distances between each cell and each cell type.
confidence
A numeric vector of confidence values for each cell (if `cal.confidence = TRUE`; otherwise `NULL`).
pred
A character vector of predicted cell types for each cell.
pred_unassign
A character vector of predicted cell types in which cells below the confidence threshold are labeled with the value of `unassign` (if `cal.confidence = TRUE`; otherwise `NULL`).
cell_mixing_proportions
A matrix of mixing proportions for each spot across the different cell types (if `cal.proportions = TRUE`; otherwise `NULL`).
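The average-distance matrix and the predicted labels can be pictured with a small base R sketch. It assumes that each cell type's average distance is a marker-weighted mean of the gene-to-cell distances and that the predicted type is the one with the smallest average distance. This is an illustrative reading of the description, not necessarily the package's exact computation, and the toy `distce` and `marker.freq` below are invented for the example.

```r
# Toy inputs (hypothetical): 4 genes x 3 cells, and 2 cell types x 4 genes
genes <- paste0("g", 1:4)
cells <- paste0("cell", 1:3)
distce <- matrix(
  c(1, 2, 9, 8,   # cell1: close to g1 and g2
    9, 8, 1, 2,   # cell2: close to g3 and g4
    2, 1, 7, 9),  # cell3: close to g1 and g2
  nrow = 4, dimnames = list(genes, cells)
)
marker.freq <- rbind(TypeA = c(1, 1, 0, 0), TypeB = c(0, 0, 1, 1))
colnames(marker.freq) <- genes

# Normalize each cell type's marker weights so that they sum to 1
w <- marker.freq / rowSums(marker.freq)

# Cells x cell types: weighted mean distance from each cell to each type's markers
ave.dist <- t(distce[colnames(w), , drop = FALSE]) %*% t(w)

# Assumption: the predicted type is the one with the smallest average distance
pred <- colnames(ave.dist)[apply(ave.dist, 1, which.min)]
print(ave.dist)
print(pred)
```

The result has one row per cell and one column per cell type, the same orientation as the average-distance component described above.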
marker.select
for selecting markers.
find.sig.genes
for obtaining the signature gene list.
markerList2mat
for building the marker frequency matrix.
pdistance
for obtaining the cell-gene distance matrix from the co-embedding.
data(toydata)
seu <- toydata$seu
markers <- toydata$markers
seu <- ProFAST::pdistance(seu, reduction = "caesar")
#> Calculate co-embedding distance...
distce <- Seurat::GetAssayData(object = seu, slot = "data", assay = "distce")
marker.freq <- markerList2mat(list(markers))
anno_res <- annotation_mat(distce, marker.freq, cal.confidence = FALSE, cal.proportions = FALSE)
str(anno_res)
#> List of 5
#> $ ave.dist : num [1:3000, 1:8] 28.4 24.9 27 25 34.2 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:3000] "24387" "4049" "11570" "25172" ...
#> .. ..$ : chr [1:8] "CAFs" "Cancer Epithelial" "Endothelial" "Myeloid" ...
#> $ confidence : NULL
#> $ pred : chr [1:3000] "Cancer Epithelial" "Cancer Epithelial" "Cancer Epithelial" "Cancer Epithelial" ...
#> $ pred_unassign : NULL
#> $ cell_mixing_proportions: NULL