This function identifies signature genes for each cell type or cell group in a Seurat object using a co-embedding distance-based approach. It computes the average expression and distance metrics for each gene across different groups, while also considering expression proportions.
find.sig.genes(
  seu,
  distce.assay = "distce",
  ident = NULL,
  expr.prop.cutoff = 0.1,
  assay = NULL,
  genes.use = NULL
)
seu: A Seurat object containing gene expression data.
distce.assay: A character string specifying the assay that contains the distance matrix or distance-related data. Default is "distce".
ident: A character string specifying the column name in the `meta.data` slot of the Seurat object used to define the identities (clusters or cell groups). If `NULL`, the default identities (`Idents(seu)`) will be used. Default is `NULL`.
expr.prop.cutoff: A numeric value specifying the minimum proportion of cells that must express a gene for it to be considered. Default is 0.1.
assay: A character string specifying the assay to use for expression data. If `NULL`, the default assay of the Seurat object will be used. Default is `NULL`.
genes.use: A character vector specifying the genes to use for the analysis. If `NULL`, all genes in the `distce.assay` assay will be used. Default is `NULL`.
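To illustrate what an expression-proportion cutoff filters on, the sketch below computes, for a small made-up count matrix, the proportion of cells in each group with a nonzero count for each gene, then keeps the genes at or above a cutoff. It uses base R only. The matrix `counts`, the vector `groups`, and the helper `expr_prop_by_group` are hypothetical, and treating "expressing" as a count greater than zero within each group is an assumption, not a description of the function's internals.

```r
# Toy count matrix: 4 genes x 6 cells (made-up values, for illustration only).
counts <- matrix(
  c(3, 0, 1, 0, 0, 0,
    0, 0, 0, 2, 5, 1,
    1, 1, 1, 1, 1, 1,
    0, 0, 0, 0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:6))
)
# Hypothetical group label for each cell (column) of `counts`.
groups <- c("A", "A", "A", "B", "B", "B")

# Hypothetical helper: fraction of cells per group with a nonzero count.
# Returns a genes x groups matrix.
expr_prop_by_group <- function(counts, groups) {
  sapply(split(seq_len(ncol(counts)), groups), function(idx) {
    rowMeans(counts[, idx, drop = FALSE] > 0)
  })
}

prop <- expr_prop_by_group(counts, groups)

# Keep genes expressed in at least half of the cells of group "A"
# (assumed semantics of a minimum-proportion cutoff).
cutoff <- 0.5
keep_in_A <- rownames(prop)[prop[, "A"] >= cutoff]
```

In this toy setting a larger cutoff keeps fewer genes per group, and a cutoff of 0 keeps every gene.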
A list where each element corresponds to a cell group and contains a data frame with the following columns:
distance: The mean distance of the gene across the cells in the group.
expr.prop: The proportion of cells in the group expressing the gene.
expr.prop.others: The proportion of cells in other groups expressing the gene.
label: The identity label of the cell group.
gene: The gene name.
data(toydata)
seu <- toydata$seu
seu <- ProFAST::pdistance(seu, reduction = "caesar")
#> Calculate co-embedding distance...
sglist <- find.sig.genes(
  seu = seu
)
str(sglist)
#> List of 8
#> $ CAFs :'data.frame': 302 obs. of 5 variables:
#> ..$ distance : num [1:302] 23.5 23.5 23.6 23.6 23.6 ...
#> ..$ expr.prop : num [1:302] 0.551 0.644 0.575 0.57 0.302 ...
#> ..$ expr.prop.others: num [1:302] 0.0693 0.1257 0.1257 0.0819 0.051 ...
#> ..$ label : chr [1:302] "CAFs" "CAFs" "CAFs" "CAFs" ...
#> ..$ gene : chr [1:302] "FBLN1" "CCDC80" "PDGFRA" "PTGDS" ...
#> $ Cancer Epithelial:'data.frame': 302 obs. of 5 variables:
#> ..$ distance : num [1:302] 16.8 16.8 16.8 16.8 16.8 ...
#> ..$ expr.prop : num [1:302] 0.352 0.883 0.903 0.956 0.936 ...
#> ..$ expr.prop.others: num [1:302] 0.0674 0.2273 0.2422 0.3245 0.2845 ...
#> ..$ label : chr [1:302] "Cancer Epithelial" "Cancer Epithelial" "Cancer Epithelial" "Cancer Epithelial" ...
#> ..$ gene : chr [1:302] "LYPD3" "MLPH" "AGR3" "ESR1" ...
#> $ Endothelial :'data.frame': 302 obs. of 5 variables:
#> ..$ distance : num [1:302] 22.9 23 23.1 23.1 23.1 ...
#> ..$ expr.prop : num [1:302] 0.476 0.306 0.755 0.271 0.537 ...
#> ..$ expr.prop.others: num [1:302] 0.01552 0.00902 0.04583 0.01732 0.04006 ...
#> ..$ label : chr [1:302] "Endothelial" "Endothelial" "Endothelial" "Endothelial" ...
#> ..$ gene : chr [1:302] "HOXD9" "EGFL7" "VWF" "SOX17" ...
#> $ Myeloid :'data.frame': 302 obs. of 5 variables:
#> ..$ distance : num [1:302] 27.6 27.7 27.8 27.9 27.9 ...
#> ..$ expr.prop : num [1:302] 0.567 0.525 0.571 0.882 0.542 ...
#> ..$ expr.prop.others: num [1:302] 0.0442 0.0453 0.0206 0.122 0.0529 ...
#> ..$ label : chr [1:302] "Myeloid" "Myeloid" "Myeloid" "Myeloid" ...
#> ..$ gene : chr [1:302] "HAVCR2" "IGSF6" "ITGAX" "FCER1G" ...
#> $ Normal Epithelial:'data.frame': 302 obs. of 5 variables:
#> ..$ distance : num [1:302] 22.8 23 23.1 23.1 23.1 ...
#> ..$ expr.prop : num [1:302] 0.635 0.368 0.401 0.153 0.554 ...
#> ..$ expr.prop.others: num [1:302] 0.1389 0.042 0.0813 0.0219 0.0969 ...
#> ..$ label : chr [1:302] "Normal Epithelial" "Normal Epithelial" "Normal Epithelial" "Normal Epithelial" ...
#> ..$ gene : chr [1:302] "KRT5" "KRT6B" "KRT16" "C5orf46" ...
#> $ PVL :'data.frame': 302 obs. of 5 variables:
#> ..$ distance : num [1:302] 23.1 23.5 23.5 23.7 23.7 ...
#> ..$ expr.prop : num [1:302] 0.679 0.321 0.643 0.214 0.536 ...
#> ..$ expr.prop.others: num [1:302] 0.2254 0.0195 0.1881 0.034 0.2524 ...
#> ..$ label : chr [1:302] "PVL" "PVL" "PVL" "PVL" ...
#> ..$ gene : chr [1:302] "CAV1" "AVPR1A" "ZEB1" "TCEAL7" ...
#> $ Plasmablasts :'data.frame': 302 obs. of 5 variables:
#> ..$ distance : num [1:302] 33.2 33.5 34.1 34.3 36.1 ...
#> ..$ expr.prop : num [1:302] 0.586 0.448 0.897 0.552 0.483 ...
#> ..$ expr.prop.others: num [1:302] 0.00337 0.0074 0.0101 0.0074 0.01077 ...
#> ..$ label : chr [1:302] "Plasmablasts" "Plasmablasts" "Plasmablasts" "Plasmablasts" ...
#> ..$ gene : chr [1:302] "CD79A" "DERL3" "MZB1" "TNFRSF17" ...
#> $ T-cells :'data.frame': 302 obs. of 5 variables:
#> ..$ distance : num [1:302] 30.1 30.2 30.4 31 31 ...
#> ..$ expr.prop : num [1:302] 0.592 0.493 0.887 0.563 0.437 ...
#> ..$ expr.prop.others: num [1:302] 0.01332 0.00922 0.03175 0.02287 0.01127 ...
#> ..$ label : chr [1:302] "T-cells" "T-cells" "T-cells" "T-cells" ...
#> ..$ gene : chr [1:302] "CD3G" "CD3D" "CD3E" "CCL5" ...
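A common next step is to pull a short list of top signature genes per group from a list shaped like the one printed above. The sketch below does this in base R on a mock list with the same five columns. The values in `mock_sglist` are made up, the helper `top_sig_genes` and its arguments `n` and `max.prop.others` are hypothetical, and ranking by smallest `distance` after restricting `expr.prop.others` is one plausible selection rule rather than one prescribed by the package.

```r
# Mock result with the same columns as the documented return value
# (made-up numbers and gene names, for illustration only).
mock_sglist <- list(
  GroupA = data.frame(
    distance = c(2.0, 1.0, 3.0),
    expr.prop = c(0.6, 0.7, 0.4),
    expr.prop.others = c(0.05, 0.30, 0.02),
    label = "GroupA",
    gene = c("g1", "g2", "g3"),
    stringsAsFactors = FALSE
  ),
  GroupB = data.frame(
    distance = c(5.0, 4.0),
    expr.prop = c(0.5, 0.9),
    expr.prop.others = c(0.01, 0.08),
    label = "GroupB",
    gene = c("g4", "g5"),
    stringsAsFactors = FALSE
  )
)

# Hypothetical helper: for each group, drop genes that are too widely
# expressed elsewhere, sort by ascending distance, and return the first n genes.
top_sig_genes <- function(sglist, n = 2, max.prop.others = 0.1) {
  lapply(sglist, function(df) {
    df <- df[df$expr.prop.others <= max.prop.others, , drop = FALSE]
    df <- df[order(df$distance), , drop = FALSE]
    head(df$gene, n)
  })
}

top <- top_sig_genes(mock_sglist, n = 2, max.prop.others = 0.1)
```

Applied to a real result, a call such as `top_sig_genes(sglist)` would return a named list with one character vector of gene names per cell group.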