Identifies signature genes for each cell type or cell group in a Seurat object using co-embedding distances. For each group, it computes the mean distance of every gene to the cells in that group, together with the proportion of cells expressing the gene inside the group and in all other groups.
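As a rough illustration of the group-wise distance summary described above, the following base R sketch averages a genes-by-cells distance matrix within each group. The matrix `dist_mat`, the labels `groups`, and all values are hypothetical; this is not the package's implementation, only one plausible way to compute a per-group mean distance.

```r
# Hypothetical genes x cells distance matrix (illustration only)
dist_mat <- matrix(
  c(1, 2, 9, 8,
    7, 9, 2, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), paste0("cell", 1:4))
)
# Hypothetical group label for each cell (column)
groups <- c("g1", "g1", "g2", "g2")

# Mean distance of every gene to the cells of each group
mean_dist <- sapply(split(seq_along(groups), groups), function(idx) {
  rowMeans(dist_mat[, idx, drop = FALSE])
})
print(mean_dist)
```

In this toy input, `geneA` is on average closer to the cells of `g1` and `geneB` to the cells of `g2`.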

find.sig.genes(
  seu,
  distce.assay = "distce",
  ident = NULL,
  expr.prop.cutoff = 0.1,
  assay = NULL,
  genes.use = NULL
)

Arguments

seu

A Seurat object containing gene expression data.

distce.assay

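A character string specifying the name of the assay that stores the co-embedding distances between genes and cells (in the Examples, these are computed with `ProFAST::pdistance` before calling this function). Default is "distce".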

ident

A character string specifying the column name in the `meta.data` slot of the Seurat object used to define the identities (clusters or cell groups). If `NULL`, the default identities (`Idents(seu)`) will be used. Default is `NULL`.

expr.prop.cutoff

A numeric value specifying the minimum proportion of cells that must express a gene for it to be considered. Default is 0.1.

assay

A character string specifying the assay to use for expression data. If `NULL`, the default assay of the Seurat object will be used. Default is `NULL`.

genes.use

A character vector specifying the genes to use for the analysis. If `NULL`, all genes in the `distce.assay` assay will be used. Default is `NULL`.
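To make the role of `expr.prop.cutoff` more concrete, here is a minimal base R sketch of how an expression proportion could be computed and compared with the cutoff. It assumes that a gene counts as expressed in a cell when its value is greater than zero and that the comparison is strict; both are assumptions, and how the function itself treats genes below the cutoff is not stated here. The matrix `counts` and the labels `groups` are hypothetical.

```r
# Hypothetical genes x cells count matrix and group labels (illustration only)
counts <- matrix(
  c(3, 0, 1, 0, 0, 0,
    0, 0, 0, 5, 2, 1,
    1, 0, 0, 0, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("cell", 1:6))
)
groups <- c("g1", "g1", "g1", "g2", "g2", "g2")
expr.prop.cutoff <- 0.1

in_group <- groups == "g1"
# Assumption: a gene is "expressed" in a cell when its count is above zero
expr.prop <- rowMeans(counts[, in_group, drop = FALSE] > 0)
expr.prop.others <- rowMeans(counts[, !in_group, drop = FALSE] > 0)

# Flag genes whose in-group proportion exceeds the cutoff (assumed strict)
passes <- expr.prop > expr.prop.cutoff
print(data.frame(expr.prop, expr.prop.others, passes))
```

For group `g1`, `geneA` and `geneC` exceed the cutoff, while `geneB` is not expressed in any cell of the group.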

Value

A list where each element corresponds to a cell group and contains a data frame with the following columns:

distance

The mean co-embedding distance between the gene and the cells in the group.

expr.prop

The proportion of cells in the group expressing the gene.

expr.prop.others

The proportion of cells in other groups expressing the gene.

label

The identity label of the cell group.

gene

The gene name.
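Because the return value is a plain list of data frames, it can be post-processed with base R. The sketch below builds a small mock list with the documented columns (values copied from the first rows of the `str()` output in the Examples) and keeps the genes with the smallest distance in each group. The helper `top_n_by_distance` is hypothetical, and treating a smaller distance as a stronger association with the group is an assumption.

```r
# Mock of the documented return structure; values taken from the Examples output
sglist <- list(
  CAFs = data.frame(
    distance = c(23.5, 23.5, 23.6),
    expr.prop = c(0.551, 0.644, 0.575),
    expr.prop.others = c(0.0693, 0.1257, 0.1257),
    label = "CAFs",
    gene = c("FBLN1", "CCDC80", "PDGFRA")
  ),
  Endothelial = data.frame(
    distance = c(22.9, 23.0, 23.1),
    expr.prop = c(0.476, 0.306, 0.755),
    expr.prop.others = c(0.01552, 0.00902, 0.04583),
    label = "Endothelial",
    gene = c("HOXD9", "EGFL7", "VWF")
  )
)

# Hypothetical helper: keep the n genes with the smallest distance in a group
top_n_by_distance <- function(df, n = 2) {
  df <- df[order(df$distance), , drop = FALSE]
  head(df, n)
}

# Apply the helper to every group and stack the results into one data frame
top <- do.call(rbind, lapply(sglist, top_n_by_distance, n = 2))
rownames(top) <- NULL
print(top[, c("label", "gene", "distance")])
```

Stacking the per-group results into one data frame keeps the `label` column, so the group of each gene remains identifiable after the merge.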

Examples

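# Load the bundled toy dataset and extract its Seurat object
data(toydata)
seu <- toydata$seu

# Compute co-embedding distances from the "caesar" reduction
seu <- ProFAST::pdistance(seu, reduction = "caesar")
#> Calculate co-embedding distance...

# Find signature genes for each group, using the default identities
sglist <- find.sig.genes(seu = seu)
str(sglist)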
#> List of 8
#>  $ CAFs             :'data.frame':	302 obs. of  5 variables:
#>   ..$ distance        : num [1:302] 23.5 23.5 23.6 23.6 23.6 ...
#>   ..$ expr.prop       : num [1:302] 0.551 0.644 0.575 0.57 0.302 ...
#>   ..$ expr.prop.others: num [1:302] 0.0693 0.1257 0.1257 0.0819 0.051 ...
#>   ..$ label           : chr [1:302] "CAFs" "CAFs" "CAFs" "CAFs" ...
#>   ..$ gene            : chr [1:302] "FBLN1" "CCDC80" "PDGFRA" "PTGDS" ...
#>  $ Cancer Epithelial:'data.frame':	302 obs. of  5 variables:
#>   ..$ distance        : num [1:302] 16.8 16.8 16.8 16.8 16.8 ...
#>   ..$ expr.prop       : num [1:302] 0.352 0.883 0.903 0.956 0.936 ...
#>   ..$ expr.prop.others: num [1:302] 0.0674 0.2273 0.2422 0.3245 0.2845 ...
#>   ..$ label           : chr [1:302] "Cancer Epithelial" "Cancer Epithelial" "Cancer Epithelial" "Cancer Epithelial" ...
#>   ..$ gene            : chr [1:302] "LYPD3" "MLPH" "AGR3" "ESR1" ...
#>  $ Endothelial      :'data.frame':	302 obs. of  5 variables:
#>   ..$ distance        : num [1:302] 22.9 23 23.1 23.1 23.1 ...
#>   ..$ expr.prop       : num [1:302] 0.476 0.306 0.755 0.271 0.537 ...
#>   ..$ expr.prop.others: num [1:302] 0.01552 0.00902 0.04583 0.01732 0.04006 ...
#>   ..$ label           : chr [1:302] "Endothelial" "Endothelial" "Endothelial" "Endothelial" ...
#>   ..$ gene            : chr [1:302] "HOXD9" "EGFL7" "VWF" "SOX17" ...
#>  $ Myeloid          :'data.frame':	302 obs. of  5 variables:
#>   ..$ distance        : num [1:302] 27.6 27.7 27.8 27.9 27.9 ...
#>   ..$ expr.prop       : num [1:302] 0.567 0.525 0.571 0.882 0.542 ...
#>   ..$ expr.prop.others: num [1:302] 0.0442 0.0453 0.0206 0.122 0.0529 ...
#>   ..$ label           : chr [1:302] "Myeloid" "Myeloid" "Myeloid" "Myeloid" ...
#>   ..$ gene            : chr [1:302] "HAVCR2" "IGSF6" "ITGAX" "FCER1G" ...
#>  $ Normal Epithelial:'data.frame':	302 obs. of  5 variables:
#>   ..$ distance        : num [1:302] 22.8 23 23.1 23.1 23.1 ...
#>   ..$ expr.prop       : num [1:302] 0.635 0.368 0.401 0.153 0.554 ...
#>   ..$ expr.prop.others: num [1:302] 0.1389 0.042 0.0813 0.0219 0.0969 ...
#>   ..$ label           : chr [1:302] "Normal Epithelial" "Normal Epithelial" "Normal Epithelial" "Normal Epithelial" ...
#>   ..$ gene            : chr [1:302] "KRT5" "KRT6B" "KRT16" "C5orf46" ...
#>  $ PVL              :'data.frame':	302 obs. of  5 variables:
#>   ..$ distance        : num [1:302] 23.1 23.5 23.5 23.7 23.7 ...
#>   ..$ expr.prop       : num [1:302] 0.679 0.321 0.643 0.214 0.536 ...
#>   ..$ expr.prop.others: num [1:302] 0.2254 0.0195 0.1881 0.034 0.2524 ...
#>   ..$ label           : chr [1:302] "PVL" "PVL" "PVL" "PVL" ...
#>   ..$ gene            : chr [1:302] "CAV1" "AVPR1A" "ZEB1" "TCEAL7" ...
#>  $ Plasmablasts     :'data.frame':	302 obs. of  5 variables:
#>   ..$ distance        : num [1:302] 33.2 33.5 34.1 34.3 36.1 ...
#>   ..$ expr.prop       : num [1:302] 0.586 0.448 0.897 0.552 0.483 ...
#>   ..$ expr.prop.others: num [1:302] 0.00337 0.0074 0.0101 0.0074 0.01077 ...
#>   ..$ label           : chr [1:302] "Plasmablasts" "Plasmablasts" "Plasmablasts" "Plasmablasts" ...
#>   ..$ gene            : chr [1:302] "CD79A" "DERL3" "MZB1" "TNFRSF17" ...
#>  $ T-cells          :'data.frame':	302 obs. of  5 variables:
#>   ..$ distance        : num [1:302] 30.1 30.2 30.4 31 31 ...
#>   ..$ expr.prop       : num [1:302] 0.592 0.493 0.887 0.563 0.437 ...
#>   ..$ expr.prop.others: num [1:302] 0.01332 0.00922 0.03175 0.02287 0.01127 ...
#>   ..$ label           : chr [1:302] "T-cells" "T-cells" "T-cells" "T-cells" ...
#>   ..$ gene            : chr [1:302] "CD3G" "CD3D" "CD3E" "CCL5" ...