This function tests pathways for enrichment. For each pathway (gene set), it assesses whether the pathway's genes are clustered in the embedding space. For more details, see Bai and Chu (2023).

CAESAR.enrich.pathway(
  seu,
  pathway.list,
  reduction = "caesar",
  pathway.cutoff = 3,
  test.type = list("ori", "gen", "wei", "max"),
  k = 5,
  wei.fun = c("weiMax", "weiGeo", "weiArith"),
  perm.num = 0,
  progress_bar = TRUE,
  ncores = 10,
  eta = 1e-04,
  genes.use = NULL,
  parallel = TRUE
)

Arguments

seu

A Seurat object containing the co-embedding.

pathway.list

A list of pathways, where each pathway is a vector of genes (a gene set).
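As a minimal sketch of the expected input shape, a pathway list can be built as a named list of character vectors. The pathway and gene names below are hypothetical placeholders; the use of list names as pathway labels is an assumption based on the row names in the example output.

```r
# Hypothetical pathways: each element is one gene set (a character vector of genes).
pathway.list <- list(
  PATHWAY_A = c("GENE1", "GENE2", "GENE3", "GENE4"),
  PATHWAY_B = c("GENE3", "GENE5"),
  PATHWAY_C = c("GENE1", "GENE4", "GENE6")
)

# Number of genes in each pathway.
pathway.sizes <- lengths(pathway.list)
print(pathway.sizes)
```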

reduction

The name of the embedding (reduction) in the Seurat object to use. Default is "caesar".

pathway.cutoff

The minimum number of genes required in a pathway. Pathways with fewer genes than this threshold are excluded from the enrichment test. Default is 3.

test.type

Type of graph-based test. This must be a list containing elements chosen from "ori", "gen", "wei", and "max", with default `list("ori", "gen", "wei", "max")`. "ori" refers to the robust original edge-count test, "gen" to the robust generalized edge-count test, "wei" to the robust weighted edge-count test, and "max" to the robust max-type edge-count test. For more details, see Bai and Chu (2023).

k

The value of k for the k-minimum spanning tree. Default is 5.

wei.fun

The weight function used in the enrichment test. Default is "weiMax", which returns the inverse of the maximum node degree of an edge. "weiGeo" returns the inverse of the geometric mean of the node degrees, and "weiArith" returns the inverse of the arithmetic mean of the node degrees.
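The three weight definitions can be sketched as follows for an edge joining two nodes with degrees deg_i and deg_j. The helper edge_weight() is a hypothetical illustration written from the descriptions above, not the package's internal code.

```r
# Hypothetical helper: weight of an edge given the degrees of its two end nodes.
edge_weight <- function(deg_i, deg_j,
                        wei.fun = c("weiMax", "weiGeo", "weiArith")) {
  wei.fun <- match.arg(wei.fun)  # first choice ("weiMax") is the default
  switch(wei.fun,
    weiMax   = 1 / pmax(deg_i, deg_j),       # inverse of the maximum degree
    weiGeo   = 1 / sqrt(deg_i * deg_j),      # inverse of the geometric mean
    weiArith = 1 / ((deg_i + deg_j) / 2)     # inverse of the arithmetic mean
  )
}

w_max   <- edge_weight(2, 8, "weiMax")    # 1 / 8 = 0.125
w_geo   <- edge_weight(2, 8, "weiGeo")    # 1 / 4 = 0.25
w_arith <- edge_weight(2, 8, "weiArith")  # 1 / 5 = 0.2
```

All three weights shrink as the node degrees grow, so edges attached to high-degree (hub) nodes contribute less; they differ only in how the two degrees are combined.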

perm.num

The number of permutations used to calculate the p-value. Default is 0, in which case only the approximate p-value based on asymptotic theory is returned. Setting perm.num to a positive value (e.g., perm.num = 1000) additionally computes a permutation-based p-value, though this may be time-consuming.
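As a general illustration of what a permutation-based p-value is (not the package's implementation), the sketch below shuffles group labels perm.num times and compares the permuted statistics with the observed one. The function perm_pvalue(), the difference-in-means statistic, and the simulated data are all hypothetical stand-ins for the graph-based statistics used by the function.

```r
set.seed(1)

# Hypothetical helper: one-sided permutation p-value for a generic statistic.
perm_pvalue <- function(x, labels, stat_fun, perm.num = 1000) {
  observed <- stat_fun(x, labels)
  # Recompute the statistic under random relabelling of the observations.
  permuted <- replicate(perm.num, stat_fun(x, sample(labels)))
  # Add-one correction keeps the estimate strictly positive.
  (1 + sum(permuted >= observed)) / (1 + perm.num)
}

# Stand-in statistic: difference in means between the two groups.
mean_diff <- function(x, labels) mean(x[labels]) - mean(x[!labels])

x <- c(rnorm(20, mean = 2), rnorm(80))   # 20 shifted values, 80 background values
labels <- rep(c(TRUE, FALSE), c(20, 80))

p <- perm_pvalue(x, labels, mean_diff, perm.num = 999)
print(p)
```

The cost grows linearly with perm.num because the statistic is recomputed for every permutation, which is why the asymptotic p-value is the default.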

progress_bar

Logical, TRUE or FALSE, indicating whether a progress bar should be printed during permutations. Default is TRUE.

ncores

The number of cores to use for parallel computing. Default is 10.

eta

A small positive number to ensure matrix inversion stability. Default is 1e-4.

genes.use

A vector of genes. Each pathway is intersected with this gene set before it is tested for enrichment. Default is NULL.
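The combined effect of genes.use and pathway.cutoff can be sketched in base R as below. The gene and pathway names are hypothetical, and applying the size cutoff after the intersection is an assumption; the documentation does not state the order used internally.

```r
# Hypothetical pathways and gene universe.
pathway.list <- list(
  PATHWAY_A = c("GENE1", "GENE2", "GENE3", "GENE4"),
  PATHWAY_B = c("GENE3", "GENE5"),
  PATHWAY_C = c("GENE1", "GENE4", "GENE6")
)
genes.use <- c("GENE1", "GENE2", "GENE3", "GENE4", "GENE5")
pathway.cutoff <- 3

# Restrict every pathway to the genes in genes.use.
restricted <- lapply(pathway.list, intersect, genes.use)

# Keep only pathways that still contain at least pathway.cutoff genes
# (assumed here to be applied after the intersection).
kept <- restricted[lengths(restricted) >= pathway.cutoff]
print(names(kept))
```

Here PATHWAY_B has only two genes and PATHWAY_C drops to two genes after the intersection, so only PATHWAY_A remains.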

parallel

Logical, indicating whether to use parallel computing to speed up computation. Default is TRUE.

Value

A data.frame containing the results of the test with the following columns:

  • asy.ori.statistic - test statistic of the robust original edge-count test.

  • asy.ori.pval - asymptotic-theory-based p-value of the robust original edge-count test.

  • asy.ori.pval.adj - adjusted asymptotic-theory-based p-value of the robust original edge-count test.

  • perm.ori.pval - permutation-based p-value of the robust original edge-count test; present only when permutation-based p-values are calculated.

  • perm.ori.pval.adj - adjusted permutation-based p-value of the robust original edge-count test; present only when permutation-based p-values are calculated.

  • asy.gen.statistic - test statistic of the robust generalized edge-count test.

  • asy.gen.pval - asymptotic-theory-based p-value of the robust generalized edge-count test.

  • asy.gen.pval.adj - adjusted asymptotic-theory-based p-value of the robust generalized edge-count test.

  • perm.gen.pval - permutation-based p-value of the robust generalized edge-count test; present only when permutation-based p-values are calculated.

  • perm.gen.pval.adj - adjusted permutation-based p-value of the robust generalized edge-count test; present only when permutation-based p-values are calculated.

  • asy.wei.statistic - test statistic of the robust weighted edge-count test.

  • asy.wei.pval - asymptotic-theory-based p-value of the robust weighted edge-count test.

  • asy.wei.pval.adj - adjusted asymptotic-theory-based p-value of the robust weighted edge-count test.

  • perm.wei.pval - permutation-based p-value of the robust weighted edge-count test; present only when permutation-based p-values are calculated.

  • perm.wei.pval.adj - adjusted permutation-based p-value of the robust weighted edge-count test; present only when permutation-based p-values are calculated.

  • asy.max.statistic - test statistic of the robust max-type edge-count test.

  • asy.max.pval - asymptotic-theory-based p-value of the robust max-type edge-count test.

  • asy.max.pval.adj - adjusted asymptotic-theory-based p-value of the robust max-type edge-count test.

  • perm.max.pval - permutation-based p-value of the robust max-type edge-count test; present only when permutation-based p-values are calculated.

  • perm.max.pval.adj - adjusted permutation-based p-value of the robust max-type edge-count test; present only when permutation-based p-values are calculated.

  • gene.set.length - the length of the gene set, as shown in the example output.
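The adjustment method behind the .adj columns is not stated here. As an assumption, the sketch below applies base R's p.adjust() with the Benjamini-Hochberg method to the unadjusted p-values printed in the Examples section; this reproduces the adjusted values shown there. With only two pathways the Holm method gives identical numbers, so the match does not identify which method the function uses.

```r
# Unadjusted asymptotic p-values copied from the example output
# (order: GOBP_VASCULATURE_DEVELOPMENT, GOBP_T_CELL_ACTIVATION).
asy.ori.pval <- c(0.03180388, 0.10838588)
asy.gen.pval <- c(9.814855e-06, 4.401508e-09)

# Assumed adjustment: Benjamini-Hochberg across pathways, per test type.
ori.adj <- p.adjust(asy.ori.pval, method = "BH")
gen.adj <- p.adjust(asy.gen.pval, method = "BH")

print(ori.adj)
print(gen.adj)
```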

References

Bai, Y., & Chu, L. (2023). A Robust Framework for Graph-based Two-Sample Tests Using Weights. arXiv preprint arXiv:2307.12325.

Examples

data(toydata)

seu <- toydata$seu
pathway_list <- toydata$pathway_list

CAESAR.enrich.pathway(seu, pathway_list)
#> Only the approximate p-values based on asymptotic theory are calculated as perm.num is set as 0.
#>                              asy.ori.statistic asy.ori.pval asy.gen.statistic
#> GOBP_VASCULATURE_DEVELOPMENT         -1.854919   0.03180388          23.06323
#> GOBP_T_CELL_ACTIVATION               -1.235158   0.10838588          38.48264
#>                              asy.gen.pval asy.wei.statistic asy.wei.pval
#> GOBP_VASCULATURE_DEVELOPMENT 9.814855e-06          4.669963 1.506273e-06
#> GOBP_T_CELL_ACTIVATION       4.401508e-09          5.333820 4.808396e-08
#>                              asy.max.statistic asy.max.pval gene.set.length
#> GOBP_VASCULATURE_DEVELOPMENT          4.669963 4.518814e-06              37
#> GOBP_T_CELL_ACTIVATION                5.333820 1.442519e-07              48
#>                              asy.ori.pval.adj asy.gen.pval.adj asy.wei.pval.adj
#> GOBP_VASCULATURE_DEVELOPMENT       0.06360776     9.814855e-06     1.506273e-06
#> GOBP_T_CELL_ACTIVATION             0.10838588     8.803015e-09     9.616793e-08
#>                              asy.max.pval.adj
#> GOBP_VASCULATURE_DEVELOPMENT     4.518814e-06
#> GOBP_T_CELL_ACTIVATION           2.885038e-07
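
The returned data.frame can be filtered and ranked like any other. The sketch below rebuilds two columns of the output above by hand, so that it runs without the package, and keeps pathways whose asy.wei.pval.adj is below 0.05. The 0.05 threshold and the choice of the weighted test are illustrative assumptions.

```r
# Two columns copied from the example output above.
res <- data.frame(
  asy.wei.pval.adj = c(1.506273e-06, 9.616793e-08),
  gene.set.length  = c(37, 48),
  row.names = c("GOBP_VASCULATURE_DEVELOPMENT", "GOBP_T_CELL_ACTIVATION")
)

# Keep pathways below the (assumed) 0.05 threshold, most significant first.
sig <- res[res$asy.wei.pval.adj < 0.05, , drop = FALSE]
sig <- sig[order(sig$asy.wei.pval.adj), , drop = FALSE]
print(rownames(sig))
```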