HierFabs: SKCM Data Analysis with Cox-GE model
Xiao Zhang
2022-10-22
Source: vignettes/HierFabs.GoxGE.Rmd
This vignette introduces the HierFabs workflow for the analysis of the skin cutaneous melanoma (SKCM) dataset downloaded from The Cancer Genome Atlas (TCGA), which contains disease outcomes, environmental factors, and high-dimensional gene expressions. The goal of the analysis is to identify gene-environment interactions that are associated with the prognosis of SKCM.
We demonstrate the use of HierFabs on the SKCM data, which are hosted in the HierFabs GitHub repository and can be downloaded to the current working directory with the following commands:
githubURL <- "https://github.com/XiaoZhangryy/HierFabs/blob/master/vignettes_data/cleaned_menaloma_Data.rda?raw=true"
download.file(githubURL, "cleaned_menaloma_Data.rda", mode = "wb")
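When the vignette is rerun, it can be convenient to skip the download if the file is already present. The following is a minimal sketch of that idea; `fetch_if_missing` is a hypothetical helper written for this illustration and is not part of HierFabs. It keeps `mode = "wb"`, which matters on Windows because the `.rda` file is binary.

```r
# Hypothetical helper (not part of HierFabs): download `url` to `dest`
# only when `dest` does not exist yet. Returns TRUE if a download was
# attempted and FALSE if the existing file was reused.
fetch_if_missing <- function(url, dest) {
  if (file.exists(dest)) {
    return(invisible(FALSE))
  }
  download.file(url, dest, mode = "wb")  # binary mode for .rda files
  invisible(TRUE)
}
```

With this helper, the download step could be written as `fetch_if_missing(githubURL, "cleaned_menaloma_Data.rda")`.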
The outcome of interest is overall survival. After removing samples with missing survival times and genes with minimal expression variation, we obtain 17,944 gene expressions on 253 patients. Of these, the top 2,000 genes are selected by screening for downstream analysis (Jiang et al. 2016). Each gene expression is standardized to have mean zero and standard deviation one.
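The downloaded data are already cleaned, so none of this preprocessing needs to be repeated. For readers who want to see what the filtering and standardization steps look like, here is a small sketch on a simulated matrix. The toy data, the object names, and the variance threshold are assumptions made for illustration; the screening step that selects the top 2,000 genes is not reproduced because its criterion is not described here.

```r
set.seed(1)

# Hypothetical toy expression matrix: 10 samples (rows) by 6 genes (columns).
G_raw <- matrix(rnorm(60, mean = 5, sd = 2), nrow = 10,
                dimnames = list(NULL, paste0("gene", 1:6)))
G_raw[, "gene6"] <- 5  # make one gene constant across samples

# Drop genes with (near-)zero variance; the threshold is illustrative only.
keep <- apply(G_raw, 2, var) > 1e-8
G_kept <- G_raw[, keep, drop = FALSE]

# Standardize each remaining gene to mean zero and standard deviation one.
G_std <- scale(G_kept)

# Sanity checks: column means close to 0, column standard deviations equal to 1.
round(colMeans(G_std), 10)
apply(G_std, 2, sd)
```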
The package can be loaded with the command:
library(HierFabs)
Then load the dataset into R:
load("cleaned_menaloma_Data.rda")
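Note that `load()` restores objects under the names they were saved with, which is why the code that follows can refer to `data` without an explicit assignment. The sketch below illustrates this on a toy file. The list contents are made up, and only the field names (`y`, `status`, `E`, `G`) mirror those used in this vignette. Loading into a separate environment is an optional precaution so that nothing in the workspace is overwritten.

```r
# Toy stand-in for the real .rda file, written to a temporary location.
tmp <- tempfile(fileext = ".rda")
local({
  data <- list(y = c(5.2, 3.1, 7.4), status = c(1, 0, 1),
               E = data.frame(age = c(60, 45, 71)),
               G = data.frame(gene1 = c(0.3, -1.1, 0.8)))
  save(data, file = tmp)
})

# Load into a fresh environment instead of the global workspace.
env <- new.env()
restored <- load(tmp, envir = env)

restored                      # names of the objects the file contained
str(env$data, max.level = 1)  # top-level structure of the restored list
```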
Fit Cox-GE
Fit a Cox model with gene-environment interactions under the weak hierarchy constraint.
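y <- data$y  # observed survival time
status <- data$status  # status indicator accompanying the survival time
E <- as.matrix(data$E)  # environmental factors
G <- as.matrix(data$G)  # gene expressions
# Cox model, weak hierarchy, BIC as the selection criterion
fit <- HierFabs(E, y, G, model = "cox", eps = 0.01, hier = "weak", status = status,
                diagonal = TRUE, max_s = 40, criteria = "BIC")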
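To make the structure of a Cox model with a gene-environment interaction concrete, the following is a minimal sketch on simulated data with one environmental factor and one gene, fitted without any penalty by `coxph` from the survival package. It is not the HierFabs algorithm, which handles the high-dimensional penalized case. The variable names (`E1`, `G1`, `y_toy`, `status_toy`) and all simulation settings are illustrative assumptions.

```r
library(survival)

set.seed(42)
n <- 200
E1 <- rbinom(n, 1, 0.5)  # toy binary environmental factor
G1 <- rnorm(n)           # toy standardized gene expression

# Linear predictor with an E main effect and a G-E interaction (no G main effect).
lp <- 0.8 * E1 - 0.5 * E1 * G1

# Exponential event times with independent exponential censoring.
event_time <- rexp(n, rate = exp(lp))
censor_time <- rexp(n, rate = 0.3)
y_toy <- pmin(event_time, censor_time)
status_toy <- as.integer(event_time <= censor_time)  # 1 = event observed

# Unpenalized Cox model with both main effects and their interaction.
toy_fit <- coxph(Surv(y_toy, status_toy) ~ E1 + G1 + E1:G1)
coef(toy_fit)
```

In this toy setting the three coefficients can be estimated directly. With thousands of genes, the number of main effects and interactions exceeds the sample size, which is the situation the penalized HierFabs fit above is meant for.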
We can then use the print function to show the result stored in fit.
print(fit)
#> 5 x 20 sparse Matrix of class "dgCMatrix"
#>                                main effect    CREG1    ZNF25     NXT2     MDP1
#> main effect                              . -0.00814        .        .        .
#> tumor_status                       0.92192        . -0.17667 -0.05527 -0.11836
#> breslow_thickness_at_diagnosis     0.00207        .        .        .        .
#> age_at_diagnosis                   0.00500        .        .        .        .
#> new_tumor_event_dx_indicator             . -0.03493        .        .        .
#>
#>                                 RPS6KA3  TBC1D19    CSMD1   IFITM1    TBRG1
#> main effect                           .        .        .        .        .
#> tumor_status                   -0.01062 -0.05482 -0.01383 -0.10996  -0.0125
#> breslow_thickness_at_diagnosis        .        .        .        .        .
#> age_at_diagnosis                      .        .        .        .        .
#> new_tumor_event_dx_indicator          .        .        .        .        .
#>
#>                                    CARS    PTGR2   TSG101    CCT6B   VPS26A
#> main effect                           .        .        .        .        .
#> tumor_status                   -0.13057 -0.01511 -0.02738 -0.01543 -0.03612
#> breslow_thickness_at_diagnosis        .        .        .        .        .
#> age_at_diagnosis                      .        .        .        .        .
#> new_tumor_event_dx_indicator          .        .        .        .        .
#>
#>                                  ADAM21   TMEM27    SNX14     TMC2    GNL3L
#> main effect                           .        .        .        .        .
#> tumor_status                   -0.01176 -0.01233        .        .        .
#> breslow_thickness_at_diagnosis        .        . -0.00461 -0.12921        .
#> age_at_diagnosis                      .        .        .        .    8e-04
#> new_tumor_event_dx_indicator          .        .        .        .        .
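The printed matrix can be read as follows, assuming the layout suggested by its labels: the row named "main effect" holds gene main effects, the column named "main effect" holds environmental main effects, every other cell is a gene-environment interaction, and a dot marks a zero coefficient. Under weak hierarchy, a nonzero interaction needs at least one nonzero parent main effect. The sketch below retypes a small excerpt of the output by hand as a dense matrix and checks that property. The object names are illustrative, and the check is not a HierFabs function.

```r
# Excerpt of the printed coefficients, retyped by hand.
# Assumed layout: first row = gene main effects, first column =
# environmental main effects, remaining cells = G-E interactions.
B <- matrix(0, nrow = 3, ncol = 3,
            dimnames = list(c("main effect", "tumor_status",
                              "new_tumor_event_dx_indicator"),
                            c("main effect", "CREG1", "ZNF25")))
B["main effect", "CREG1"] <- -0.00814
B["tumor_status", "main effect"] <- 0.92192
B["tumor_status", "ZNF25"] <- -0.17667
B["new_tumor_event_dx_indicator", "CREG1"] <- -0.03493

gene_main <- B[1, -1] != 0                # which genes have a main effect
env_main <- B[-1, 1] != 0                 # which E factors have a main effect
inter <- B[-1, -1, drop = FALSE] != 0     # which interactions are nonzero

# Weak hierarchy: every nonzero interaction has at least one nonzero parent.
weak_ok <- all(outer(env_main, gene_main, "|")[inter])
# Strong hierarchy would require both parents to be nonzero.
strong_ok <- all(outer(env_main, gene_main, "&")[inter])

c(weak = weak_ok, strong = strong_ok)
```

In this excerpt the interaction between tumor_status and ZNF25 is supported only by the tumor_status main effect, and the interaction between new_tumor_event_dx_indicator and CREG1 only by the CREG1 main effect, so the excerpt satisfies weak but not strong hierarchy.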
Session information
sessionInfo()
#> R version 4.1.3 (2022-03-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19043)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=English_United States.1252
#> [2] LC_CTYPE=English_United States.1252
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.1252
#> system code page: 936
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] Matrix_1.5-1 HierFabs_0.1.0
#>
#> loaded via a namespace (and not attached):
#> [1] rstudioapi_0.13 knitr_1.38 magrittr_2.0.3 lattice_0.20-45
#> [5] R6_2.5.1 ragg_1.2.2 rlang_1.0.6 fastmap_1.1.0
#> [9] stringr_1.4.1 tools_4.1.3 grid_4.1.3 xfun_0.30
#> [13] cli_3.2.0 jquerylib_0.1.4 htmltools_0.5.3 systemfonts_1.0.4
#> [17] yaml_2.3.5 digest_0.6.29 rprojroot_2.0.3 pkgdown_2.0.6
#> [21] textshaping_0.3.6 purrr_0.3.4 formatR_1.12 sass_0.4.2
#> [25] fs_1.5.2 memoise_2.0.1 cachem_1.0.6 evaluate_0.15
#> [29] rmarkdown_2.13 stringi_1.7.8 compiler_4.1.3 bslib_0.4.0
#> [33] desc_1.4.1 jsonlite_1.8.2