
This vignette introduces the HierFabs workflow for analyzing the skin cutaneous melanoma (SKCM) dataset downloaded from The Cancer Genome Atlas (TCGA), which contains disease outcomes, environmental factors, and high-dimensional gene expression measurements. The goal of the analysis is to identify interactions that are associated with the prognosis of SKCM.

We demonstrate the use of HierFabs on the SKCM data, which are hosted in the HierFabs GitHub repository and can be downloaded to the current working directory with the following commands:

githubURL <- "https://github.com/XiaoZhangryy/HierFabs/blob/master/vignettes_data/cleaned_SKCM_TCGA_Data.rda?raw=true"
download.file(githubURL, "cleaned_SKCM_TCGA_Data.rda", mode = "wb")
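Re-running the vignette repeats the download every time. A minimal sketch of a guard that skips the download when the file is already present is given below; `download_if_missing` is a hypothetical helper name and is not part of HierFabs.

```r
# Hypothetical helper (not part of HierFabs): download a file only if it is
# not already present at `destfile`.
download_if_missing <- function(url, destfile) {
  if (file.exists(destfile)) {
    message("Using existing file: ", destfile)
    return(invisible(FALSE))
  }
  # mode = "wb" keeps the binary .rda file intact on Windows
  download.file(url, destfile, mode = "wb")
  invisible(TRUE)
}
```

With the URL defined above, `download_if_missing(githubURL, "cleaned_SKCM_TCGA_Data.rda")` would fetch the file on the first run only.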

The outcome of interest is the (log-transformed) Breslow’s thickness, a continuous variable that has been suggested as a clinicopathologic feature of cutaneous melanoma. We prescreen the genes by the p-value of a marginal linear model and keep the top 2,000 genes for downstream analysis. To identify gene-gene (GG) interactions under the weak hierarchy, we need to fit a high-dimensional linear model with 2,003,000 covariates.
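The prescreening step is already applied in the cleaned data, so it does not need to be run here. For readers who want to see the idea, the following is a sketch on simulated data, assuming samples in rows and genes in columns; the variable names and the helper `marginal_pvalue` are illustrative, and this is not necessarily the exact procedure used to prepare the SKCM file.

```r
# Simulated stand-ins for the real data (assumption: samples in rows, genes in columns).
set.seed(1)
n <- 100
p <- 50
X <- matrix(rnorm(n * p), n, p,
            dimnames = list(NULL, paste0("gene", seq_len(p))))
y <- 2 * X[, 3] - 1.5 * X[, 7] + rnorm(n)

# p-value of the slope in the marginal model y ~ x, fitted one gene at a time.
marginal_pvalue <- function(x, y) {
  summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
}
pvals <- apply(X, 2, marginal_pvalue, y = y)

# Keep the genes with the smallest p-values (the vignette keeps 2,000).
n_keep <- 10
keep <- names(sort(pvals))[seq_len(n_keep)]
X_screened <- X[, keep, drop = FALSE]
```

Screening on marginal p-values is cheap because each model has a single predictor, but it can miss genes that matter only through interactions.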

The package can be loaded with the command:

library(HierFabs)

Then load the dataset into R:

load("cleaned_SKCM_TCGA_Data.rda")
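`load()` restores objects under the names they were saved with, so it can silently overwrite variables that already exist in the workspace. A small sketch of loading into a separate environment to inspect a file first is shown below; the toy file is a stand-in, and its structure (a list called `data` with elements `gexp` and `Y`) is inferred from how this vignette uses the real file.

```r
# Toy stand-in for the real file; its structure is an assumption.
toy_file <- tempfile(fileext = ".rda")
data <- list(gexp = matrix(rnorm(6), 3, 2), Y = rnorm(3))
save(data, file = toy_file)
rm(data)

# Load into a fresh environment rather than the global workspace.
env <- new.env()
loaded <- load(toy_file, envir = env)  # load() returns the restored object names
print(loaded)
str(env$data, max.level = 1)
```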

Fit LM-GG

Fit a linear model with gene-gene interactions under the weak hierarchy constraint. The response is the log-transformed Breslow’s thickness.

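# Gene expression matrix and log-transformed Breslow's thickness
Genes <- as.matrix(data$gexp)
Y <- data$Y
# Linear (gaussian) model under weak hierarchy, with BIC as the selection criterion
fit <- HierFabs(Genes, Y, eps = 0.01, hier = "weak", model = "gaussian", diagonal = TRUE, criteria = "BIC")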
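The figure of 2,003,000 covariates quoted earlier can be reproduced by hand if one assumes that `diagonal = TRUE` adds each gene's quadratic term to the pairwise interactions. That reading is an inference from the count, not something the vignette states.

```r
p <- 2000                  # genes kept after prescreening
n_main <- p                # main effects
n_pairs <- choose(p, 2)    # interactions between distinct genes
n_quadratic <- p           # gene-by-itself terms (assumed to come from diagonal = TRUE)
n_total <- n_main + n_pairs + n_quadratic
format(n_total, big.mark = ",")
```

The candidate set grows quadratically in the number of genes, which is why the prescreening to 2,000 genes matters for the size of the problem.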

Then, we can use the print function to show the result.

print(fit)
#> 10 x 12 sparse Matrix of class "dgCMatrix"
#>       main effect SMC3    RABGEF1  MLLT3  INPP1    SNX3     LINC00442 LPIN3 
#> SLC8A1   -0.04617 .       .        .       .       .        .         .      
#> DPYD     -0.02318 .       .        .       .       .        .         .      
#> PHIP     -0.00977 .       .        .       .       .        .         .      
#> SLC40A1  -0.01117 0.01167 .        .       .       .        .         .      
#> PARD6G    0.00822 .       0.03023  .       .       .        .         .      
#> TMEM159   0.00773 .       .       -0.01449 0.03722 .        .         .      
#> STAMBPL1 -0.01088 .       .        .       .       0.04428  .         .      
#> INPP5K    0.00809 .       .        .       .       .       -0.05904   .      
#> SERP2     0.00886 .       .        .       .       .        .         0.02399
#> NR2F1    -0.01133 .       .        .       .       .        .         .      
#>          LINC00482 GLIPR2   VPS37B  SLAMF7 
#> SLC8A1   .         .        .       .      
#> DPYD     .         .        .       .      
#> PHIP     .         .        .       .      
#> SLC40A1  .         .        .       .      
#> PARD6G   .         .        .       .      
#> TMEM159  .         .        .       .      
#> STAMBPL1 .         .        .       .      
#> INPP5K   .         .        .       .      
#> SERP2    0.00313  -0.04368  .       .      
#> NR2F1    .        .       -0.01841 0.12726
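In the printed table, each row is a gene, the first column holds its main effect, and the remaining columns hold its interactions with the named partner genes. As a sketch of how such a table could be turned into a list of selected interactions, the block below rebuilds two rows of the output by hand; `coefs` is a hypothetical object, and whether HierFabs exposes the table in this form is an assumption.

```r
# Two rows copied from the printed output above; zeros stand for the "." entries.
coefs <- matrix(0, nrow = 2, ncol = 3,
                dimnames = list(c("SLC40A1", "NR2F1"),
                                c("main effect", "SMC3", "SLAMF7")))
coefs["SLC40A1", c("main effect", "SMC3")] <- c(-0.01117, 0.01167)
coefs["NR2F1", c("main effect", "SLAMF7")] <- c(-0.01133, 0.12726)

# Long-format table of the nonzero interaction terms.
inter <- coefs[, -1, drop = FALSE]
idx <- which(inter != 0, arr.ind = TRUE)
interactions <- data.frame(
  gene = rownames(inter)[idx[, "row"]],
  partner = colnames(inter)[idx[, "col"]],
  estimate = inter[idx],
  row.names = NULL
)
print(interactions)

# Weak hierarchy: a selected interaction needs at least one parent gene
# with a nonzero main effect. Here the row gene is that parent.
stopifnot(all(coefs[interactions$gene, "main effect"] != 0))
```

Partner genes such as SMC3 and SLAMF7 do not appear among the rows of the printed output, which appears consistent with weak hierarchy: only one of the two genes in an interaction needs a nonzero main effect.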

Session information

sessionInfo()
#> R version 4.1.3 (2022-03-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19043)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_United States.1252 
#> [2] LC_CTYPE=English_United States.1252   
#> [3] LC_MONETARY=English_United States.1252
#> [4] LC_NUMERIC=C                          
#> [5] LC_TIME=English_United States.1252    
#> system code page: 936
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] Matrix_1.5-1   HierFabs_0.1.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] rstudioapi_0.13   knitr_1.38        magrittr_2.0.3    lattice_0.20-45  
#>  [5] R6_2.5.1          ragg_1.2.2        rlang_1.0.6       fastmap_1.1.0    
#>  [9] stringr_1.4.1     tools_4.1.3       grid_4.1.3        xfun_0.30        
#> [13] cli_3.2.0         jquerylib_0.1.4   htmltools_0.5.3   systemfonts_1.0.4
#> [17] yaml_2.3.5        digest_0.6.29     rprojroot_2.0.3   pkgdown_2.0.6    
#> [21] textshaping_0.3.6 purrr_0.3.4       formatR_1.12      sass_0.4.2       
#> [25] fs_1.5.2          memoise_2.0.1     cachem_1.0.6      evaluate_0.15    
#> [29] rmarkdown_2.13    stringi_1.7.8     compiler_4.1.3    bslib_0.4.0      
#> [33] desc_1.4.1        jsonlite_1.8.2