A hierarchical Forward and Backward Stagewise (HierFabs) algorithm for identifying hierachical interaction of genomics data.
Source:R/HierFabs.r
HierFabs.Rd
A hierarchical Forward and Backward Stagewise (HierFabs) algorithm for identifying hierachical interaction of genomics data.
Usage
HierFabs(
G,
y,
E,
weight = NULL,
model = c("gaussian", "cox", "quantile", "logistic"),
back = TRUE,
stoping = TRUE,
eps = 0.01,
xi = 10^-6,
iter = 3000,
lambda.min = NULL,
hier = c("strong", "weak"),
max_s = NULL,
diagonal = FALSE,
status = NULL,
gamma = NULL,
tau = NUL,
criteria = c("EBIC", "BIC")
)
Arguments
- G
Gene matrix, each row is an observation vector.
- y
Response variable. For logistic regression model, y takes value at 1 and -1.
- E
An optional environment matrix. If Z is given, the interactions between environment and gene are of interest. Otherwise, the gene-gene interactions are of interest.
- weight
An optional weights. Default is 1 for each observation.
- model
A character string representing one of the built-in models. 'gaussian' for linear model and 'cox' for cox model.
- back
An indicator of whether to take backward steps.
- stoping
An indicator of whether to stop iteration when lambda is less than lambda.min.
- eps
Step size. Default is 0.01.
- xi
A tolerate to improve program stability. Default is 10^-6.
- iter
Maximum number of outer-loop iterations allowed. Default is 3000.
- lambda.min
Smallest value for lambda, as a fraction of lambda.max.
- hier
Whether to enforce strong or weak heredity. Default is 'strong'.
- max_s
Limit the maximum number of variables in the model. When exceed this limit, program will early stopped.
- diagonal
An indicator of whether to include "pure" quadratic terms. Work when gene-gene interactions are of interest.
- status
A censoring indicator.
- gamma
A tuning parameter in EBIC.
- tau
parameter for quantile regression.
- criteria
The criteria used to select the optimal solution.
Value
A list.
theta - The coefficients of covariates, each column is a solution.
beta - The optimal solution.
lambda - Lambda sequence.
direction - Update indicator. 1 means take a forward step, -1 means take a backward step.
iter - Number of iterations.
EBIC - EBIC for each solution.
loss - loss for each solution.
df - Number of nonzero coefficients.
opt - Position of the optimal tuning based on EBIC.
intercept - The intercept term, which appearance is due to standardization.
Examples
set.seed(0)
n = 500
p = 100
x = matrix(rnorm(n*p),n,p)
eta = x[,1:4] %*% rep(1,4) + 3*x[,1]*x[,2] + 3*x[,1]*x[,4]
y = eta + rnorm(n)
xtest = matrix(rnorm(n*p),n,p)
eta.test = xtest[,1:4] %*% rep(1,4) + 3*xtest[,1]*xtest[,2] + 3*xtest[,1]*xtest[,4]
ytest = eta.test + rnorm(n)
fit.gg.strong = HierFabs(x, y)
y.pred.gg.s = predict(fit.gg.strong, xtest, ytest)
y.pred.gg.s$mse
#> [1] 1.060471
print(fit.gg.strong)
#> 4 x 3 sparse Matrix of class "dgCMatrix"
#> main effect G2 G4
#> G1 0.86902 2.94975 2.95852
#> G2 0.84403 . .
#> G3 0.83112 . .
#> G4 0.93636 . .
## Weak hierarchy
fit.gg.weak = HierFabs(x, y, hier="weak")
y.pred.gg.w = predict(fit.gg.weak, xtest, ytest)
y.pred.gg.w$mse
#> [1] 1.107333
print(fit.gg.weak)
#> 4 x 3 sparse Matrix of class "dgCMatrix"
#> main effect G2 G4
#> G1 0.83871 2.92996 2.93195
#> G2 0.81424 . .
#> G3 0.80107 . .
#> G4 0.90827 . .
## Cox model with Gene-Environment interactions
pz = 10
z = matrix(rnorm(n*pz),n,pz)
eta.ge = x[,1:4] %*% rep(1,4) + z[,1] + z[,2] + 3*x[,1]*z[,1] + 3*x[,2]*z[,2]
err = log(rweibull(n, shape = 1, scale = 1))
y0 = exp(-eta.ge + err)
cens = quantile(y0, 0.9)
y.ge = pmin(y0, cens)
status = 1 * (y0<=cens)
ztest = matrix(rnorm(n*pz),n,pz)
eta.ge.test = rowSums(xtest[,1:4]) + ztest[,1] + ztest[,2]
eta.ge.test = eta.ge.test + 3*xtest[,1]*ztest[,1] + 3*xtest[,2]*ztest[,2]
err.test = log(rweibull(n, shape = 1, scale = 1))
y0.test = exp(-eta.ge.test + err.test)
cens = quantile(y0.test, 0.9)
y.ge.test = pmin(y0.test, cens)
status.test = 1 * (y0.test<=cens)
fit.ge.strong = HierFabs(x, y.ge, z, model="cox", status=status)
y.pred.ge.s = predict(fit.ge.strong, xtest, y.ge.test, ztest, status.test)
y.pred.ge.s$c.index
#> [1] 0.914657
print(fit.ge.strong)
#> 7 x 3 sparse Matrix of class "dgCMatrix"
#> main effect E1 E2
#> main effect . 0.63192 0.64792
#> G1 0.60630 1.88923 .
#> G2 0.54614 . 1.92648
#> G3 0.64086 . .
#> G4 0.60864 . .
#> G24 0.02830 . .
#> G53 0.02939 . .
## Weak hierarchy
fit.ge.weak = HierFabs(x, y.ge, z, model="cox", status=status, hier="weak")
y.pred.ge.w = predict(fit.ge.weak, xtest, y.ge.test, ztest, status.test)
y.pred.ge.w$c.index
#> [1] 0.9148998
print(fit.ge.weak)
#> 9 x 3 sparse Matrix of class "dgCMatrix"
#> main effect E1 E2
#> main effect . 0.61186 0.61755
#> G1 0.57598 1.82142 .
#> G2 0.52628 . 1.86564
#> G3 0.62083 . .
#> G4 0.58055 . .
#> G24 0.01887 . .
#> G53 0.01959 . .
#> G32 . 0.00994 .
#> G59 . 0.02020 .