Find the "best" fully-factorized approximation to the
posterior distribution of the coefficients, with linear regression
likelihood and mixture-of-normals priors on the coefficients. By
"best", we mean the approximating distribution that locally minimizes
the Kullback-Leibler divergence between the approximating distribution
and the exact posterior. In the original formulation (see
varbvs
), each regression coefficient was drawn
identically from a spike-and-slab prior. Here, we instead formulate
the ``slab'' as a mixture of normals.
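A minimal sketch (with illustrative values, not package defaults) of how coefficients are drawn under such a prior; setting the first component's standard deviation to zero recovers the spike-and-slab special case:

# Draw p coefficients from a mixture-of-normals prior with component
# standard deviations "sd" and mixture weights "w" (illustrative
# values only). The zero-variance first component is the "spike".
p  <- 1000
sd <- c(0,0.1,0.5,1)
w  <- c(0.7,0.1,0.1,0.1)
k  <- sample(length(w),p,replace = TRUE,prob = w)
beta <- sd[k] * rnorm(p)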
varbvsmix(X, Z, y, sa, sigma, w, alpha, mu, update.sigma, update.sa,
          update.w, w.penalty, tol = 1e-4, maxiter = 1e4,
          verbose = TRUE)
X: n x p input matrix, where n is the number of samples and p is the number of variables. X cannot be sparse, and cannot have any missing values (NA).

Z: n x m covariate data matrix, where m is the number of covariates. Do not supply an intercept as a covariate (i.e., a column of ones), because an intercept is automatically included in the regression model. For no covariates, set Z = NULL.

y: Vector of length n containing values of the continuous outcome.

sa: Vector specifying the prior variance of the regression coefficients (scaled by sigma) for each mixture component.

sigma: Residual variance parameter. If missing, it is automatically fitted to the data by computing an approximate maximum-likelihood estimate.

w: Mixture weights. If missing, they are automatically fitted to the data by computing an approximate maximum-likelihood estimate.

alpha: Initial estimates of the approximate posterior mixture assignment probabilities. These should be specified as a p x K matrix, where K is the number of mixture components. Each row must add up to 1 (see the sketch following these arguments).

mu: Initial estimates of the approximate regression coefficients conditioned on being drawn from each of the K mixture components. These estimates should be provided as a p x K matrix, where K is the number of mixture components.

update.sigma: If TRUE, the residual variance parameter sigma is fitted to the data.

update.sa: Currently, estimation of the mixture component variances is not implemented, so this must be set to FALSE.

update.w: If TRUE, the mixture weights are fitted to the data.

w.penalty: Penalty term for the mixture weights. It is useful for "regularizing" the estimate of w when the data provide little information about the mixture weights.

tol: Convergence tolerance for co-ordinate ascent updates.

maxiter: Maximum number of co-ordinate ascent iterations.

verbose: If TRUE, print progress of the algorithm to the console.
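As referenced above, a minimal sketch (with hypothetical values) of valid initial estimates for alpha and mu; both are p x K matrices, and each row of alpha must sum to 1:

p <- 500   # number of variables
K <- 3     # number of mixture components
alpha0 <- matrix(1/K,p,K)   # uniform assignment probabilities; rows sum to 1
mu0    <- matrix(0,p,K)     # conditional posterior means initialized at zero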
See https://www.overleaf.com/8954189vvpqnwpxhvhq for more details.
An object with S3 class c("varbvsmix","list") containing the following components:
n: Number of data samples used to fit the model.

mu.cov: Posterior mean regression coefficients for the covariates, including the intercept.

update.sigma: If TRUE, the residual variance parameter sigma was fitted to the data.

update.sa: If TRUE, the mixture variances were fitted to the data.

update.w: If TRUE, the mixture weights were fitted to the data.

w.penalty: Penalty used for updating the mixture weights.

sigma: Fitted or user-specified residual variance parameter.

sa: User-specified mixture variances.

w: Fitted or user-specified mixture weights.

alpha: Variational estimates of the posterior mixture assignment probabilities.

mu: Variational estimates of the posterior mean coefficients.

s: Variational estimates of the posterior variances.

logZ: Variational lower bound to the marginal log-likelihood at each iteration of the co-ordinate ascent algorithm.

err: Maximum difference in the variational posterior probabilities at each iteration of the co-ordinate ascent algorithm.
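As a usage note (this is also what the plotting code in the example below does), the overall posterior mean coefficient estimates can be recovered by averaging the conditional means in mu over the mixture components, weighted by the assignment probabilities in alpha:

# Posterior mean estimates of the regression coefficients, averaging
# over the K mixture components.
beta.est <- rowSums(fit$alpha * fit$mu)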
# Requires the varbvs package (provides randn and varbvsmix).
library(varbvs)

# Generate the data set.
set.seed(1)
n <- 200
p <- 500
X <- randn(n,p)
sd <- c(0,0.2,0.5)
w <- c(0.9,0.05,0.05)
k <- sample(length(w),p,replace = TRUE,prob = w)
beta <- sd[k] * rnorm(p)
y <- c(X %*% beta + rnorm(n))

# Fit the model to the data.
fit <- varbvsmix(X,NULL,y,sd^2)

## Not run: ------------------------------------
# Plot the fitted coefficients against the true coefficients.
# library(lattice)
# print(xyplot(beta.fitted ~ beta.true,
#              data.frame(beta.true   = beta,
#                         beta.fitted = rowSums(fit$alpha * fit$mu)),
#              pch = 20,col = "royalblue",cex = 1))
## ---------------------------------------------