Expression levels recorded for 3,571 genes in 72 patients with leukemia (Golub et al., 1999). The binary outcome encodes the disease subtype: acute lymphoblastic leukemia (ALL) or acute myeloid leukemia (AML).
data(leukemia)
Data are represented as a 72 x 3,571 matrix x of gene expression values, and a vector y of 72 binary disease outcomes.
These are the preprocessed data of Dettling (2004), retrieved from the supplementary materials accompanying Friedman et al. (2010).
M. Dettling (2004). BagBoosting for tumor classification with gene expression data. Bioinformatics 20, 3583--3593.
J. Friedman, T. Hastie and R. Tibshirani (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software 33, 1--22.
T. R. Golub, et al. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286, 531--537.
# See demo.leukemia.R vignette.