Peter Carbonetto, Ph.D.

 
github  google profile  linkedin 

Publications

Youngseok Kim, Peter Carbonetto, Matthew Stephens and Mihai Anitescu. A fast algorithm for maximum likelihood estimation of mixture proportions using sequential quadratic programming. To appear in Journal of Computational and Graphical Statistics.
R package  code & data

Luìs Felipe Ventorim Ferrão, Romário Gava Ferrão, Maria Amélia Gava Ferrão, Aymbiré Fonseca, Peter Carbonetto, Matthew Stephens and Antonio Augusto Franco Garcia. Accurate genomic prediction of Coffea canephora in multiple environments using whole-genome statistical models. Heredity, volume 122, pages 261-275, March 2019.

Sarah Urbut, Gao Wang, Peter Carbonetto and Matthew Stephens. Flexible statistical methods for estimating and testing effects in genomic studies with multiple conditions. Nature Genetics, volume 51, pages 187-195, January 2019.
R package  code & data

Ana I. Hernandez Cordero, Peter Carbonetto, Gioia Riboni Verri, Jennifer Gregory, David Vandenbergh, Joe Gyekis, David Blizard and Arimantas Lionikas. Replication and discovery of musculoskeletal QTLs in LG/J and SM/J advanced intercross lines. Physiological Reports 6: e13561, February 2018.

Peter Carbonetto, Xiang Zhou and Matthew Stephens. varbvs: fast variable selection for large-scale regression. arXiv:1709.06597.
code

Eunjung Han*, Peter Carbonetto*, Ross Curtis, Yong Wang, Julie Granka, Jake Byrnes, Keith Noto, Amir Kermany, Natalie Myres, Mathew Barber, Kristin Rand, Shiya Song, Theodore Roman, Erin Battat, Eyal Elyashiv, Harendra Guturu, Eurie Hong, Kenneth Chahine and Catherine Ball. Clustering of 770 thousand genomes reveals post-colonial population structure of North America. Nature Communications 8: 14238, February 2017. (* indicates shared first authorship)

Laura Sittig, Peter Carbonetto, Kyle Engel, Kathleen Krauss, Camila Barrios-Camacho and Abraham Palmer. Genetic background limits generalizability of genotype-phenotype relationships. Neuron, volume 91, pages 1253-1259, September 2016.
perspective  code & data

Clarissa Parker*, Shyam Gopalakrishnan*, Peter Carbonetto*, Natalia Gonzales, Emily Leung, Yeonhee Park, Emmanuel Aryee, Joe Davis, David Blizard, Cheryl Ackert-Bicknell, Arimantas Lionikas, Jonathan Pritchard and Abraham Palmer. Genome-wide association study of behavioral, physiological and gene expression traits in outbred CFW mice. Nature Genetics, volume 48, pages 919-926, August 2016. (* indicates shared first authorship)
code  data

Laura Sittig, Peter Carbonetto, Kyle Engel, Kate Krauss and Abraham Palmer. Integration of genome-wide association and extant brain expression QTL identifies candidate genes influencing prepulse inhibition in inbred F1 mice. Genes, Brain and Behavior, volume 15, pages 260-270, February 2016.
code and data

Luisa Pallares, Peter Carbonetto, Shyam Gopalakrishnan, Clarissa Parker, Cheryl Ackert-Bicknell, Abraham Palmer and Diethard Tautz. Mapping of craniofacial traits in outbred mice identifies major developmental genes involved in shape determination. PLoS Genetics, volume 11, November 2015.
code and data

Clarissa Parker*, Peter Carbonetto*, Greta Sokoloff, Yeonhee Park, Mark Abney and Abraham Palmer. High-resolution genetic mapping of complex traits from a combined analysis of F2 and advanced intercross mice. Genetics, volume 198, pages 103-116, September 2014. (* indicates shared first authorship)
code

Peter Carbonetto, Riyan Cheng, Joseph Gyekis, Clarissa Parker, David Blizard, Abraham Palmer and Arimantas Lionikas. Discovery and refinement of muscle weight QTLs in B6 x D2 advanced intercross mice. Physiological Genomics, volume 46, pages 571-582, August 2014.
code

Peter Carbonetto and Matthew Stephens. Integrated enrichment analysis of variants and pathways in genome-wide association studies indicates central role for IL-2 signaling genes in type 1 diabetes, and cytokine signaling genes in Crohn's disease. PLoS Genetics, volume 9, October 2013.
Pubmed  HFSP article  code

Xiang Zhou, Peter Carbonetto and Matthew Stephens. Polygenic modeling with Bayesian sparse linear mixed models. PLoS Genetics, volume 9, February 2013.

Peter Carbonetto and Matthew Stephens. Scalable variational inference for Bayesian variable selection in regression, and its accuracy in genetic association studies. Bayesian Analysis, volume 7, March 2012, pages 73-108.
code

Matthew Hoffman, Peter Carbonetto, Nando de Freitas and Arnaud Doucet. Inference strategies for solving SMDPs. NIPS Workshop on Probabilistic Approaches for Robotics and Control, December 2009.

Peter Carbonetto, Matthew King and Firas Hamze. A stochastic approximation method for inference in probabilistic graphical models. Neural Information Processing Systems 23, December 2009.

Peter Carbonetto, Mark Schmidt and Nando de Freitas. An interior-point stochastic approximation method and an L1-regularized delta rule. Neural Information Processing Systems 22, December 2008. (Note: the proof of asymptotic convergence published as an appendix to the original paper has a major flaw; the convergence proof remains an open question.)
slides  code

Peter Carbonetto, Gyuri Dorkò, Cordelia Schmid, Hendrik Kück and Nando de Freitas. Learning to recognize objects with little supervision. International Journal of Computer Vision, volume 77, May 2008, pages 219-237.

Peter Carbonetto and Nando de Freitas. Conditional mean field. Neural Information Processing Systems 19, December 2006, pages 201-208.

Peter Carbonetto, Jacek Kisynski, Nando de Freitas and David Poole. Nonparametric Bayesian Logic. 21st Conference on Uncertainty in Artificial Intelligence, July 2005, pages 85-93. This revision corrects a mistake in Fig. 5.

Peter Carbonetto, Gyuri Dorkò and Cordelia Schmid. Bayesian learning for weakly supervised object classification. Technical Report, INRIA Rhône-Alpes, July 2004.

Peter Carbonetto, Nando de Freitas and Kobus Barnard.
A Statistical Model for General Contextual Object Recognition. 8th European Conference on Computer Vision, May 2004, part I, pages 350-362.^{1}

Hendrik Kück, Peter Carbonetto and Nando de Freitas. A Constrained Semi-Supervised Learning Approach to Data Association. 8th European Conference on Computer Vision, May 2004, part III, pages 1-12.^{1}

Peter Carbonetto and Nando de Freitas. Why can't José read? The problem of learning semantic associations in a robot environment. Human Language Technology Conference Workshop on Learning Word Meaning from Non-Linguistic Data, June 2003.

Peter Carbonetto, Nando de Freitas, Paul Gustafson and Natalie Thompson. Bayesian feature weighting for unsupervised learning, with application to object recognition. Workshop on Artificial Intelligence and Statistics, January 2003.

Patents

Eunjung Han, Ross E. Curtis and Peter Carbonetto. Discovering population structure from patterns of identity-by-descent, April 10, 2018, U.S. Patent 9,940,433.

Theses

New probabilistic inference algorithms that harness the strengths of variational and Monte Carlo methods. Ph.D. thesis, University of British Columbia, August 2009.

Unsupervised Statistical Models for General Object Recognition. Masters thesis, University of British Columbia, August 2003.

Code

Admixture. A simple EM implementation of the ADMIXTURE model in R, plus extensions.

Variational inference for Bayesian variable selection in MATLAB and R. Companion code to my Bayesian Analysis (2012) paper. Includes routines for computing variational estimates of posterior statistics, and demonstrates how to run the full variational inference procedure for Bayesian variable selection in linear and logistic regression.

MATLAB code for online L1 regularization. Companion code to my research paper appearing at the 2008 NIPS conference (see below for data). Includes MATLAB functions for learning linear regressors and classifiers subject to L1 regularization, which acts as a form of feature selection.
L1-regularized linear regression is also known in the statistics community as the LASSO. The software package includes implementations of both batch learning and online learning, in which the model parameters are rapidly adjusted at each iteration by looking at only a single training example. This software is licensed under the CC-GNU GPL version 2.0 or later.

Semi-supervised classification using a Bayesian kernel machine and data association constraints. Matlab implementation of the MCMC algorithms for simulating the Bayesian data association models described in the ECCV 2004 paper and the INRIA tech report (the data association model with hard group constraints), and in Learning to classify individuals based on group statistics by Kuck and de Freitas (data association with group statistics). For a much more stable implementation in C, go here.

Gaussian belief propagation. Matlab code for running belief propagation on Gaussian Markov random fields.

Image Translation. Matlab package for generic object recognition using statistical translation models. See my Masters thesis for more information.

Feature Weighting using Shrinkage Priors. Matlab code for running EM on a mixture of Gaussians with Bayesian feature weighting priors. Used for the paper Bayesian feature weighting for unsupervised learning.

Multiple dispatch. An implementation of multiple dispatch in Java using the ELIDE framework. See here for the project report.

Data

TREC2005. Spam filtering data in MATLAB format. Used to evaluate my online logistic regression learning algorithm in the paper An interior-point stochastic approximation method and an L1-regularized delta rule. This data set was originally created by Gordon Cormack and Thomas Lynam as part of the 2005 TREC Spam Filter Evaluation Tool Kit, and contains data from 92,189 emails. The open source software SpamBayes was used to extract features from the emails.
By downloading and using this data, you accept the terms of agreement for use of the 2005 TREC public spam corpus.

Corel. Object recognition data used for my Masters thesis and the paper A Statistical Model for General Contextual Object Recognition. Contains manual segmentations for evaluation and extracted features. The Image Translation package contains code for reading the data into Matlab.

Robomedia. Object recognition data used for the Why can't José read? paper. Contains manual segmentations for evaluation and extracted features. The Image Translation package contains code for reading the data into Matlab.

Face detection. Training data for robust object detection using the AdaBoost algorithm, as formalized by Viola and Jones. Includes Matlab code for reading the data. The project report is available here.

Other work

MATLAB interface for PARDISO. PARDISO is a publicly available software library for solving large, sparse linear systems. It is particularly useful as a subroutine for interior-point methods. I designed a small interface so that the PARDISO solver is easily incorporated into your MATLAB programs.

MATLAB class for limited-memory BFGS. This little MATLAB class I wrote encapsulates all the functionality of limited-memory quasi-Newton methods. It is particularly well-suited for solving constrained optimization problems; I illustrate how it is used within a primal-dual interior-point method for solving a constrained optimization problem that arises in maximum likelihood estimation. See here for more details on installing and using this software.

Intuition behind primal-dual interior-point methods for linear and quadratic programming. I'm quite aware of the fact that there are probably a hundred textbooks published every year that contain an introduction to linear programming, and there are many introductory presentations on interior-point methods. But I find they are all lacking in providing the key intuition.
So I've written a short 7-page document which I'm confident fills a tiny bit of the void.

MATLAB code for solving constrained, convex programs. I wrote a simple, easy-to-use MATLAB function for minimizing a convex objective subject to convex inequality constraints. It uses a primal-dual interior-point method with a suitable merit function for ensuring global convergence (which is useful when it is not desirable to compute the Newton step using the full Hessian of the objective).

MATLAB code for second-order cone programming. I also implemented a simple primal-dual interior-point method in MATLAB for solving second-order cone programs. At each iteration, the solver follows the Newton search direction and makes sure that the iterates remain feasible (they satisfy all the inequality constraints).

MATLAB interface for IPOPT. IPOPT is a fantastic, new open source software package written in C++ for solving optimization problems with nonlinear objectives and subject to nonlinear constraints. IPOPT is short for Interior Point Optimizer. I've developed an interface so that IPOPT can be easily called from the MATLAB programming environment. You can download the current version of IPOPT from the project website.

Notes on probabilistic decoding of parity check matrices. A review of the basic concepts behind low-density parity check codes, and how to come up with a simple and reasonable method for probabilistic decoding. Assumes some familiarity with ideas from statistical machine learning and optimization.

A MATLAB interface for L-BFGS-B, a solver for bound-constrained nonlinear optimization problems that uses quasi-Newton updates with a limited-memory approximation to the Hessian.

A non-rigorous derivation of a variational upper bound on the log-partition function in eight parts. This is a brief exposé of Martin Wainwright's derivation of a convex alternative to generalized belief propagation (resulting in the so-called tree-reweighted belief propagation algorithm).
The intent is to present the main mathematical steps in the derivation while keeping the presentation as "light" as possible.

Installing IPOPT on Mac OS X. Some of my experiences.

Creating, compiling and linking MATLAB executables (MEX files). A tutorial.

A Lesson in measure theory and change of variables. A technical note illustrating and explaining the subtleties in deriving a correct kernel for the snooker move used in population Monte Carlo.

Project webpage for Learning to recognize objects with little supervision.

How to partition and format an external hard drive for Mac OS X.

Note

^{1} © Springer-Verlag. Published in the Springer-Verlag Lecture Notes in Computer Science series.
