Learning to recognize objects with little supervision
This is the project webpage accompanying the journal submission of the same name. See below for published papers relevant to this project. Here you will find the code and data used in the experiments. This webpage is maintained by Peter Carbonetto.
We used six databases to evaluate our proposed Bayesian model. Five of them were collected by other researchers: airplanes, motorbikes, wildcats, bicycles and people. All but the wildcats database are publicly available; the wildcats database cannot be distributed because it was created from the commercial Corel image collection. We created the sixth and final data set ourselves. It consists of photos of parking lots and cars near the INRIA Rhône-Alpes research centre in Montbonnot, France. The INRIA car database is available for download here. We now describe how to read in and use it.
In the root directory, there are two files, trainimages and testimages. Each one contains a list of image names, one per line. All the images are contained, appropriately enough, in the images folder, with the .jpg extension appended to the file names.
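For example, if a line of trainimages reads car-001 (a made-up name; the actual entries depend on the data set), the corresponding image is the file images/car-001.jpg.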
In addition, we have provided manual annotations of the scenes, which consist of boxes ("windows") that surround the objects. The files describing the scene annotations are contained in the objects subdirectory, and carry a .pgm.objects suffix. Each line describes a single window and looks like this:
Object: x y width height
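For instance, assuming x and y give the pixel coordinates of the window's top-left corner, a window of width 180 and height 75 located at (102, 57) would be recorded as follows (the numbers are made up for illustration):

Object: 102 57 180 75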
Code

Our approach consists of three steps. First, we use interest region detectors to extract a sparse set of a priori informative regions. Second, we train the Bayesian classification model using a Markov Chain Monte Carlo algorithm. Third, for object localization, we run inference on a conditional random field. The code for the three steps is described and made available for download below.
Interest region detectors

The three detectors we employed were developed elsewhere. Binaries for the Harris-Laplace and Laplacian of Gaussian interest region detectors are available for download here. You can find the Kadir-Brady entropy detector at Timor Kadir's website. Once the regions are extracted, you still need to describe them in a form our model understands. We use the Scale Invariant Feature Transform (SIFT) descriptor. With our parameter settings, each interest region ends up as a 128-dimensional feature vector. The same package as above can be used to compute the SIFT feature vectors.
MCMC algorithm for Bayesian classification
We have a C implementation of the Markov Chain Monte Carlo (MCMC) algorithm for simulating the posterior of the Bayesian kernel machine classifier given some training data. It was tested on Linux. Here is how to compile and install the code.
First, you need to install a copy of the GNU Scientific Library (GSL). Our code was tested with GSL version 1.6. The libraries should be installed in the directory $HOME/gsl/lib and the header files in $HOME/gsl/include, and the variable $HOME must be set correctly in the Makefile (see below). In addition, if you use the gcc compiler in Linux, you have to set the path to include the installed libraries with a command along these lines (assuming a Bourne-style shell such as bash):
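# Tell the dynamic linker where to find the GSL shared libraries.
export LD_LIBRARY_PATH=$HOME/gsl/lib:$LD_LIBRARY_PATH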
Next, you're ready to install the ssmcmc package ("ssmcmc" stands for "semi-supervised MCMC"). As mentioned, you have to edit the file Makefile and make sure that the HOME variable points to the right directory. Once you're in the ssmcmc directory, type make, and after a few seconds you should have a program called ssmcmc. Running the program without any input arguments brings up the help. Below we give a brief tutorial explaining how to use the program.
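Concretely, and assuming the package was unpacked into a directory named ssmcmc, the build amounts to:

# Compile the program, then check that it runs; with no input
# arguments, ssmcmc simply prints its help message.
cd ssmcmc
make
./ssmcmc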
In order to train the model on some data, you need a few ingredients. First, you need some data in the proper format. We've made a sample training set available for download. In fact, this particular data set was used for many of our experiments. It was produced by extracting Harris-Laplace interest regions from the INRIA car data set (an average of 100 regions per image) then converting them to feature vectors using SIFT. The format of the data is as follows:
Suppose you have your data set available. Next, you need to specify the parameters for the model in a text file. A sample parameters file looks like this:
Put comment here.
ns: 1000
metric: fdist2
kernel: kgaussian
lambda: 0.01
mu: 0.01
nu: 0.01
a: 1.0
b: 50.0
mua: 0.01
nua: 0.01
epsilon: 0.1
nc1: 30
nc2: 0

Most of the parameters above are explained in the journal paper submission and technical report. ns is the number of samples to generate. There is only one possible distance metric and kernel, but they must be specified anyway. lambda is the kernel scale parameter and epsilon is the stabilization term on the covariance prior.
The parameters nc1 and nc2 specify the minimum number of positive labels and the minimum number of negative labels in a training image, respectively, so this sample file specifies a constrained data association model. In this case, the constraints require that at least 30 interest regions in each training image be labeled as positive. Alternatively, one can specify the data association problem using group statistics, in which case the last two lines are replaced by something like
m: 0.3
chi: 400
Once the model parameters are specified, you can finally train the model with a command along the following lines (the subcommand and argument order shown are illustrative assumptions; run ssmcmc without arguments for the actual syntax):
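# Hypothetical invocation: train on the feature vectors in the file
# data using the parameters in params, saving the result to model.
./ssmcmc train params data model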
We're assuming here that the parameters file is called params. The result is saved in the file model. Once training is complete (it might take a little while), you can use the model to predict the labels of interest regions extracted from any image, including this sample test set, with a command along these lines (again, the precise syntax is given by the program's help):
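# Hypothetical invocation: predict labels for the regions in testdata
# using the trained model, with a burn-in of 100 samples, and save the
# predictions to the file labels.
./ssmcmc predict model testdata labels 100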
This specifies a burn-in of 100, so the first hundred samples are discarded. The resulting predictions, along with the level of confidence in those predictions, are saved in the file labels. Each line in the file has two numbers: the probability of a positive classification (the interest region belongs to the object) and the probability of a negative classification (the interest region is associated with the background). These two numbers should add up to one. Here are a couple of examples from that sample test set.
The blue interest regions are more likely to belong to a car (greater than 0.5 chance).
Conditional random field for localization
We implemented the CRF model for localization in Matlab. The function is called crflocalize. Type help crflocalize at the Matlab command line to get instructions on how to use it.
The function crflocalize requires bgsfast, an optimized Matlab implementation of the random schedule tree sampler. Once you have downloaded and unpacked the tar ball, follow these steps:
Here are a couple of localization results for the cars database.
Note

If you have any questions or problems to report about this project, do not hesitate to contact the main author.
Hendrik Kück and Nando de Freitas. Learning to classify individuals based on group statistics. Conference on Uncertainty in Artificial Intelligence, July 2005.
Peter Carbonetto, Gyuri Dorkó and Cordelia Schmid. Bayesian learning for weakly supervised object classification. Technical Report, INRIA Rhône-Alpes, July 2004.
Hendrik Kück, Peter Carbonetto and Nando de Freitas. A constrained semi-supervised learning approach to data association. European Conference on Computer Vision, May 2004.
This webpage was last updated on August 13, 2005.