## Learning to recognize objects with little supervision

This is the project webpage that accompanies the journal submission of the same name. See below for published papers relevant to this project. Here you will find the code and data used in the experiments. This webpage is maintained by Peter Carbonetto.

## People

The following people were involved in this project: Peter Carbonetto, Gyuri Dorkò, Cordelia Schmid, Nando de Freitas and Hendrik Kück.

## Data

We used six different databases to evaluate our proposed Bayesian model. Five of them were collected by other researchers: airplanes, motorbikes, wildcats, bicycles and people. All but the wildcats database are publicly available; it was created using the commercial Corel images database. We created the sixth and final data set ourselves. It consists of photos of parking lots and cars near the INRIA Rhône-Alpes research centre in Montbonnot, France. The INRIA car database is available for download here. We now describe how to read in and use it. In the root directory, there are two files. In addition, we have provided manual annotations of the scenes
which consist of boxes ("windows") that surround the
objects. The files describing the scene annotations are contained in
the data set. Each annotation is a line of the form

```
Object: x y width height
```

where (x,y) is the top-left corner of the box. Note that the coordinate system starts at 0, not 1 as in Matlab. Here is a function, loadobjectwindows, for loading the windows from an objects file into Matlab.

## Code

Our approach consists of three steps. First, we use a detector to extract a sparse set of a priori informative interest regions. Second, we train the Bayesian classification model using a Markov Chain Monte Carlo algorithm. Third, for object localization, we run inference on a conditional random field. The code for the three steps is described and made available for download below.

## Interest region detectors

The three detectors we employed were developed elsewhere. Binaries for the Harris-Laplace and Laplacian of Gaussian interest region detectors are available for download here. You can find the Kadir-Brady entropy detector at Timor Kadir's website. Once the regions are extracted, you still need a way of describing them in a form our model will understand. We use the Scale Invariant Feature Transform (SIFT) descriptor. With our parameter settings, each interest region ends up as a 128-dimensional feature vector. The same package as above can be used to compute the SIFT feature vectors.

## MCMC algorithm for Bayesian classification

We have a C implementation of the Markov Chain Monte Carlo (MCMC) algorithm for simulating the posterior of the Bayesian kernel machine classifier given some training data. It was tested in Linux. Here is how to compile and install the code. First, you need to install a copy of the GNU Scientific Library
(GSL). Our code was tested with GSL version 1.6. The libraries should be installed in a directory where the dynamic linker can find them; you may need to set LD_LIBRARY_PATH accordingly, e.g. in csh:

```
setenv LD_LIBRARY_PATH $HOME/gsl/lib
```

Next, you're ready to install the ssmcmc package. "ssmcmc" stands for "semi-supervised MCMC." As mentioned, you have to edit the file so that it points to the location of your GSL installation.
In order to train the model on some data, you need a few ingredients. First, you need some data in the proper format. We've made a sample training set available for download. In fact, this particular data set was used for many of our experiments. It was produced by extracting Harris-Laplace interest regions from the INRIA car data set (an average of 100 regions per image), then converting them to feature vectors using SIFT. The format of the data is as follows:

- The first line gives the number of documents (images).
- The second line gives the dimension of the feature vectors.
- After that there's a line for each document (image) in the data set. Each line has two numbers. The first gives the image caption. It can either be 1 (all the points in the document are positive, which almost never happens in our data sets), 2 (all the points in the document are negative, which happens when there is no instance of the object in the image), or 0 (the points are unlabeled, which happens when there is an instance of the object in the image). The second number says how many points (extracted interest regions) there are in the document.
- The last part of the data file, and the biggest, is the data points themselves. There is one feature vector for each line. The first number is the true label --- this is only used for evaluation purposes and is not available to the model. The rest of the numbers are the entries that make up the feature vector.
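To make the format above concrete, here is a short Python sketch for reading such a data file. The function name and the use of NumPy are our own illustration; this is not part of the released ssmcmc code:

```python
import numpy as np

def load_ssmcmc_data(path):
    """Load a data file in the ssmcmc format described above.

    Returns (captions, labels, features): captions[d] is the caption
    (0, 1 or 2) of document d; labels[d] holds the true labels of its
    points (for evaluation only); features[d] is an (npoints, dim)
    matrix of feature vectors.
    """
    with open(path) as f:
        lines = [ln.split() for ln in f if ln.strip()]
    ndocs = int(lines[0][0])        # first line: number of documents
    dim = int(lines[1][0])          # second line: feature dimension
    captions, counts = [], []
    for d in range(ndocs):          # one header line per document
        caption, npoints = map(int, lines[2 + d])
        captions.append(caption)
        counts.append(npoints)
    labels, features = [], []
    row = 2 + ndocs                 # data points start here
    for n in counts:
        block = lines[row:row + n]
        labels.append(np.array([int(float(b[0])) for b in block]))
        features.append(np.array([[float(x) for x in b[1:]] for b in block]))
        assert features[-1].shape == (n, dim)
        row += n
    return captions, labels, features
```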
Suppose you have your data set available. Next, you need to specify the parameters for the model in a text file. A sample parameters file looks like this:

```
Put comment here.
ns:      1000
metric:  fdist2
kernel:  kgaussian
lambda:  0.01
mu:      0.01
nu:      0.01
a:       1.0
b:       50.0
mua:     0.01
nua:     0.01
epsilon: 0.1
nc1:     30
nc2:     0
```

Most of the parameters above are explained in the journal paper submission and technical report. ns is the number of samples to generate. There is only one possible distance metric and kernel, but they must be specified anyway. lambda is the kernel scale parameter and epsilon is the stabilization term on the covariance prior.
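To illustrate the roles of lambda and epsilon, here is a Python sketch of a Gaussian kernel matrix with a stabilization term added to the diagonal. The exact parameterization used inside ssmcmc may differ; this is an illustration, not the released code:

```python
import numpy as np

def gaussian_kernel(X, lam=0.01, epsilon=0.1):
    """Gaussian kernel matrix for the rows of X, with stabilization.

    lam plays the role of the kernel scale parameter (lambda), and
    epsilon is the stabilization term added to the diagonal of the
    covariance prior. Both defaults match the sample parameters file.
    """
    sq = np.sum(X * X, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    K = np.exp(-lam * np.maximum(d2, 0.0))            # Gaussian kernel
    return K + epsilon * np.eye(X.shape[0])           # stabilize diagonal
```

Without the epsilon term, the kernel matrix can be numerically singular when two interest regions have nearly identical feature vectors.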
The parameters file may also include the entries m: 0.3 and chi: 400. Once the model parameters are specified, you can finally train the model with the following command:

```
ssmcmc -t=params -v carhartrain model
```
We're assuming here that the parameters file is called params. The result is saved in the file model. Once training is complete (it might take a little while), you can use the model to predict the labels of interest regions extracted from any image, including this sample test set, with the following command:

```
ssmcmc -p=labels -b=100 -v carhartest model
```
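Each line of the resulting labels file gives the probability of a positive and of a negative classification for one interest region. A small Python helper (our own, not part of the release) for loading and thresholding these predictions:

```python
import numpy as np

def load_predictions(path, threshold=0.5):
    """Load an ssmcmc labels file: two probabilities per line, one
    line per interest region. Returns the positive-class probabilities
    and a boolean mask of regions classified as object (> threshold)."""
    probs = np.loadtxt(path, ndmin=2)   # shape (num_regions, 2)
    p_pos = probs[:, 0]
    return p_pos, p_pos > threshold
```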
The option -b=100 specifies a burn-in of 100, so the first hundred samples are discarded. The resulting predictions, along with the level of confidence in those predictions, are saved in the file labels. Each line in the file has two numbers: the probability of a positive classification (the interest region belongs to the object) and the probability of a negative classification (the interest region is associated with the background). These two numbers should add up to one. Here are a couple of examples from that sample test set. The blue interest regions are more likely to belong to a car (greater than 0.5 chance).

## Conditional random field for localization

We implemented the CRF model for localization in Matlab. The function
is called crflocalize. Type help crflocalize in Matlab for more information. The function crflocalize requires the optimized Matlab implementation of the random schedule tree sampler, bgsfast. Once you have downloaded and unpacked the tar ball, follow these steps:

- Make sure that you have the GNU Scientific Library installed and that LD_LIBRARY_PATH is set correctly (see the instructions above).
- Edit the Makefile. You want the variable **GSLHOME** to point to the proper location.
- Type `make` in the code directory to compile the MEX files. To make sure that it is working, you can run the **testbgs** Matlab script. If you have problems compiling the C code, it may be because your MEX options are not set correctly. In particular, make sure you are using the g++ compiler (or another C++ compiler). Refer to the Mathworks support website for details. Note that this code has only been tested in Matlab version 7.0.1.
- Once you've managed to compile the bgsfast MEX program
successfully, use the
**addpath** function to include the **bgsfast** directory in your Matlab path.
Here are a couple of localization results for the cars database.

## Note

If you have any questions or problems to report about this project, do not hesitate to contact the main author.

## Publications

Hendrik Kück and Nando de Freitas. Learning to classify individuals based on group statistics. Conference on Uncertainty in Artificial Intelligence, July 2005.

Peter Carbonetto, Gyuri Dorkò and Cordelia Schmid. Bayesian learning for weakly supervised object classification. Technical report, INRIA Rhône-Alpes, July 2004.

Hendrik Kück, Peter Carbonetto and Nando de Freitas. A constrained semi-supervised learning approach to data association. European Conference on Computer Vision, May 2004.

This webpage was last updated on August 13, 2005.