CS 6362 - Advanced Machine Learning




Assignment 4

In this assignment, you will implement sparse variational Gaussian processes for regression. Please go here for the data and starter code.

Data

The task for this assignment is to build a regression model for predicting housing prices.

The data you are provided consists of a set of features describing houses that are for sale. Namely, the attributes consist of:

The housing prices themselves are in units of 100,000.

A kernel for a house (20 points)

You should first set up a GP kernel for the housing data.

You may wish to use a squared-exponential kernel, or a dot-product kernel, depending on the attribute. You may also wish to exclude certain attributes that you do not think would be useful. You are free to make appropriate decisions as you see fit.
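As a rough illustration of mixing kernel types across attributes, here is a minimal NumPy sketch. It assumes the features arrive as a 2-D array with one column per attribute; the function names, the column groupings `se_cols` and `dot_cols`, and the `params` dictionary are all hypothetical and do not come from the starter code. Summing the two kernels is only one possible way to combine them (a product is another).

```python
import numpy as np

def squared_exponential(X1, X2, lengthscale=1.0, variance=1.0):
    """k(x, x') = variance * exp(-||(x - x') / lengthscale||^2 / 2).

    `lengthscale` may be a scalar or one value per column.
    """
    A, B = X1 / lengthscale, X2 / lengthscale
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0))  # clip tiny negatives from round-off

def dot_product(X1, X2, variance=1.0):
    """k(x, x') = variance * <x, x'>."""
    return variance * (X1 @ X2.T)

def house_kernel(X1, X2, se_cols, dot_cols, params):
    """Sum of an SE kernel on some columns and a dot-product kernel on others.

    Columns listed in neither group are excluded from the kernel entirely.
    """
    K = squared_exponential(X1[:, se_cols], X2[:, se_cols],
                            params["se_lengthscale"], params["se_variance"])
    K = K + dot_product(X1[:, dot_cols], X2[:, dot_cols], params["dot_variance"])
    return K
```

A sum of valid kernels is itself a valid kernel, so the resulting matrix stays symmetric and positive semi-definite whichever columns you assign to each part.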

The kernel's hyperparameters are critical. For this assignment, you will select them manually, which you will likely find is not easy; the next assignment addresses this problem directly. For now, settle on one set of hyperparameters. It may help to inspect the statistics of the individual fields, e.g. mean, standard deviation, min, and max.
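One possible way to inspect the field statistics is sketched below. It assumes the features are already loaded into a NumPy array; the helper names and the field names passed in are placeholders, not the actual attributes of the data set. Standardizing each column is an optional extra step, shown here because it puts every field on a comparable scale, which can make a first guess at length scales easier.

```python
import numpy as np

def summarize_fields(X, names):
    """Return mean, standard deviation, min, and max for each column of X."""
    rows = []
    for j, name in enumerate(names):
        col = X[:, j]
        rows.append({"field": name,
                     "mean": float(col.mean()),
                     "std": float(col.std()),
                     "min": float(col.min()),
                     "max": float(col.max())})
    return rows

def standardize(X):
    """Shift each column to zero mean and scale it to unit standard deviation."""
    mu = X.mean(axis=0)
    sd = X.std(axis=0)
    sd = np.where(sd > 0, sd, 1.0)  # leave constant columns unscaled
    return (X - mu) / sd, mu, sd

def print_summary(rows):
    """Print the statistics as a plain-text table."""
    print(f"{'field':<12}{'mean':>12}{'std':>12}{'min':>12}{'max':>12}")
    for r in rows:
        print(f"{r['field']:<12}{r['mean']:>12.3f}{r['std']:>12.3f}"
              f"{r['min']:>12.3f}{r['max']:>12.3f}")
```

If you standardize the training features, remember to reuse the same `mu` and `sd` when transforming the test features.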

Sparse, Variational GP (60 points)

You will implement a sparse GP using the variational approximation, as detailed in Titsias and Hensman et al., and covered in class. This requires the following:
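For orientation, the following is a minimal sketch of the collapsed variational bound and the matching predictive equations in the style of Titsias, written with NumPy only. It assumes a Gaussian likelihood with fixed noise variance `noise_var`, a kernel supplied as two callables (`kernel(A, B)` for the full matrix and `kernel_diag(A)` for its diagonal), and inducing inputs `Z` chosen in advance. All function names are illustrative and are not taken from the starter code. The uncollapsed, minibatch-friendly form associated with Hensman et al. is not shown.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq / lengthscale**2)

def rbf_diag(A, variance=1.0):
    return variance * np.ones(A.shape[0])

def sgpr_fit(X, y, Z, kernel, kernel_diag, noise_var, jitter=1e-6):
    """Collapsed lower bound on log p(y), plus factors reused for prediction."""
    n, m = X.shape[0], Z.shape[0]
    sigma = np.sqrt(noise_var)
    L = np.linalg.cholesky(kernel(Z, Z) + jitter * np.eye(m))  # Kmm = L L^T
    A = np.linalg.solve(L, kernel(Z, X)) / sigma               # m x n
    LB = np.linalg.cholesky(np.eye(m) + A @ A.T)               # B = I + A A^T
    c = np.linalg.solve(LB, A @ y) / sigma
    bound = (-0.5 * n * np.log(2.0 * np.pi * noise_var)
             - np.sum(np.log(np.diag(LB)))             # -0.5 * log det B
             - 0.5 * (y @ y) / noise_var + 0.5 * (c @ c)
             - 0.5 * np.sum(kernel_diag(X)) / noise_var  # trace correction ...
             + 0.5 * np.sum(A * A))                      # ... -tr(Knn - Qnn) / (2 noise_var)
    return bound, (L, LB, c)

def sgpr_predict(Xs, Z, kernel, kernel_diag, state):
    """Predictive mean and variance of the latent function at Xs."""
    L, LB, c = state
    t1 = np.linalg.solve(L, kernel(Z, Xs))
    t2 = np.linalg.solve(LB, t1)
    mean = t2.T @ c
    var = kernel_diag(Xs) - np.sum(t1**2, axis=0) + np.sum(t2**2, axis=0)
    return mean, var  # add noise_var to var for predictions of noisy targets

def exact_log_marginal(X, y, kernel, noise_var):
    """Full-GP log marginal likelihood, O(n^3); useful as a sanity check."""
    n = X.shape[0]
    Lc = np.linalg.cholesky(kernel(X, X) + noise_var * np.eye(n))
    alpha = np.linalg.solve(Lc, y)
    return -0.5 * (alpha @ alpha) - np.sum(np.log(np.diag(Lc))) - 0.5 * n * np.log(2.0 * np.pi)
```

A useful check on a small subset of the data: the bound should never exceed the exact log marginal likelihood, and it should come very close to it when the inducing inputs are the training inputs themselves. `np.linalg.solve` is used here for brevity; a dedicated triangular solver would be the more efficient choice for the Cholesky factors.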

Analysis (20 points)

For a set of hyperparameters that you have decided on, you should run experiments to see the effect of the sparse GP. Namely, you should put together a written document that covers the following:

You may report these results in tables or, optionally, as a heatmap. The choice is up to you, but your analysis should let us see the gains of the sparse approach, the point at which returns diminish, and the relationship between test error and the marginal likelihood.
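To give a sense of how such an experiment might be organized, here is a small self-contained sketch that sweeps the number of inducing points and records the variational bound, test RMSE, and wall-clock time for each setting. Everything in it is an assumption made for illustration: the RBF kernel with unit hyperparameters, the choice of the first `m` training points as inducing inputs, the function names, and the synthetic sine data in `demo`, which merely stands in for the housing data.

```python
import time
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq / lengthscale**2)

def sgpr(X, y, Z, Xs, noise_var, kernel_variance=1.0, jitter=1e-6):
    """Collapsed variational bound and predictive mean at Xs."""
    n, m = X.shape[0], Z.shape[0]
    sigma = np.sqrt(noise_var)
    L = np.linalg.cholesky(rbf(Z, Z, variance=kernel_variance) + jitter * np.eye(m))
    A = np.linalg.solve(L, rbf(Z, X, variance=kernel_variance)) / sigma
    LB = np.linalg.cholesky(np.eye(m) + A @ A.T)
    c = np.linalg.solve(LB, A @ y) / sigma
    trace_knn = n * kernel_variance  # the RBF diagonal is constant
    bound = (-0.5 * n * np.log(2.0 * np.pi * noise_var) - np.sum(np.log(np.diag(LB)))
             - 0.5 * (y @ y) / noise_var + 0.5 * (c @ c)
             - 0.5 * trace_knn / noise_var + 0.5 * np.sum(A * A))
    t1 = np.linalg.solve(L, rbf(Z, Xs, variance=kernel_variance))
    mean = np.linalg.solve(LB, t1).T @ c
    return bound, mean

def sweep(X, y, Xs, ys, sizes, noise_var):
    """Run the sparse GP once per inducing-set size and collect the metrics."""
    results = []
    for m in sizes:
        Z = X[:m]  # nested inducing sets: first m training points (an assumption)
        start = time.perf_counter()
        bound, mean = sgpr(X, y, Z, Xs, noise_var)
        elapsed = time.perf_counter() - start
        rmse = float(np.sqrt(np.mean((mean - ys) ** 2)))
        results.append({"m": m, "bound": float(bound), "rmse": rmse, "seconds": elapsed})
    return results

def demo():
    rng = np.random.default_rng(0)  # synthetic placeholder data, not the housing set
    X = rng.uniform(-3, 3, size=(500, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
    Xs = rng.uniform(-3, 3, size=(100, 1))
    ys = np.sin(Xs[:, 0]) + 0.1 * rng.normal(size=100)
    print(f"{'m':>6}{'bound':>14}{'test RMSE':>12}{'seconds':>10}")
    for r in sweep(X, y, Xs, ys, [5, 10, 20, 50, 100], noise_var=0.01):
        print(f"{r['m']:>6}{r['bound']:>14.2f}{r['rmse']:>12.4f}{r['seconds']:>10.4f}")

if __name__ == "__main__":
    demo()
```

Because the inducing sets are nested and the hyperparameters are held fixed, the bound should not decrease as `m` grows, which makes the table easy to read alongside the test error and the timings.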