CS 6362 - Advanced Machine Learning




Assignment 1

Show all of your work in support of your answers.

Q1 (10 points)

Given a random vector $\mathbf{x} \in \mathbb{R}^d$, distributed as a multivariate normal $\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu},\mathbf{\Sigma})$ with mean $\boldsymbol{\mu} \in \mathbb{R}^d$ and covariance $\mathbf{\Sigma} \in \mathbb{R}^{d \times d}$, if we apply a linear transformation $A \in \mathbb{R}^{d \times d}$ to the random vector, what is its resulting distribution?
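An optional numerical sanity check (a sketch, not a substitute for the derivation): draw samples of $\mathbf{x}$, apply $A$, and compare the empirical mean and covariance of $A\mathbf{x}$ against your candidate answer. The candidates below use the standard result for a linear map of a Gaussian; the specific dimensions and sample count are arbitrary choices.

```python
import numpy as np

# Sanity check: sample x ~ N(mu, Sigma), transform by A, and compare the
# empirical moments of A x against the closed-form candidate moments.
rng = np.random.default_rng(0)
d = 3
mu = rng.normal(size=d)
B = rng.normal(size=(d, d))
Sigma = B @ B.T + d * np.eye(d)        # symmetric positive definite
A = rng.normal(size=(d, d))

x = rng.multivariate_normal(mu, Sigma, size=200_000)   # rows are samples
y = x @ A.T                                            # apply A to each sample

emp_mean = y.mean(axis=0)
emp_cov = np.cov(y, rowvar=False)

# Candidate answer: mean A mu, covariance A Sigma A^T
mean_err = np.abs(emp_mean - A @ mu).max()
cov_err = np.abs(emp_cov - A @ Sigma @ A.T).max()
```

Both errors shrink as the sample count grows, which is consistent with the transformed vector remaining Gaussian with the candidate moments.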

Q2 (10 points)

“Complete the square” in Eq. 2.7 (Rasmussen & Williams), namely, derive the posterior.
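A useful way to check a completed square numerically: the unnormalized log posterior and the quadratic form $-\frac{1}{2}(\mathbf{w}-\bar{\mathbf{w}})^\top A(\mathbf{w}-\bar{\mathbf{w}})$ must differ only by a constant in $\mathbf{w}$, so their difference evaluated at two arbitrary points should cancel. The optional sketch below uses the R&W conventions ($X \in \mathbb{R}^{d \times n}$ with inputs as columns, $A = \sigma_n^{-2}XX^\top + \Sigma_p^{-1}$, $\bar{\mathbf{w}} = \sigma_n^{-2}A^{-1}X\mathbf{y}$); the specific sizes and prior are arbitrary choices.

```python
import numpy as np

# Check that the completed square matches the unnormalized log posterior
# up to an additive constant independent of w.
rng = np.random.default_rng(1)
d, n = 4, 25
sigma_n = 0.3
X = rng.normal(size=(d, n))            # R&W convention: columns are inputs
Sigma_p = np.eye(d)                    # prior covariance on the weights
w_true = rng.normal(size=d)
y = X.T @ w_true + sigma_n * rng.normal(size=n)

A = X @ X.T / sigma_n**2 + np.linalg.inv(Sigma_p)
w_bar = np.linalg.solve(A, X @ y) / sigma_n**2

def log_unnorm_post(w):
    # log likelihood + log prior, dropping w-independent constants
    r = y - X.T @ w
    return -0.5 * (r @ r) / sigma_n**2 - 0.5 * w @ np.linalg.solve(Sigma_p, w)

def log_quadratic(w):
    # the completed square: -1/2 (w - w_bar)^T A (w - w_bar)
    dw = w - w_bar
    return -0.5 * dw @ A @ dw

w1, w2 = rng.normal(size=d), rng.normal(size=d)
gap = (log_unnorm_post(w1) - log_unnorm_post(w2)) - (log_quadratic(w1) - log_quadratic(w2))
```

If your derived $A$ and $\bar{\mathbf{w}}$ are correct, `gap` is zero up to floating-point error.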

Q3 (30 points)

Derive the predictive distribution in Eq. 2.9 (Rasmussen & Williams), namely, the mean and variance.
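Once you have a candidate mean and variance, an optional Monte Carlo check is to sample weights from the posterior, push each sample through a test input $\mathbf{x}_*$, and compare the empirical moments of $f_* = \mathbf{x}_*^\top \mathbf{w}$ against the closed-form expressions of Eq. 2.9. The sketch below reuses the R&W weight-space setup; the dimensions and sample count are arbitrary choices.

```python
import numpy as np

# Monte Carlo check of the predictive distribution f_* | x_*, X, y.
rng = np.random.default_rng(2)
d, n = 3, 40
sigma_n = 0.2
X = rng.normal(size=(d, n))
Sigma_p = np.eye(d)
w_true = rng.normal(size=d)
y = X.T @ w_true + sigma_n * rng.normal(size=n)

A = X @ X.T / sigma_n**2 + np.linalg.inv(Sigma_p)
A_inv = np.linalg.inv(A)
A_inv = (A_inv + A_inv.T) / 2          # symmetrize against rounding error
w_bar = A_inv @ (X @ y) / sigma_n**2

x_star = rng.normal(size=d)

# Sample weights from the posterior N(w_bar, A^{-1}) and evaluate f_*.
w_samples = rng.multivariate_normal(w_bar, A_inv, size=400_000)
f_star = w_samples @ x_star

# Eq. 2.9: mean x_*^T w_bar, variance x_*^T A^{-1} x_*
mean_err = abs(f_star.mean() - x_star @ w_bar)
var_err = abs(f_star.var() - x_star @ A_inv @ x_star)
```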

Q4 (20 points)

Derive the noise-free predictive distribution in Eq. 2.19 (Rasmussen & Williams).

Hint: write the conditional distribution in terms of the joint distribution and the prior, take the log, and use the matrix identities found in Appendix A.3 (Rasmussen & Williams). You may want to review the Schur complement.
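Before relying on the matrix inversion lemma from Appendix A.3, it can help to verify it numerically, i.e. that $(Z + UWV^\top)^{-1} = Z^{-1} - Z^{-1}U(W^{-1} + V^\top Z^{-1}U)^{-1}V^\top Z^{-1}$ holds for random well-conditioned matrices. The sizes below are arbitrary choices.

```python
import numpy as np

# Numerical check of the matrix inversion lemma (R&W Appendix A.3).
rng = np.random.default_rng(3)
n, m = 6, 2
Z = rng.normal(size=(n, n)) + n * np.eye(n)   # keep the matrices well conditioned
W = rng.normal(size=(m, m)) + m * np.eye(m)
U = rng.normal(size=(n, m))
V = rng.normal(size=(n, m))

lhs = np.linalg.inv(Z + U @ W @ V.T)
Zi = np.linalg.inv(Z)
rhs = Zi - Zi @ U @ np.linalg.inv(np.linalg.inv(W) + V.T @ Zi @ U) @ V.T @ Zi

lemma_err = np.abs(lhs - rhs).max()
```

The two sides agree to floating-point precision, which is the identity you will need when simplifying the conditional of the joint Gaussian.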

Q5 (30 points)

Exercise 4 in Ch. 2 of Rasmussen & Williams. The exercises can be found in Sec. 2.9 (end of chapter).

This result states that the variance of our predictions cannot increase as we are given more data, which is what we should expect. Moreover, your derivation should yield an incremental formula for computing the variance, which is very useful in certain situations, e.g., when observations arrive one at a time.
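As an optional numerical illustration of this monotonicity, you can compute the noise-free predictive variance at a test input with $n$ and then $n+1$ training inputs and confirm it does not increase. The sketch below assumes a squared-exponential kernel and a small jitter term for numerical stability; neither is specified by the exercise.

```python
import numpy as np

def rbf(a, b, ell=1.0):
    # squared-exponential kernel on scalar inputs (an assumed choice)
    return np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

rng = np.random.default_rng(4)
x_train = rng.uniform(-3, 3, size=8)
x_new = np.array([0.5])                       # the extra observation
x_star = np.array([0.0])                      # test input

def pred_var(X):
    # noise-free predictive variance: k(x*,x*) - k*^T K^{-1} k*
    K = rbf(X, X) + 1e-9 * np.eye(len(X))     # jitter for numerical stability
    k_star = rbf(X, x_star)
    return (rbf(x_star, x_star) - k_star.T @ np.linalg.solve(K, k_star)).item()

v_n = pred_var(x_train)
v_np1 = pred_var(np.concatenate([x_train, x_new]))
```

Here `v_np1` is at most `v_n` (up to floating-point error), matching the statement of the exercise.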