Homework 10

For the problem below, the expectation is that you submit a standalone HTML file (any images should be embedded) back to GitHub. There is no provided data for this problem. Since this is the first assignment in a new unit, you’ll need to form your new groups when you accept the assignment. The first person to accept the assignment should form an easily identifiable group name (likely using the names of your group members). Others in the group should also accept the assignment, and then they can just join the existing group.

Accept Assignment


Problem: Walking a Function

In this problem you will construct a Markov Chain Monte Carlo (MCMC) sampler and use it to sample from a given function. While eventually you will be free to use an MCMC sampler from a provided library, for this assignment you should plan to write your own, as we demonstrated in class. The function you will be sampling from is the rotated 2D Gaussian given by: f(x,y) = \exp\left(-0.145(x-50)^2 + 0.21(x-50)(y-10) - 0.145(y-10)^2\right)

Important

Note that there is no model fitting or anything similar happening here. We are using MCMC purely as a method to “sample” this specific function.

Part A

Visualize this function so that you know roughly what the output of your sampling should look like. There are a variety of ways to visualize a function of two independent variables, including but not limited to: a heatmap, a contour plot, or a 3D surface plot. If it helps you “center” your visual, most of the interesting behavior occurs near the point (50,10), where f reaches its maximum value of 1.
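As a starting point for any of these plots, the function first has to be evaluated on a grid. The following is a minimal sketch using only the Python standard library; the names `f` and `evaluate_grid`, the plotting window, and the grid resolution are illustrative assumptions rather than required choices.

```python
import math


def f(x, y):
    """The target function from the problem statement."""
    dx, dy = x - 50.0, y - 10.0
    return math.exp(-0.145 * dx**2 + 0.21 * dx * dy - 0.145 * dy**2)


def evaluate_grid(x_min=40.0, x_max=60.0, y_min=0.0, y_max=20.0, n=81):
    """Evaluate f on an n-by-n grid (window and resolution are assumed values)."""
    xs = [x_min + i * (x_max - x_min) / (n - 1) for i in range(n)]
    ys = [y_min + j * (y_max - y_min) / (n - 1) for j in range(n)]
    # Rows are indexed by y and columns by x, the layout that most
    # heatmap and contour routines expect.
    zs = [[f(x, y) for x in xs] for y in ys]
    return xs, ys, zs


xs, ys, zs = evaluate_grid()
```

The resulting `xs`, `ys`, and `zs` can then be handed to whatever plotting tool you prefer, for example Matplotlib's `contourf` or `pcolormesh`.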

Part B

Write an MCMC sampler function using the Metropolis-Hastings algorithm which will sample from the desired function. In this case it makes sense to take proposal steps with standard deviations \sigma_x=\sigma_y=1. Set your starting point as the origin (0,0), and run the sampler for 5000 steps. Visualize the path that the walker took across those 5000 steps.
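For orientation, here is one possible sketch of such a sampler, not necessarily the version shown in class. It assumes a symmetric Gaussian proposal (so the Hastings correction cancels) and compares log values of f, since f at the origin is roughly e^{-272} and ratios of such small numbers are easier to handle in log space. The names `log_f` and `metropolis_hastings`, and the `seed` argument, are hypothetical.

```python
import math
import random


def log_f(x, y):
    """Natural log of the target function f(x, y)."""
    dx, dy = x - 50.0, y - 10.0
    return -0.145 * dx**2 + 0.21 * dx * dy - 0.145 * dy**2


def metropolis_hastings(n_steps=5000, start=(0.0, 0.0), sigma=(1.0, 1.0), seed=None):
    """Return the walker's path and the fraction of accepted proposals."""
    rng = random.Random(seed)
    x, y = start
    current = log_f(x, y)
    path = [(x, y)]
    n_accepted = 0
    for _ in range(n_steps):
        # Assumed proposal: independent Gaussian steps centred on the
        # current position, with standard deviations sigma_x and sigma_y.
        x_new = x + rng.gauss(0.0, sigma[0])
        y_new = y + rng.gauss(0.0, sigma[1])
        proposed = log_f(x_new, y_new)
        # Accept with probability min(1, f_new / f_old), tested in log space.
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < proposed - current:
            x, y, current = x_new, y_new, proposed
            n_accepted += 1
        # A rejected proposal repeats the current position in the chain.
        path.append((x, y))
    return path, n_accepted / n_steps


path, acceptance_rate = metropolis_hastings(seed=0)
```

The returned `path` holds the starting point plus one entry per step, so plotting its x values against its y values shows the walk requested above.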

Part C

Look at the trend in your sampled parameters (x and y) over the iterations to determine at what point your walker had stabilized near the area of maximum f. This would be your burn-in point, so select only those iterations that came afterwards to work with going forward. Make a 2D histogram or KDE plot of these sampled parameters, and determine a method to visually compare these sampled counts back to your original function f. How well do the distributions match? Is one a scalar multiple of the other? (Looking at a 1D “slice” through the 2D distribution might make it easier to compare things here.)
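One possible way to put the histogram and f on the same scale is sketched below. It assumes the chain has converged and uses the standard Gaussian integral: writing the exponent as -v^T A v with A = [[0.145, -0.105], [-0.105, 0.145]], the integral of f over the plane is \pi/\sqrt{\det A} = 10\pi. The helper `expected_count` and its arguments are hypothetical names for illustration.

```python
import math


def f(x, y):
    """The target function from the problem statement."""
    dx, dy = x - 50.0, y - 10.0
    return math.exp(-0.145 * dx**2 + 0.21 * dx * dy - 0.145 * dy**2)


# Integral of f over the whole plane: pi / sqrt(det A), where
# det A = 0.145^2 - 0.105^2 = 0.01, giving 10 * pi.
F_INTEGRAL = math.pi / math.sqrt(0.145**2 - 0.105**2)


def expected_count(x, y, n_samples, bin_width_x, bin_width_y):
    """Approximate count expected in a histogram bin centred at (x, y).

    Assumes the n_samples post-burn-in points are distributed in
    proportion to f and that the bin is small compared with the
    width of the peak.
    """
    bin_area = bin_width_x * bin_width_y
    return n_samples * bin_area * f(x, y) / F_INTEGRAL


peak = expected_count(50.0, 10.0, n_samples=4000, bin_width_x=0.5, bin_width_y=0.5)
```

Under these assumptions the histogram counts are f multiplied by a single constant, n_samples × bin area / (10\pi). With 4000 retained samples and 0.5 × 0.5 bins, that comes to roughly 32 counts in the bin at the peak.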