Homework 11

For the problem below, the expectation is that you submit a standalone HTML file (any images should be embedded) back to GitHub. Data for the problem is provided in the starting repository in the /data folder. Since this is the second homework in this unit, groups have already been created, so just join with your group when you accept the assignment.

Accept Assignment

Problem: Walking a Likelihood

In practice, we most often use MCMC methods to examine or estimate our parameter uncertainties from a model fit. In this problem then, we will apply Baye’s theorem alongside MCMC methods to determine the uncertainty of several model parameters. We will be using a well known model here: Planck’s law, and comparing the results we get from our previous curve_fit method to these new MCMC results.

Part A

Like in Homework 2, you are given a spectrum of a star to start this problem, with the wavelength values given in nanometers and the flux values in uncalibrated intensity measurements (no units). To get started, and to form a point of comparison, you want to fit Planck’s law to this data using curve_fit, extracting the parameters for emissivity and temperature. Importantly, curve_fit returns to you both the parameter list and the covariance matrix. This matrix, for the two parameter case, has 4 entries corresponding to: \left[\begin{matrix} \sigma_1^2 & \sigma_1 \sigma_2 \\ \sigma_2 \sigma_1 & \sigma_2^2 \end{matrix}\right] where \sigma is the standard deviation (uncertainty) of the subscripted parameter. The ordering of the parameters will follow whatever ordering you used when entering them into curve_fit. Of particular note here then are the diagonal terms, which describe the variance (standard deviation squared) of your model parameters. Report both the fit model parameters and their corresponding uncertainties.

Part B

Now we want to apply MCMC methods to see how our parameter variances might compare. As such, we need to define our prior, likelihood, and probability distribution function. Or rather, following convention on these, we should define the log version of these functions. Perhaps the most confusing of these would likely be the likelihood function, which depends on the associated error in the dependent variable (here, the flux). However, we don’t know the uncertainty in the fluxes here! So what are we to do?

The common method of dealing with this is to have MCMC actually predict the associated uncertainty in the flux along with the other parameters. So instead of just having 2 parameters that you’ll track (emissivity and temperature), you’ll have 3: emissivity, temperature, and flux error. Everything else will follow as we showed in class, you just need to keep in mind that you have 3 dimensions that you are working with. Note that because we are specifically walking over the flux uncertainty here now, we must keep the constant 2\pi\sigma^2 term in our likelihood. We can’t ignore it like it can in some circumstances (then there is a fixed flux error).

For defining the prior, reasonable starting points would be that the emissivity has to be between 0 and 1, that the temperature has to be greater than 0, and that the flux error has to be positive. The PDF will look essentially like it did in all our examples, since that just depends on the likelihood and priors.

For doing the actual sampling, your sampler could look very similar to the one you wrote for Problem 1, but with a few small tweaks:

You are wanting to work with log functions now, so you should change your if statement mathematics accordingly
Your parameters have very different orders of magnitude. So proposing a new value that varies equally in each direction will be non-ideal. Instead, you should define a covariance matrix for your random multivariate gaussian that scales differently in each direction. You only need to provide values on the diagonal, all the others will be 0. Keep in mind that the square root of these diagonal numbers is essentially the standard deviation that is used in that dimension when proposing a new value each time. Make these two tiny and your walker will barely move. Make them too large and it will be extremely sporadic.

Choose starting values for your walker very close to the parameter fits you got in Part A. For the flux error start around 2500. Run your walker for a number of iterations until the parameters have converged for a long while, just randomly bouncing around a singular value. The longer you run your walker, generally the “nicer” your final histograms will look (in Part C). Plot the values of each of your parameters over the number of iterations to show this convergence, and identify your “burn-in” point.

Warning

It is perhaps important to reiterate that MCMC is not meant to be an optimizer or fitting tool! Because the walkers tend to trend towards “higher” areas of the probability distribution, it can be used in this way, but it is woefully inefficient at doing so. Generally you want to start your MCMC methods at least close to your optimized solution, and then just let them wander around the vicinity of that peak solution, thereby exploring the parameter spread. This means likely using other methods to optimize the fit, and then resorting to MCMC for the uncertainty analysis. Different tools for different tasks.

Part C

Now you get to analyze your results! Slice out only the data from your walker past the “burn-in” point that you determined above. Use these data to create histograms of each of the parameters. Additionally, compute the median and standard deviation of each of these. Then reflect on each of the following:

How did your median values compare to the best fits that you got from curve_fit?
How did the computed standard deviations (or uncertainties) compare to what you got from curve_fit?
You also found a value for the flux error in this method, which was not something curve_fit could give or account for. Comparing this value to the original flux measurements and looking at the plot of flux vs wavelength, does this value seem reasonable?

Tip

While not required here, you can, and often would want, to create 2D histograms of each of the combinations of your parameters as well. If you were to do so here, the emissivity vs temperature histogram can be interesting, in that it will show a long and thin gaussian that is rotated at an angle. If you look back at the covariant matrix you got out of curve_fit, you might notice that the off diagonal terms are not 0. Those terms being non-zero implies that the gaussian is rotated, and thus would agree with what you see in the 2D histogram here.

Part D

Now that we know the uncertainty in some of our model parameters, the logical follow-up question can be to ask how sensitive our model was to these parameters. To do so, randomly pick 100 parameter combinations from your long list of burned-in parameters. Loop over them, computing your model results for each, and plot them on the same plot with a high degree of transparency. Include your original data on the plot as well for comparison. Comment about whether the variation in the parameter results is significantly affecting the model, and whether this influences your trust in the model results.