Chaining

Jed Rembold

April 12, 2023

Announcements

Only 1 more homework assignment!
- The last homework assignment (HW5) will thus be due April 26
- I’m working to have it up by the end of the week
HW4 Partner Reflection! (Here!)
Quiz 2 on Exoplanets and Galaxies at end of class today
I’m still working through feedback unfortunately

Recap

When it comes to fitting models, the best fitting parameters may tell an incomplete story
Often, we also want to know the error or distribution of our parameters
- Methods like least-squares can do this, but only reliably for narrow use-cases
- We’d like other methods to get at similar information
Random or “Monte Carlo” methods can help deal with high-dimensional data
- Errors for this type of data otherwise increase exponentially for most numeric methods

Today

Fixing past mistakes
Sampling data using a simple Markov Chain
Applications to model fitting

Example 2

Now suppose we wanted to evaluate the area under the piece-wise function: \[ f(x) = \begin{cases}0 & x \leq 4;\\ 5x-20 & 4 < x \leq 5;\\ -5x+30 & 5 < x \leq 6;\\ 0 & x > 6;\end{cases} \] from \(x=0\) to \(x=10\), using Monte Carlo methods. Check into both uniform sampling and sampling targeted about the central peak.

Uniform sampling worked here, but our targeted sampling was struggling. What went wrong?

Issue #1

Make absolutely sure your probability distribution function (pdf) actually integrates to 1!
Previously, I had equated a triangle with height 0.5 to thus have slopes of 0.5, and -0.5
- This is, of course, not necessarily true
A valid pdf: \[ p(x) = \begin{cases} 0, & x <= 3 \\ 0.25x - 0.75, & 3 < x <= 5 \\ -0.25x + 1.75, & 5 < x <= 7 \\ 0, & x > 7 \end{cases} \]

Issue #2

Trying to select from a continuous probability distribution discretely is not ideal
- But it is what is immediately available though numpy

For a better option, roll your own selector:

def sample_from_prob_dist(pdf, lower, upper, num):
    samples = []
    while len(samples) < num:
        x = np.random.uniform(lower, upper)
        p = pdf(x)
        if np.random.uniform(0,1) <= p:
            samples.append(x)
    return np.array(samples)

Returning to Example 2

Now suppose we wanted to evaluate the area under the piece-wise function: \[ f(x) = \begin{cases}0 & x \leq 4;\\ 5x-20 & 4 < x \leq 5;\\ -5x+30 & 5 < x \leq 6;\\ 0 & x > 6;\end{cases} \] from \(x=0\) to \(x=10\), using Monte Carlo methods. Sample from the probability distribution defined as: \[ p(x) = \begin{cases} 0, & x <= 3 \\ 0.25x - 0.75, & 3 < x <= 5 \\ -0.25x + 1.75, & 5 < x <= 7 \\ 0, & x > 7 \end{cases} \]

An Unknown PDF

What if we don’t know or have a PDF to work with?
The perfect PDF would look just like the function we are trying to integrate, but scaled to have area = 1
- But if we could easily calculate the area, we wouldn’t be doing this integration in the first place
Instead we will construct a Markov Chain to semi-randomly “walk” it’s way around our initial function
- Goal is to have it spend more of its time near “higher” parts of the function
- Then we’ll look at the distribution of where it spent time

Markov Chains

A Markov chain is a model that represents a sequence of events where the probability of the next event is determined only by the current event
In our case, the point we are currently “at” will make it likely that our next point is nearby
Our process will then follow that of Metropolis-Hastings and is shown below. Note that you must keep track of the visited points along the way!
- Choose a starting point
- For some number of iterations:
  - Randomly choose a nearby point
  - Compare that point to your current point, to see if it would move you “higher” up the function
  - If above a random threshold, move to that point, otherwise stay where you are at

Choosing a Nearby Point

Randomly choosing a nearby point inherently pulls from a probability distribution centered on the current point
Most frequently used is a Gaussian probability distribution
- Centered (mean) on the current point
- Spread (std) can be tuned, but 1 is a reasonable starting value
Working in higher dimensions? A multi-variate Gaussian will still work!
```
np.random.multivariate_normal(mean=VEC, cov=MAT)
```

Move onward or stay?

Once you have the new point, you need to decide whether to keep it or your current point
In Metropolis-Hastings, you:
- Take the ratio of your sampled function evaluated at the new point over the sampled function evaluated at the current \[ \frac{f(x^\prime)}{f(x)} \] where \(x\) is your current position and \(x^\prime\) your proposed new position
- Compare that ratio of a randomly generated number between 0 and 1
  - If bigger, keep the new position, making it your new current
  - If smaller, keep the old position (but still add it to your visited list again)

Backing out a PDF

The history of the random walk contains information about where the walker spent the most time
- Due to how we constructed the walk, this should be near the maximum portions of our function
Creating a histogram or KDE plot will allow us to visualize the probability distribution.
We could use the histogram information to directly sample from the distribution

Quiz Time!

The remaining time is set aside for Quiz 2