Chaining

Jed Rembold

April 12, 2023

Announcements

  • Only 1 more homework assignment!
    • The last homework assignment (HW5) will thus be due April 26
    • I’m working to have it up by the end of the week
  • HW4 Partner Reflection! (Here!)
  • Quiz 2 on Exoplanets and Galaxies at end of class today
  • I’m still working through feedback unfortunately

Recap

  • When it comes to fitting models, the best fitting parameters may tell an incomplete story
  • Often, we also want to know the error or distribution of our parameters
    • Methods like least-squares can do this, but only reliably for narrow use-cases
    • We’d like other methods to get at similar information
  • Random or “Monte Carlo” methods can help deal with high-dimensional data
    • Errors for this type of data otherwise increase exponentially for most numeric methods

Today

  • Fixing past mistakes
  • Sampling data using a simple Markov Chain
  • Applications to model fitting

Example 2

Now suppose we wanted to evaluate the area under the piece-wise function: \[ f(x) = \begin{cases}0 & x \leq 4;\\ 5x-20 & 4 < x \leq 5;\\ -5x+30 & 5 < x \leq 6;\\ 0 & x > 6;\end{cases} \] from \(x=0\) to \(x=10\), using Monte Carlo methods. Check into both uniform sampling and sampling targeted about the central peak.

  • Uniform sampling worked here, but our targeted sampling was struggling. What went wrong?

Issue #1

  • Make absolutely sure your probability distribution function (pdf) actually integrates to 1!
  • Previously, I had equated a triangle with height 0.5 to thus have slopes of 0.5, and -0.5
    • This is, of course, not necessarily true
  • A valid pdf: \[ p(x) = \begin{cases} 0, & x <= 3 \\ 0.25x - 0.75, & 3 < x <= 5 \\ -0.25x + 1.75, & 5 < x <= 7 \\ 0, & x > 7 \end{cases} \]

Issue #2

  • Trying to select from a continuous probability distribution discretely is not ideal

    • But it is what is immediately available though numpy
  • For a better option, roll your own selector:

    def sample_from_prob_dist(pdf, lower, upper, num):
        samples = []
        while len(samples) < num:
            x = np.random.uniform(lower, upper)
            p = pdf(x)
            if np.random.uniform(0,1) <= p:
                samples.append(x)
        return np.array(samples)

Returning to Example 2

Now suppose we wanted to evaluate the area under the piece-wise function: \[ f(x) = \begin{cases}0 & x \leq 4;\\ 5x-20 & 4 < x \leq 5;\\ -5x+30 & 5 < x \leq 6;\\ 0 & x > 6;\end{cases} \] from \(x=0\) to \(x=10\), using Monte Carlo methods. Sample from the probability distribution defined as: \[ p(x) = \begin{cases} 0, & x <= 3 \\ 0.25x - 0.75, & 3 < x <= 5 \\ -0.25x + 1.75, & 5 < x <= 7 \\ 0, & x > 7 \end{cases} \]

An Unknown PDF

  • What if we don’t know or have a PDF to work with?
  • The perfect PDF would look just like the function we are trying to integrate, but scaled to have area = 1
    • But if we could easily calculate the area, we wouldn’t be doing this integration in the first place
  • Instead we will construct a Markov Chain to semi-randomly “walk” it’s way around our initial function
    • Goal is to have it spend more of its time near “higher” parts of the function
    • Then we’ll look at the distribution of where it spent time

Markov Chains

  • A Markov chain is a model that represents a sequence of events where the probability of the next event is determined only by the current event
  • In our case, the point we are currently “at” will make it likely that our next point is nearby
  • Our process will then follow that of Metropolis-Hastings and is shown below. Note that you must keep track of the visited points along the way!
    • Choose a starting point
    • For some number of iterations:
      • Randomly choose a nearby point
      • Compare that point to your current point, to see if it would move you “higher” up the function
      • If above a random threshold, move to that point, otherwise stay where you are at

Choosing a Nearby Point

  • Randomly choosing a nearby point inherently pulls from a probability distribution centered on the current point

  • Most frequently used is a Gaussian probability distribution

    • Centered (mean) on the current point
    • Spread (std) can be tuned, but 1 is a reasonable starting value
  • Working in higher dimensions? A multi-variate Gaussian will still work!

    np.random.multivariate_normal(mean=VEC, cov=MAT)

Move onward or stay?

  • Once you have the new point, you need to decide whether to keep it or your current point
  • In Metropolis-Hastings, you:
    • Take the ratio of your sampled function evaluated at the new point over the sampled function evaluated at the current \[ \frac{f(x^\prime)}{f(x)} \] where \(x\) is your current position and \(x^\prime\) your proposed new position
    • Compare that ratio of a randomly generated number between 0 and 1
      • If bigger, keep the new position, making it your new current
      • If smaller, keep the old position (but still add it to your visited list again)

Backing out a PDF

  • The history of the random walk contains information about where the walker spent the most time
    • Due to how we constructed the walk, this should be near the maximum portions of our function
  • Creating a histogram or KDE plot will allow us to visualize the probability distribution.
  • We could use the histogram information to directly sample from the distribution

Quiz Time!

  • The remaining time is set aside for Quiz 2
// reveal.js plugins // Added plugins ,