---
title: "Chaining"
author: Jed Rembold
date: April 12, 2023
slideNumber: true
theme: tokyo-night-light
highlightjs-theme: tokyo-night-light
width: 1920
height: 1080
transition: fade
p5js: true
---


## Announcements
- Only 1 more homework assignment!
  - The last homework assignment (HW5) will thus be due April 26
  - I'm working to have it up by the end of the week
- HW4 Partner Reflection! ([Here!](https://docs.google.com/forms/d/e/1FAIpQLSfvLl2EobtSOSnv2SBEeqNQvc7IklGVXmi5__iK3DQIrexx8w/viewform?usp=sf_link))
- Quiz 2 on Exoplanets and Galaxies at end of class today
- I'm still working through feedback unfortunately


## Recap
- When it comes to fitting models, the best fitting parameters may tell an incomplete story
- Often, we also want to know the error or distribution of our parameters
  - Methods like least-squares can do this, but only reliably for narrow use-cases
  - We'd like other methods to get at similar information
- Random or "Monte Carlo" methods can help deal with high-dimensional data
  - Errors for this type of data otherwise increase exponentially for most numeric methods


## Today
- Fixing past mistakes
- Sampling data using a simple Markov Chain
- Applications to model fitting


## Example 2
:::{.block name=Previously...}
Now suppose we wanted to evaluate the area under the piece-wise function:
$$ f(x) = \begin{cases}0 & x \leq 4;\\ 5x-20 & 4 < x \leq 5;\\ -5x+30 & 5 < x \leq 6;\\ 0 & x > 6;\end{cases} $$
from $x=0$ to $x=10$, using Monte Carlo methods. Check into both uniform sampling and sampling targeted about the central peak.
:::
- Uniform sampling worked here, but our targeted sampling was struggling. What went wrong?


## Issue #1
- **Make absolutely sure your probability distribution function (pdf) actually integrates to 1!**
- Previously, I had equated a triangle with height 0.5 to thus have slopes of 0.5, and -0.5
  - This is, of course, not necessarily true
- A valid pdf:
  $$
  p(x) = \begin{cases}
            0, & x <= 3 \\
            0.25x - 0.75, & 3 < x <= 5 \\
            -0.25x + 1.75, & 5 < x <= 7 \\
            0, & x > 7
          \end{cases}
  $$

## Issue #2
- Trying to select from a continuous probability distribution discretely is not ideal
  - But it is what is immediately available though numpy
- For a better option, roll your own selector:
  ```{.python style='line-height:1.2em'}
  def sample_from_prob_dist(pdf, lower, upper, num):
      samples = []
      while len(samples) < num:
          x = np.random.uniform(lower, upper)
          p = pdf(x)
          if np.random.uniform(0,1) <= p:
              samples.append(x)
      return np.array(samples)
  ```

## Returning to Example 2
Now suppose we wanted to evaluate the area under the piece-wise function:
$$ f(x) = \begin{cases}0 & x \leq 4;\\ 5x-20 & 4 < x \leq 5;\\ -5x+30 & 5 < x \leq 6;\\ 0 & x > 6;\end{cases} $$
from $x=0$ to $x=10$, using Monte Carlo methods. Sample from the probability distribution defined as:
  $$
  p(x) = \begin{cases}
            0, & x <= 3 \\
            0.25x - 0.75, & 3 < x <= 5 \\
            -0.25x + 1.75, & 5 < x <= 7 \\
            0, & x > 7
          \end{cases}
  $$

## An Unknown PDF
- What if we don't know or have a PDF to work with?
- The _perfect_ PDF would look just like the function we are trying to integrate, but scaled to have area = 1
  - But if we could easily calculate the area, we wouldn't be doing this integration in the first place
- Instead we will construct a Markov Chain to semi-randomly "walk" it's way around our initial function
  - Goal is to have it spend more of its time near "higher" parts of the function
  - Then we'll look at the distribution of where it spent time


## Markov Chains
- A Markov chain is a model that represents a sequence of events where the probability of the next event is determined only by the current event
- In our case, the point we are currently "at" will make it likely that our next point is nearby
- Our process will then follow that of Metropolis-Hastings and is shown below. Note that you must keep track of the visited points along the way!
  - Choose a starting point
  - For some number of iterations:
    - Randomly choose a nearby point
    - Compare that point to your current point, to see if it would move you "higher" up the function
    - If above a random threshold, move to that point, otherwise stay where you are at


## Choosing a Nearby Point
- Randomly choosing a nearby point inherently pulls from a probability distribution centered on the current point
- Most frequently used is a Gaussian probability distribution
  - Centered (mean) on the current point
  - Spread (std) can be tuned, but 1 is a reasonable starting value
- Working in higher dimensions? A multi-variate Gaussian will still work!
  ```python
  np.random.multivariate_normal(mean=VEC, cov=MAT)
  ```

## Move onward or stay?
- Once you have the new point, you need to decide whether to keep it or your current point
- In Metropolis-Hastings, you:
  - Take the ratio of your sampled function evaluated at the new point over the sampled function evaluated at the current
    $$ \frac{f(x^\prime)}{f(x)} $$
    where $x$ is your current position and $x^\prime$ your proposed new position
  - Compare that ratio of a randomly generated number between 0 and 1
    - If bigger, keep the new position, making it your new current
    - If smaller, keep the old position (but still add it to your visited list again)


## Backing out a PDF
- The history of the random walk contains information about where the walker spent the most time
  - Due to how we constructed the walk, this **should** be near the maximum portions of our function
- Creating a histogram or KDE plot will allow us to visualize the probability distribution.
- We could use the histogram information to directly sample from the distribution


## Quiz Time!
- The remaining time is set aside for Quiz 2
