---
title: "Gravity and Distributions"
author: Jed Rembold
date: January 23, 2025
slideNumber: true
theme: tokyo-night-light
highlightjs-theme: tokyo-night-light
width: 1920
height: 1080
transition: slide
---


## Announcements
- Homework 1
  - Should have everything you need to do all of P1 and potentially start P2
  - Second check-in will go out this weekend! Don't forget to fill it out!
- Just a few of you still need to fill out the information gathering poll [here](https://docs.google.com/forms/d/e/1FAIpQLSdaUjeiwdLBnon0uamSm0lKLgxC88S24TsB70UcPs0_8yrzFQ/viewform?usp=sharing)


## Recap
- Kepler found that all orbiting objects move in an ellipse (Kepler's 1st Law)
  - The central object exists at one of the focus points
- Ellipse and orbit vocabulary
- Using the `fitellipse` libraries

## Today's Plan
- Kepler continued
	- Laws 2 and 3
- Gravity
- Distributions
	- Histograms/Density plots
	- 2D variants

<!--
## More Vocab
- _Apogee/Aphelion_: The point at which a body is at its furthest from the Earth/Sun
- _Perigee/Perihelion_: The point at which a body is at its closest to the Earth/Sun

\begin{tikzpicture}%%width=70%
[every node/.style={font=\sf, color=Black}]
  \coordinate (sun) at (1.65,0);
  \draw[very thick] (0,0) ellipse (3cm and 2.5cm);
  \fill[inner color=yellow, outer color=orange] (sun) circle (2mm);
  \node[circle, fill, inner sep=2pt, label={right: Perihelion}](p) at (3,0) {};
  \node[circle, fill, inner sep=2pt, label={left: Aphelion}](a) at (-3,0) {};
  \draw[stealth-stealth, thick, Red] (p) -- (sun);
  \draw[stealth-stealth, thick, Red] (a) -- (sun);
\end{tikzpicture}



## Pesky Ellipses
- Because the elliptical orbits that space objects follow are tied directly to some physical properties, it can be very useful to determine elliptical parameters
- The classic equation of an ellipse:
  $$ \frac{(x-x_c)^2}{a^2} + \frac{(y-y_c)^2}{b^2} = 1 $$
  allows for no rotations
- The general polynomial form:
  $$ Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 $$
  doesn't give obvious expressions for the desired parameters


## Fitting Tactics
- There are a variety of approaches, most perform best when the sampled points come from around the entire ellipse
	- Different approaches can error in different ways if all points are from one side of the ellipse
- All the pre-built options required some odd libraries and worked in very different ways between Python and R
- So I found starting code for both and then customized to create fitting options that work very similarly in both
	- In Python: [fitellipse.py](../libs/fitellipse.py)
	- In R: [fitellipse.r](../libs/fitellipse.r)


## Library Explanation
- Both libraries introduce 3 functions. To get access to them, run or import the library
	- In Python: `run fitellipse.py` or `from fitellipse import *`
	- In R: `source('fitellipse.r')`
- 3 functions:
	- `fit_ellipse(x, y)`: Computes the best fit ellipse and returns both the coefficients and the common fit parameters as an associative array. Takes points to fit as lists of x and y coordinates.
	- `get_ellipse(fit_params)`: Computes a series of x and y points representing the given ellipse, where `fit_params` is the output of `fit_ellipse`. Useful for plotting purposes
	- `create_test_ellipse(Rx,Ry,Cx,Cy,Rotation,NoiseLevel)`: creates a series of x and y coordinates representing an ellipse with the given parameters


## Caveats
- Both fitting functions seem to do excellent at determining the center of the ellipse and major and minor axes
- There is some variability with the resulting angle, where it sometimes comes out $90^\circ$ off ($\pi/2$ technically since the angle is in radians)
	- I believe this stems from discrepancies with whether it ends up finding the angle from the major or minor axis, but I spent hours last night trying to make more robust and failed
	- Fortunately, the angle of the ellipse has absolutely no bearing on any physical laws, and is only of interest for plotting purposes
		- If the angle of the ellipse is wrong when plotting, add or subtract $\pi/2$ from it


## Example with Comet NeoWISE
- Comet NeoWISE made an appearance several years back
	- That particular comet had a huge eccentricity and thus couldn't be plotted
- A different NeoWISE comet though has its positions over the next 100 years stored [here](../demos/neowise_positions.csv)
- How could I estimate the eccentricity of this comet?


## Activity!
- I have given you [here](../demos/pluto_positions.csv) the coordinates of Pluto over a 300 year span
- Your task: determining the closest approach Pluto makes with the Sun. That is to say, what is its perihelion distance?
	- Doing so will require working out the major axis and determining where the Sun would be
	- A picture might help you keep things straight as you go
- **Note**: If you compare this value to the known value, it isn't going to be perfect, owing to us not fully accounting for the tilt of Pluto's orbit when computing the ellipse
-->

# Return of Kepler
## Kepler's 2nd Law
- A line joining a planet to the Sun sweeps out equal areas in equal times
	- Results in planets moving faster when close to the Sun and slower when far away

\begin{tikzpicture}%%width=60%
[every node/.style={font=\sf, color=Black}]
  \coordinate (sun) at (1.65,0);
  \draw[very thick] (0,0) ellipse (3cm and 2.5cm);
  \fill[opacity=0.5,Red] (sun) -- (-25:3cm and 2.5cm) arc (-25:25:3cm and 2.5cm);
  \fill[opacity=0.5,Red] (sun) -- (172:3cm and 2.5cm) arc (172:188:3cm and 2.5cm);
  \fill[inner color=yellow, outer color=orange] (sun) circle (2mm);
  \node[right] at (3,0) {Fastest};
  \node[left] at (-3,0) {Slowest};
\end{tikzpicture}

## Kepler's 3rd Law
:::incremental
- The square of a planet's orbital period is proportional to the cube of its semi-major axis.
	- A planet's _period_ is the time it takes to complete an entire orbit
	- An orbit's semi-major axis we defined earlier to be half the major axis
- Mathematically, this looks like:
  $$ p^2 \propto a^3 \qquad\text{or}\qquad \frac{p^2}{a^3} = \text{constant} $$
  where $p$ is the period and $a$ is the semi-major axis
- For objects orbiting the Sun, if you choose units of years and AU, then
  $$ p^2 \approx a^3 $$
  - 1 _AU_ = 1 astronomic unit = average distance from Earth to Sun ($1.496\times10^{11}$ meter)
:::


## Projections
- When observing, you rarely get a perfectly top-down view of an ellipse
- Often, you are looking at some view projected from the side
	- This does not change the measurement of the period!
	- It may make measuring the semi-major axis more difficult though
- Since many planets (and moons) have _mostly_ circular orbits, you can estimate the major axis by just taking the largest end-to-end distance that you observe


# Newton's Gravity
## Why Gravity?
:::::cols
::::col
- Kepler told us that there **must** be these relationships, but he couldn't say _why_
- Newton found that, if the direction that something is moving changes, then it **must** have experienced some force
	- Newton's big connection (one of them) was determining that the necessary force turns out to be from gravity (at least for most astronomic objects)
::::
::::col
\begin{tikzpicture}%%width=100%
[scale=0.7]
  \fill[inner color=yellow, outer color=orange] (0,0) circle (1cm);
  \draw[dashed, -latex, thick] (3,0) arc (0:340:3cm);
	\foreach \a in {0,45,...,350}{
	  \draw[orange, -latex] (\a:2.7) -- (\a:1.3);
	}
  \node at (3,0) {\includegraphics[width=1cm]{world.png}};
\end{tikzpicture}

::::
:::::

## Gravity
- Gravity is the universal attractor
	- Anything with mass attracts anything else with mass
	- Strength of force increases with the amount of mass involved
	- Strength of force decreases rapidly with distance between the masses

\begin{tikzpicture}%%width=70%
  \coordinate (ball1) at (0,0);
  \coordinate (ball2) at (8,0);
  \coordinate (d) at ($(ball1)-(0,1.5)$);
  \draw[Green] (ball1) --+(0,-2);
  \draw[Cyan] (ball2) --+(0,-2);
  \fill[ball color=Green!50] (ball1) circle (1cm);
  \fill[ball color=Cyan!50] (ball2) circle (1cm);
  \node at ($(ball1)+(0,1.2)$) {$M_1$};
  \node at ($(ball2)+(0,1.2)$) {$M_2$};
  \draw[<->, thick] (d) -- node[midway,below] {d} (d-|ball2);
  \draw[-latex, ultra thick,Red] (ball1) -- +(4,0);
  \draw[-latex, ultra thick,Red] (ball2) -- +(-4,0);
  \node[Red] at (4,1) {\scalebox{1.5}{$F_g = G \frac{M_1 M_2}{d^2}$}};
\end{tikzpicture}


## Newton meets Kepler's 3rd
:::{style='font-size:.9em'}
- Kepler already had worked out
  $$ \frac{a^3}{p^2} = \text{same value for all planets orbiting Sun} $$
- Newton worked out, starting with the force, that two objects held in orbit by gravity would obey:
  $$ \frac{a^3}{p^2} = \frac{G(M_1 + M_2)}{4\pi^2} $$
  where:
  - $M_1, M_2$ are the masses of the objects in _kilograms_
  - $a$ is the average separation between the objects in _meters_
  - $p$ is the orbital period in _seconds_
  - $G$ is the gravitational constant ($6.67\times10^{-11}\,\tfrac{m^3}{kg\,s^2}$)
:::

## Some nicer units
- Put in more convenient units, Newton's formulation of Kepler's 3rd boils down to:
  $$ \frac{a^3}{p^2} = (M_1 + M_2)_\odot $$
  where
  - $M_1, M_2$ are the masses of objects in _solar masses_ (multiples of the Sun's mass)
  - $a$ is the average separation of the objects in _AU_
  - $p$ is the orbital period in _years_
- For the Sun and most planets, $M_1 + M_2 \approx 1 M_\odot$
- If you can measure $a$ and $p$, then you can work out the mass of the objects!


# Visualizing Distributions
## Distributions
:::::cols
::::col
- There is a **lot** of stuff in space!
- To understand bulk properties of objects, we commonly need to resort to looking at various _distributions_
- Distributions can be comprised by looking at almost any variable of interest (or in some cases multiple variables)
::::
::::col
![](../images/distribution.png){width=100%}
::::
:::::

## Common usages:
- EDA: exploring and bringing to light interesting properties or trends in the gathered data
- By fitting mathematical expressions to the distribution, you now have a tool to use in further analysis
	- This is particularly common in simulations, where simulations pull from known distributions to create physically realistic situations
	- What type of mathematical distribution fits best can potentially tell you about possible inner workings that yielded that distribution
- Classification: sometimes there are several distinct groupings in a distribution that can lead to various types of classification


## Histograms
- A _histogram_ is a visual indication of how many times a certain variable appears
- Y-axis represents counts, and the x-axis is divided into a number of bins
- Data points that have a value appearing within a particular bin contribute to that bin's count
- Commonly will want to try several binning methods or sizes

## Generating Histograms
::::::{.cols style='align-items: flex-start'}
::::col
:::{.block name=Python}
- Matplotlib can generate plots directly:
  
  ```python
  plt.hist( variable_list, 
            bins=num_bins )
  plt.show()
  ```
:::

::::

::::col
:::{.block name=R}
- Just use the `hist` function:
  
  ```R
  hist( variable_list, 
        breaks=num_bins )
  ```
- Or, in ggplot

  ```R
  ggplot(data=df, 
         aes(x=variable)
        ) +
    geom_histogram()
  ```
:::

::::
::::::


## When Bins fail
- The choice of bins can often lead to distributions that might feel pretty subjective
- One option is to use one of the auto-binning algorithms provided by your histogram software (read documentation)
- Another, which especially makes sense for continuous data, is to look at a Kernel Density Estimate, or KDE plot
	- Most common kernel is that of a Gaussian, but can be different
	- Essentially stacks multiple Gaussians for each point, and then sums them together
	- Results in a "smoother" looking histogram


## Generating KDE Plots
::::::{.cols style='align-items: flex-start'}
::::col
:::{.block name=Python}
- Pandas can generate plots directly:
  
  ```python
  df[column name].plot(kind='kde')
  ```
  - Does require `scipy` under the hood, so will need it installed
- Can also install Seaborn and use its variants of you like

  ```python
  import seaborn as sbn
  sbn.kdeplot(df[column name])
  ```
:::
::::

::::col
:::{.block name=R}
- Just use the `plot` and `density` functions:
  
  ```R
  plot(density(variable_list))
  ```
- In ggplot:
  
  ```R
  ggplot(data=df, aes(x=column)) +
    geom_density(bw='bcv')
  ```
:::

::::
::::::

## Activity!
- [Here](../demos/satellites.csv) is a collection of satellite information for satellites orbiting the Earth
	- It has a lot of (some nonsense) columns
	- You will be interested in the column named `'Apogee (km)'`
	- This table originally included some satellites that are **really** distant from the Earth, but I've already filtered it a bit down to our region of interest
- Create both histogram and KDE plots of the distribution of satellite apogee distances
	- Can you see the outer cluster formed from geosynchronous satellites?


## 2D Histograms and KDE Plots
::::::cols
::::col
- Sometimes you have multiple variables that you want to visualize together as a distribution
- There are 2D analogs of both histograms and KDE plots
  - One variable along each axis
  - Counts or density still determine color
::::

::::col

![](../images/distribution2d.png)

::::
::::::

## Multivariate Distribution Creation
::::::{.cols style='align-items: flex-start'}
::::col
:::{.block name=Python}
- Histogram in Matplotlib
  
  ```python
  plt.hist2d(xs, ys, bins=20)
  ```
- KDE plot easiest through Seaborn

  ```python
  sbn.kdeplot(x=xs, 
              y=ys, 
              fill=True)
  ```
:::
::::

::::col
:::{.block name=R}
- Histogram through ggplot

  ```python
  ggplot(data=df, 
         aes(x=xs, y=xs)
        ) + 
    geom_bin_2d()
  ```
- KDE plot through ggplot
  ```R
  ggplot(data=df, 
         aes(x=xs, y=xs)
        ) + 
    geom_density_2d_filled()
  ```
:::

::::
::::::

