Bayesian Inference is my current obsession, and I had been looking for ways to include it in my teaching. I didn’t have to search very long: the Undergraduate Physics Lab is an excellent source of data, where students spend endless hours swinging pendulums, producing steam in hot weather (think Thermal Physics labs in summer), and dropping objects to determine the acceleration due to gravity, trying to get as close as possible to the figure of 9.8 m/s^2.

Analyzing lab data, determining errors, and so on has long been my Achilles heel; I come from a background in Theoretical Physics, which perhaps explains it to an extent. I have taken it up as a project to understand the analysis of lab data and to figure out ways to use Bayesian inference in the determination of parameters of interest. This is a long-term project (given my sheer ignorance about all things practical), so this section will evolve quite a bit.

I have discovered an excellent set of lectures by Richard McElreath on YouTube (https://www.youtube.com/channel/UCNJK6_DZvcMqNSzQdEkzvzA), based on his book ‘Statistical Rethinking’. I encourage all students to check them out. I will try to extract from the lectures the material most useful to undergraduate Physics students and incorporate it into the analysis of experiments.

Sampling Probability Distributions

Any honest attempt to implement Bayesian Inference involves sampling probability distributions, often over several variables. Two approaches follow: ‘Direct Sampling’ and ‘Markov Chain Sampling’. For distributions over one (or maybe two) variables, Direct Sampling works rather well. However, as soon as we encounter distributions over several variables, the computation time increases exponentially with the number of variables. This is where Markov Chain Sampling comes into its own.

Direct Sampling

The following notebook explains the technique of Direct Sampling and applies it to sample one- and two-dimensional Gaussian distributions.
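
The notebook itself is not reproduced here, but the idea can be sketched in a few lines of Python. The grid bounds, resolution, and sample count below are illustrative choices, not those of the notebook: evaluate the (possibly unnormalized) density on a grid, normalize it into a discrete distribution, and draw grid points with the corresponding probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)

# Grid over the support of a standard Gaussian (truncated to [-5, 5])
grid = np.linspace(-5, 5, 2001)
density = np.exp(-0.5 * grid**2)       # unnormalized N(0, 1) density
weights = density / density.sum()      # normalize into a discrete distribution

# Direct sampling: draw grid points with the computed probabilities
samples = rng.choice(grid, size=10_000, p=weights)

print(samples.mean(), samples.std())   # should be close to 0 and 1
```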

Markov Chain Sampling

The following note discusses the Metropolis-Hastings Algorithm for Markov Chain Sampling of a general probability distribution.

Metropolis_Algorithm

The following notebook implements this to sample a bivariate Normal distribution.
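
As a stand-in for the notebook, here is a minimal sketch of the Metropolis algorithm (the symmetric-proposal special case of Metropolis-Hastings) applied to a bivariate Normal target. The correlation, proposal width, chain length, and starting point are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

rho = 0.8                               # correlation of the target

def log_target(x):
    """Unnormalized log-density of a bivariate Normal with unit
    variances and correlation rho."""
    return -(x[0]**2 - 2*rho*x[0]*x[1] + x[1]**2) / (2 * (1 - rho**2))

n_steps = 20_000
step = 0.5                              # proposal standard deviation
chain = np.empty((n_steps, 2))
x = np.zeros(2)                         # arbitrary starting point
lp = log_target(x)

for i in range(n_steps):
    # Symmetric random-walk proposal
    proposal = x + step * rng.standard_normal(2)
    lp_new = log_target(proposal)
    # Metropolis acceptance: always accept uphill moves, sometimes downhill
    if np.log(rng.random()) < lp_new - lp:
        x, lp = proposal, lp_new
    chain[i] = x

# Discard burn-in; the off-diagonal entries should approach rho
print(np.corrcoef(chain[5000:].T))
```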

Bayesian Linear Regression

One place to start applying Bayesian Inference to the undergraduate lab is the ‘Method of Least Squares’ used to fit lab data to a straight line (I am sure for good reason). This has always seemed somewhat mysterious to me. On the one hand, it seems reasonable that there is a ‘best fitting line’, and that it should be the one that minimizes the ‘sum of squared deviations’. But what about other lines? There is clearly a continuum of lines, all of which could credibly ‘explain’ the data. How do we compute this credible region? Traditional Statistics textbooks give formulae for such regions, but traditional Statistics has always appeared rather mysterious to me. I do, however, enjoy Probability Theory, and recently got interested in Bayesian Statistics. One of my short-term goals is to develop a framework that analyzes lab data and ‘fits’ it in a Bayesian way, determining not just the best fit but also credible regions.

Another thing I have often wondered about is why we try to fit data to a straight line at all. The choice clearly arises from a physical belief in a Linear Model, but what if we are not in the region of parameter space in which the Linear Model holds? For instance, how do we know that the oscillations of a pendulum are ‘small’ enough to justify a linear differential equation? Or, in a traditional spring-mass experiment, how do we know we are in the region where Hooke’s Law is valid? Could one frame this as a problem of ‘Bayesian Model Selection’? And what about noisy data arising from a possible polynomial relationship (extending the linear model), where we do not know the degree of the polynomial generating the data? I suspect Bayesian methods could help us out.

Following is an attempt at using Bayesian Inference to estimate the slope and intercept of a line used to ‘fit’ a given data set. The data is not from a lab: it consists of weights and heights of individuals, taken from the excellent course on Bayesian Inference by Richard McElreath. We look at three distinct methods of Bayesian Inference in this context.

Approximate Bayesian Inference

This is a half-hearted Bayesian approach, with the benefit that it gives us analytic expressions for estimating the slope and intercept. It is consistent with ‘standard’ statistical formulae, but (hopefully) demystifies them.
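
The note itself is not reproduced here, but the flavour of those analytic expressions can be shown in a short sketch. Assuming a flat prior and Gaussian noise, the posterior is maximized at the familiar least-squares estimates, and the posterior widths reproduce the standard errors; the synthetic data below stands in for the weight/height data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: y = 2x + 1 plus Gaussian noise
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 1.5, size=x.size)

n = x.size
Sxx = np.sum((x - x.mean())**2)

# Least-squares estimates, which coincide with the flat-prior posterior mode
slope = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
intercept = y.mean() - slope * x.mean()

# Residual variance and the 'standard' error formulae that the
# flat-prior posterior widths reproduce
resid = y - (intercept + slope * x)
s2 = np.sum(resid**2) / (n - 2)
se_slope = np.sqrt(s2 / Sxx)
se_intercept = np.sqrt(s2 * (1/n + x.mean()**2 / Sxx))

print(slope, se_slope)
print(intercept, se_intercept)
```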

The Grid Approximation

This is the most direct way to implement Bayesian Linear Regression. It involves describing the slope and intercept on a ‘grid’, and using direct sampling to sample the joint probability distribution.
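
The notebook is not reproduced here, but a minimal sketch of the grid approach might look as follows. The synthetic data, the grid ranges, and the assumption of a known noise level are all illustrative simplifications.

```python
import numpy as np

rng = np.random.default_rng(3)

x = np.linspace(0, 10, 30)
y = 2.0 * x + 1.0 + rng.normal(0, 1.5, size=x.size)
sigma = 1.5                                 # noise level assumed known here

slopes = np.linspace(1.5, 2.5, 200)
intercepts = np.linspace(-1.0, 3.0, 200)
S, I = np.meshgrid(slopes, intercepts, indexing="ij")

# Gaussian log-likelihood on the grid (flat priors, so posterior ∝ likelihood)
log_post = np.zeros(S.shape)
for xi, yi in zip(x, y):
    log_post += -0.5 * ((yi - (I + S * xi)) / sigma) ** 2

post = np.exp(log_post - log_post.max())
post /= post.sum()

# Direct sampling from the discretized joint posterior
idx = rng.choice(post.size, size=5000, p=post.ravel())
s_samples, i_samples = S.ravel()[idx], I.ravel()[idx]

print(s_samples.mean(), i_samples.mean())   # posterior means of slope, intercept
```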

Markov Chain Sampling

The following uses Markov Chain Sampling, the preferred method of sampling probability distributions in higher dimensions.
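
Again as a stand-in for the notebook, here is a sketch of Metropolis sampling applied to the same kind of regression posterior (flat priors, known noise level). The proposal widths and chain length are illustrative tuning choices; the percentiles at the end show how credible intervals for the slope and intercept fall out of the samples.

```python
import numpy as np

rng = np.random.default_rng(4)

x = np.linspace(0, 10, 30)
y = 2.0 * x + 1.0 + rng.normal(0, 1.5, size=x.size)
sigma = 1.5                                 # noise level assumed known here

def log_post(theta):
    """Log-posterior of (intercept, slope) under flat priors
    and Gaussian noise of known width sigma."""
    intercept, slope = theta
    return -0.5 * np.sum(((y - (intercept + slope * x)) / sigma) ** 2)

n_steps = 20_000
step = np.array([0.2, 0.05])                # per-parameter proposal widths
chain = np.empty((n_steps, 2))
theta = np.array([0.0, 1.0])                # arbitrary starting point
lp = log_post(theta)

for i in range(n_steps):
    prop = theta + step * rng.standard_normal(2)
    lp_new = log_post(prop)
    if np.log(rng.random()) < lp_new - lp:
        theta, lp = prop, lp_new
    chain[i] = theta

burned = chain[5000:]                       # discard burn-in
print(burned.mean(axis=0))                  # posterior means: intercept, slope
print(np.percentile(burned, [2.5, 97.5], axis=0))  # 95% credible intervals
```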