Skip to content

Hamiltonian Monte Carlo implemented for simple linear systems

License

Notifications You must be signed in to change notification settings

larsgeb/hmc-linear-systems

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Hamiltonian Monte Carlo Probabilistic inversion of linear models

This program is built to perform Markov Chains using Hamiltonian Mechanics. Essentially, any amount of parameters and any linear forward model matrix can be supplied, as long as one can describe the system in quadratic form;

equation (1)

Typically, the matrix A has to be positive definite, for a posterior probabilty to be well defined in model space. Typically, this quadratic form is the result of using Bayes' Theorem in conjuction with a linear model and some prior information. More is described in the reader.

The latest implementation (on the 'ABC' branch) uses the Armadillo library (Sanderson and Curtin, 2016) to do multithread linear algebra operations. This code has been succesfully tested for models up to 10201 parameters. A detailed documentation is as of yet not complete, but for a general idea how to set up an inversion I do recommend looking in the subfolder inversions.

Sample results

Autocorrelation of a single HMC chain:

Autocorrelation

Correlation matrix of particular inversion result:

Correlation

Boxplot of parameters on symlog scale:

Boxplot

Boxplot of parameters on linear scale:

Boxplot-true

Marginal of posterior solution:

Marginal

Hamiltonian Monte Carlo

There are a lot of good texts on Hamiltonian Monte Carlo. My favourites include Radform Neal's chapter in the Handbook of Markov Chain Monte Carlo (by Brooks et al.) for the basics of constructing a Markov Chain using HMC and Albert Tarantola's book on Inverse Problem Theory and methods for Model Parameter Estimation for the basics of probabilistic inversion and an introduction to Markov Chain appraisals of posteriors. The references are given below.

A great visualization of the algorithm is available at Alex Rogozhnikov's GitHub hosted blog.

A theoretical introduction will soon be present.

Implementation specifics

Current work

The effectiveness of the method depends mostly on the choice of mass matrix. It turns out that the optimal mass matrix is the systems Hessian. For a linear problem, solving the Hessian of the Bayesian formulation is the same as obtaining the full solution. It is therefore quite backwards to solve the linear system by computing the Hessian, after which we can use this as a mass matrix to compute the solution.

An alternative is having an estimate of the diagonal entries of the Hessian; the variances of each model parameter. These are relatively cheap to compute if the prior and data covariances are diagonal. This is currently being implemented.

Despite all this, this software does provide an easy way to estimate posterior covariances and cross-correlations. The algorithms still works on sytems which are not positive definite, and the results of the Hamiltonian Monte Carlo algorithm might give some insight as to which parameters are problematic (testing a null-space for example).

Prerequisites

Make sure that BLAS and LaPACK are available on your system, as well as the Armadillo library.

A full installation guide is provided below.

Downloading and preparing

To get started there's basically two things needed on a standard Linux PC; Armadillo and the project source files. We also need BLAS and LaPACK for speed but usually these packages are available. If not, there's a ton of resources online, but I also don't want to be the one who constantly points to Google: if you're having trouble, contact me.

So, to start doing some cool Markov Chains what I suggest you do is create a working folder on your machine, name it anything in any location. To this folder we're going to do 2 git checkouts. If you obtained the project in any other way (e.g. usb) you probably end up with the same folders.

First you should git checkout Armadillo; conradsnicta/armadillo. The repo can be found here.

Alternatively you can use your package manager to install it (like apt):

sudo apt install libarmadillo-dev libarmadillo7 

This requires you to modify the CMakeLists.txt slightly. It is shown in the folder alt/.

Subsequently, you should check out this repository (specifically the ABC or master branch). In the end, the two git repo's should end up next to each other (if you installed Armadillo using git, otherwise it's just one folder):

- main_project_folder/
| - armadillo-code/
| - hmc-linear-systems/

Compilation

Armadillo:

First we need to compile Armadillo for your system. We're going to use a simple linking method which doesn't require placing Armadillo in any path. (This differs a little if you installed it using a package manager, you don't have to compile Armadillo. Again, check out the folder alt.)

Go to the armadillo-code/ folder and perform the following commands (the output is interesting as it will tell you something about the optimization):

cmake .
make armadillo

If that goes uneventful, Armadillo is compiled!

hmc-linear-systems:

This project works really neatly in CLion or other CMake enabled IDEs.

All build targets can easily be prepared with CMake. Using at least CMake 3.7 allows you to run (from the project root directory);

cmake . -DCMAKE_BUILD_TYPE=Release

This automatically prepares optimizations such as 'O3' and should typically be used. If you're having problems try:

cmake . -DCMAKE_BUILD_TYPE=Debug

and debug from your choice of debugger.

This prepares the platform dependent compilation instructions. For example, on Linux one would now run for all separate targets (as of yet only one target exists):

make hmc_sampler

Execution

All targets are compiled into bin/. Running the bash scripts should be done from this folder as such:

  ../inversions/inversion1.bash

This creates a file containing the samples by row, augmented with misfit value as such;

{parameter_1,sample_1} {parameter_2,sample_1} ... {misfit,sample_1}
{parameter_1,sample_2} {parameter_2,sample_2} ... {misfit,sample_2}
{parameter_1,sample_3} {parameter_2,sample_3} ... {misfit,sample_3}

These values are floats and space separated. They can be easily loaded into e.g. NumPy using loadtxt().

I highly recommend making a new inversion bash file for each inversion, and changing the file/folder names. The script doesn't automatically make new folders, so if you want to do a new inversion, you should create a separate folder in bin/.

Visualization and diagnostics

All codes should work with Python 2 & 3

Multivariate Markov Chains can be visualized in beautiful and colorful ways and everyone does it differently. I have provided some scripts, one in R for effective sample sizes, Python for the rest, to visualize the chain in what I believe is an intuitive way.

The R scripts uses the coda and mcmcse packages. Expecially the implemntation from the multivariate effective sample size is rather novel. It is implemented using the methods in Vats et al. (2017).

Prerequisites

Python (pip) packages:

Numpy
Matplotlib
Seaborn
Pandas

R packages:

coda
mcmcse

Effective sample size

One of the most important things to check is if a chain is converged. One way (of many) to do this is to compute the effective sample size. Have we produced enough uncorrelated samples to accurately infer properties? The computation of this quantity is done in the R-script mcmcESS.R. From the bin/ folder for e.g. inversion1:

 Rscript ../analysis/mcmcESS.R inversion1/samples.txt 

For a sufficiently large chain, this script might take some tens of seconds. Vats et al. (2017) also provided a criterium to calculate a required effective sample size. This is also performed in this script.

If one requires more visualization on chain diagnostics, I highly recommend the coda R package. Especially the summary() function is a lot more sophisticated than the Python implementations.

Chain visualization

Visualizing the state of the Markov Chain is another way to get a good feeling of how MCMC works as well as to perform diagnostics. In the Python script chain.py I have provided ways to look at the samples and autocorrelations of each parameter. For this, I use the interactive mode of matplotlib; each time you close a figure, the new one will be generated. The code can be modified easily to save the figures in any required format. This can also be done from the GUI (as the plot is being shown).

From bin/ (with substituting the filenames/folders) for e.g. inversion1:

 python ../analysis/chain.py inversion1/samples.txt

Bayesian inference

The actual analysis of parameters is done typically with visualizing the means and correlation/covariance matrix. These paremeters completely define the posterior multivariate Gaussian. It is visualized in the following way from the bin/ folder for e.g. inversion1:

 python ../analysis/inference.py inversion1/samples.txt

The matrices are plotted and saved in the same folder as the samples file along with means and variances.

Also plotted is a boxplot of the parameters. As can be seen in the figures at the very beginning of this page, this might be problematic if the parameters span multiple orders of magnitude. In the two boxplots the same HMC results were use, except that in the linear scale plot the last two parameters were omitted. Use at your own discretion.

If you want to visualize a marginal of two parameters (e.g. 1 and 5) I have provided a hexagonal joint plot:

 python ../analysis/joint.py inversion1/samples.txt 1 5

Parameter numbering starts at 1. The two quantities in the legend are respectively the pearson correlation coefficient and it's two-tailed p-value (lower is more certain, zero is ideal).

Citation

If this code was helpful to one of your programmes I enjoy hearing about it! If you use the actual code to construct Markov Chains, I'd appreciate a reference to my latest work on it:

Implementation and tuning of Hamiltonian Monte Carlo for large linear inverse problems
Geophysical Research Abstracts
Vol. 20, EGU2018-14600, 2018
EGU General Assembly 2018
© Lars Gebraad, Andreas Fichtner 2018. CC Attribution 4.0 license.

FAQ's

Not many people have asked me questions yet. Feel free though!

References

Conrad Sanderson and Ryan Curtin.
Armadillo: a template-based C++ library for linear algebra.
Journal of Open Source Software, Vol. 1, pp. 26, 2016.

Multivariate Output Analysis for Markov chain Monte Carlo
D. Vats, J. M. Flegal, G. L. Jones, 2017
arXiv:1512.07713v4
https://arxiv.org/abs/1512.07713v4

R. M. Neal et al. 
MCMC using Hamiltonian dynamics. 
Handbook of Markov Chain Monte Carlo, 2(11), 2011.

A. Tarantola. 
Inverse problem theory and methods for model parameter estimation, 
volume 89. SIAM, 2005.