This repository contains the official implementation for Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding (ICML 2021 Oral).
Install the conda environment:
conda env create -f environment.yml
conda activate mcbits
pip install -e .
If you update the environment, please also update the environment file:
conda env export --name mcbits | grep -v "^prefix: " > environment.yml
The McBits implementation is in the mcbits directory: rANS is implemented in mcbits/rans.py, and all McBits coders are in mcbits/coders.py.
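For readers new to rANS, the core push/pop step can be sketched as below. This is a minimal illustration under stated assumptions, not the code in mcbits/rans.py: it keeps the coder state as an unbounded Python integer and skips renormalization, and the names build_cdf, push, and pop, along with the toy frequency table, are hypothetical. Frequencies are assumed to be positive and to sum to 2**precision.

```python
import bisect

def build_cdf(freqs):
    """Cumulative frequencies: cdf[s] is the start of symbol s's slot range."""
    cdf = [0]
    for f in freqs:
        cdf.append(cdf[-1] + f)
    return cdf

def push(state, sym, freqs, cdf, precision):
    """Encode one symbol onto the integer state (assumes freqs[sym] > 0)."""
    f = freqs[sym]
    return ((state // f) << precision) + (state % f) + cdf[sym]

def pop(state, freqs, cdf, precision):
    """Decode the most recently pushed symbol; returns (previous_state, sym)."""
    slot = state & ((1 << precision) - 1)
    # Find sym such that cdf[sym] <= slot < cdf[sym + 1].
    sym = bisect.bisect_right(cdf, slot) - 1
    prev = freqs[sym] * (state >> precision) + slot - cdf[sym]
    return prev, sym

if __name__ == "__main__":
    precision = 4
    freqs = [8, 4, 2, 2]          # toy model, sums to 2**precision
    cdf = build_cdf(freqs)

    message = [0, 1, 0, 2, 3, 0]
    state = 1                     # arbitrary starting state
    for s in message:
        state = push(state, s, freqs, cdf, precision)

    decoded = []
    for _ in message:
        state, s = pop(state, freqs, cdf, precision)
        decoded.append(s)

    # Symbols come back in reverse order: the coder behaves like a stack.
    print(decoded == message[::-1], state == 1)
```

Because pop exactly inverts push, symbols are recovered last-in, first-out. Bits-back coding relies on this stack-like behaviour, since it decodes latent variables from the same state that it later encodes onto.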
The code for reproducing our experiments is in the experiments directory. We include the toy experiment (experiments/toy_data), lossless image compression on the EMNIST dataset (experiments/img_emnist), and lossless sequential data compression on pianoroll datasets (experiments/seq_pianoroll). Each includes a README for executing the pipeline and reproducing the results.
Note: the current rANS implementation is not parallelized. A faster, parallel implementation based on JAX will be released soon!
If you find this work relevant to your work, please cite our paper:
@inproceedings{ruan2021improving,
title={Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding},
author={Ruan, Yangjun and Ullrich, Karen and Severo, Daniel and Townsend, James and Khisti, Ashish and Doucet, Arnaud and Makhzani, Alireza and Maddison, Chris J},
booktitle={International Conference on Machine Learning},
year={2021},
}
The rANS implementation is closely based on the note by Fabian Giesen and the bits-back repo.