Here we include the code for reproducing the lossless image compression experiments on the dynamically binarized EMNIST dataset.
The overall pipeline consists of:
- training VAE models on the (training) dataset
- evaluating the ideal bitrates on the (test) dataset (i.e., the negative variational bound of the trained models)
- compressing the (test) dataset with the trained models
```bash
python train.py --config configs/train_bin_emnist.yml [ARGUMENT_LIST]
```
The basic training arguments are specified in the config file `configs/train_bin_emnist.yml` and can be overridden by the `[ARGUMENT_LIST]` (see `mcbits/argsparser.py` for the full argument list).
The key arguments for training include:
- `--exp_name`: the training experiment name, used as the experiment directory path for logging and saving the model checkpoint
- `--gpu`: the GPU used for training (set to `-1` to use the CPU)
- `--dataset`: the dataset used for training (see the datasets in `datasets/`)
- `--split`: the dataset split, used only for the `EMNIST` dataset (choices: `{mnist, letters}`)
- `--bound`: the variational bound for training VAEs (choices: `{ELBO, IWAE}`)
- `--num_particles`: the number of particles for training the model
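For example, a command along the following lines would train an IWAE-bound model with 5 particles on the letters split (the argument values here are illustrative, not the exact settings used for our checkpoints):

```bash
python train.py --config configs/train_bin_emnist.yml \
    --exp_name my_bin_emnist_letters_iwae5 \
    --gpu 0 --split letters --bound IWAE --num_particles 5
```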
After training, the training arguments are saved as a config file `train_config.yml` in the experiment directory specified by `--exp_name`, and this file is reloaded when compressing. The model checkpoint `model.pt` is saved in the same directory.
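The experiment directory should therefore look roughly like this:

```
[EXP_NAME]/
├── train_config.yml   # saved training arguments, reloaded by compress.py
└── model.pt           # trained model checkpoint
```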
```bash
python compress.py --config configs/compress_bin_emnist.yml [ARGUMENT_LIST]
```
The basic compression arguments are specified in the config file `configs/compress_bin_emnist.yml` and can be overridden by the `[ARGUMENT_LIST]` (see `mcbits/argsparser.py` for the full argument list).
Compression shares some arguments with training, such as `--exp_name`, `--gpu`, `--dataset`, `--split`, etc., but these are specified separately for the compression experiments.
The key arguments for compressing include:
- `--train_config`: the saved training config (specifying the model and the checkpoint loaded for compression)
- `--num_compress`: the (maximum) number of data samples to compress
- `--batch_compute`: whether to batch the computation of the conditional likelihood over particles (if `True`, it might lead to decode-check failures due to some non-deterministic computation in PyTorch)
- `--decode_check`: whether to run decoding and check the correctness of the decoded results
- `--coder`: the coder to use for compression (see the supported coders in `CODER_LIST` in `mcbits/coders.py`)
- `--num_particles`: the number of particles for compression (for BB-IS and BB-CIS)
- `--lprec`, `--bprec`: the precisions for the rANS stack
- `--log_num_bucket`, `--prior_mprec`, `--prop_mprec`, `--cond_mprec`: the precisions for discretizing the latent and observation distributions
- `--iterative_post_improvement`: whether to apply amortized iterative inference to improve the posterior predictions
- `--iterative_improvement_steps`, `--iterative_improvement_lr`: the number of optimization steps and the learning rate for iterative inference
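For example, a command along these lines should compress 500 test images with a trained 5-particle model and verify the decoded results (the checkpoint path and flag values are illustrative; adapt them to your setup):

```bash
python compress.py --config configs/compress_bin_emnist.yml \
    --train_config checkpoints/train_bin_emnist_split=mnist_num=5/train_config.yml \
    --gpu 0 --num_compress 500 --num_particles 5 --decode_check True
```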
The ideal bitrates (i.e., the negative variational bound of the trained models) are first evaluated on the (test) dataset. Then the (test) dataset is compressed with the specified coder and the trained model, and both the true net bitrates and the total bitrates (including the initial bits) are computed.
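As a rough illustration of how these three quantities relate (all names and numbers below are hypothetical, not taken from the repository):

```python
import math

num_images, dim = 500, 28 * 28       # e.g. --num_compress=500 binarized EMNIST images
neg_bound_nats_per_dim = 0.16        # hypothetical negative variational bound (nats/dim)
ideal_bpd = neg_bound_nats_per_dim / math.log(2)   # ideal bitrate in bits/dim

initial_bits = 4_000                 # hypothetical initial bits needed to start bits-back coding
message_bits = 95_000                # hypothetical length of the final compressed message (bits)

total_bpd = message_bits / (num_images * dim)                  # total bitrate, includes initial bits
net_bpd = (message_bits - initial_bits) / (num_images * dim)   # net bitrate, excludes initial bits
```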
We provide the training configs and model checkpoints in `checkpoints/` for reproducing our experiments.
Each directory corresponds to a training run and is named `train_bin_emnist_split=[SPLIT]_num=[NUM_PARTICLES]`. By default,
`--num_compress=500` images from the test set are compressed for each experiment. To compress the whole test set, set
`--num_compress=10000` for the `mnist` split and `--num_compress=20800` for the `letters` split. Sometimes the quantization
gap (quantified by the gap between the net bitrate and the ideal bitrate) is not negligible; in that case, we recommend tuning
the precision parameters.
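For instance, the discretization and rANS precisions can be overridden on the command line along the following lines (the values shown are purely illustrative, not recommended settings):

```bash
python compress.py --config configs/compress_bin_emnist.yml [OTHER_ARGUMENTS] \
    --lprec 32 --bprec 16 \
    --log_num_bucket 14 --prior_mprec 16 --prop_mprec 16 --cond_mprec 16
```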
With the provided checkpoints, you can measure both the in- and out-of-distribution compression performance of BB-IS (by modifying
the `--split` argument) and the effect of increasing the number of particles (by using models trained with different
particle numbers and modifying the `--num_particles` argument when compressing). We provide results below for reference,
where each setting means TRAIN SPLIT → COMPRESS SPLIT, and net bitrates (bits/dim) are compared.
| Method / Setting | MNIST → MNIST | MNIST → Letters | Letters → Letters | Letters → MNIST |
| --- | --- | --- | --- | --- |
| BB-ELBO | 0.2382 | 0.3068 | 0.2471 | 0.2596 |
| BB-IS (5) | 0.2335 | 0.2880 | 0.2413 | 0.2512 |
| BB-IS (50) | 0.2305 | 0.2784 | 0.2371 | 0.2462 |
| Savings | 3.2% | 9.3% | 4.0% | 5.2% |
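For instance, the MNIST → Letters column with 50 particles corresponds to a command along these lines (the checkpoint path follows the naming convention above and is illustrative):

```bash
python compress.py --config configs/compress_bin_emnist.yml [OTHER_ARGUMENTS] \
    --train_config checkpoints/train_bin_emnist_split=mnist_num=50/train_config.yml \
    --split letters --num_particles 50
```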
By setting the `--iterative_post_improvement` argument to `True`, you can also compare BB-IS with iterative inference, and combine
the two for further improvement. We provide results below for reference, where IF (50) denotes `--iterative_improvement_steps=50`.
| Method / Setting | MNIST → MNIST | MNIST → Letters |
| --- | --- | --- |
| BB-ELBO | 0.2382 | 0.3068 |
| BB-ELBO-IF (50) | 0.2351 | 0.2915 |
| BB-IS (50) | 0.2305 | 0.2784 |
| BB-IS (50)-IF (50) | 0.2291 | 0.2702 |
| Savings | 3.8% | 11.9% |
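For example, the BB-IS (50)-IF (50) row corresponds to adding flags along the following lines (we do not prescribe a value for `--iterative_improvement_lr` here):

```bash
python compress.py --config configs/compress_bin_emnist.yml [OTHER_ARGUMENTS] \
    --num_particles 50 --iterative_post_improvement True --iterative_improvement_steps 50
```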
By changing `--coder` to `CISBitsBackCoder`, you can compare BB-IS with its coupled variant (BB-CIS), especially in terms
of the total bitrate (including the initial bit cost). We provide results below for reference; the comparison is done in the
MNIST → MNIST setting, and each cell reports net bitrate / total bitrate.
| N / Method | BB-IS | BB-CIS |
| --- | --- | --- |
| 1 | 0.2382 / 0.2393 | 0.2378 / 0.2400 |
| 5 | 0.2334 / 0.2382 | 0.2333 / 0.2355 |
| 50 | 0.2305 / 0.2782 | 0.2303 / 0.2325 |
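For example, the BB-CIS column with N = 50 corresponds to a command along these lines:

```bash
python compress.py --config configs/compress_bin_emnist.yml [OTHER_ARGUMENTS] \
    --coder CISBitsBackCoder --num_particles 50
```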