Combined version of LMDB mdb.master and mdb.master3 #278

Draft: wants to merge 34 commits into base: main

Kerollmops
Member

@Kerollmops Kerollmops commented Aug 20, 2024

This PR implements #51, drawing heavily on the approach used by the im-rs crate.

It uses a second manifest, heed3/Cargo.toml, with the necessary dependencies. To work on or publish the heed3 crate, you need to cp heed3/Cargo.toml heed/Cargo.toml. The examples were moved out of the standard heed/examples/ folder so that they compile when working on both crates.

There are three crates now...

  • heed: the one you know. Based on LMDB 0.9 (the mdb.master branch).
  • heed3: Based on LMDB 1.0 (the mdb.master3 branch), without support for encryption. It will eventually support checksumming.
  • heed3-encryption: The same as heed3 but with support for encryption.

...but I would like to have only two:

  • heed: the one you know. Based on LMDB 0.9 (the mdb.master branch).
  • heed3: Based on LMDB 1.0 (the mdb.master3 branch), with support for encryption through the EncryptedDatabase type and the EnvOpenOptions::open_encrypted method. It will eventually support checksumming.

Note that if we duplicate the code of Database for the EncryptedDatabase type, we no longer need the proc-macro described below: we would copy/paste the methods and change their signatures by hand, i.e., &RoTxn -> &mut RoTxn.

How does it work?

By annotating all the heed methods that use a &RoTxn this way:

#[heed_master3_proc_macro::mut_read_txn(rtxn)]
fn get<'t>(&self, rtxn: &'t RoTxn, key: &[u8]) -> heed::Result<&'t [u8]>;

It transforms the function signature to use a &mut RoTxn:

fn get<'t>(&self, rtxn: &'t mut RoTxn, key: &[u8]) -> heed::Result<&'t [u8]>;

This way, we ensure that users cannot keep pointers to potentially invalid bytes from LMDB between two get/put operations. This is a limitation of LMDB when the encryption feature is used: pages are decrypted on the fly into a buffer that cycles, so only a limited number of value pointers remain valid across subsequent operations.
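To illustrate why the &mut RoTxn signature is enough to enforce this, here is a minimal, self-contained sketch. It does not use heed or LMDB: MockRoTxn, MockEncryptedDb, read_two, the single scratch buffer, and the XOR "cipher" are all hypothetical stand-ins chosen for illustration, and the real buffer cycling in LMDB is more involved than a single reused Vec.

```rust
/// Hypothetical stand-in for a read transaction on an encrypted environment.
/// It owns one scratch buffer that every read overwrites, loosely mimicking
/// a decryption buffer that cycles.
pub struct MockRoTxn {
    scratch: Vec<u8>,
}

impl MockRoTxn {
    pub fn new() -> Self {
        MockRoTxn { scratch: Vec::new() }
    }
}

/// Hypothetical stand-in for an encrypted database: values are stored
/// XOR-ed with a constant, which is NOT real encryption.
pub struct MockEncryptedDb {
    entries: Vec<(Vec<u8>, Vec<u8>)>,
}

const XOR_KEY: u8 = 0x5a;

impl MockEncryptedDb {
    pub fn new() -> Self {
        MockEncryptedDb { entries: Vec::new() }
    }

    pub fn put(&mut self, key: &[u8], value: &[u8]) {
        let encrypted: Vec<u8> = value.iter().map(|b| b ^ XOR_KEY).collect();
        self.entries.push((key.to_vec(), encrypted));
    }

    /// Same shape as the transformed signature: the returned slice borrows
    /// the transaction mutably for 't, so it must be dropped before the
    /// transaction can be used for another read.
    pub fn get<'t>(&self, rtxn: &'t mut MockRoTxn, key: &[u8]) -> Option<&'t [u8]> {
        let (_, encrypted) = self.entries.iter().find(|(k, _)| k.as_slice() == key)?;
        // "Decrypt" into the shared scratch buffer, overwriting the previous value.
        rtxn.scratch.clear();
        rtxn.scratch.extend(encrypted.iter().map(|b| b ^ XOR_KEY));
        Some(rtxn.scratch.as_slice())
    }
}

/// Reads two values. Each value has to be copied out before the next read,
/// because the slice returned by `get` keeps `rtxn` mutably borrowed.
pub fn read_two(db: &MockEncryptedDb, rtxn: &mut MockRoTxn) -> (Vec<u8>, Vec<u8>) {
    let first = db.get(rtxn, b"a").map(|v| v.to_vec()).unwrap_or_default();
    let second = db.get(rtxn, b"b").map(|v| v.to_vec()).unwrap_or_default();

    // The following would be rejected by the borrow checker, because `a`
    // is still alive when `rtxn` is mutably borrowed a second time:
    //
    //     let a = db.get(rtxn, b"a");
    //     let b = db.get(rtxn, b"b");
    //     println!("{:?} {:?}", a, b);

    (first, second)
}
```

With a shared &RoTxn in the signature, the commented-out snippet would compile, and in this mock `a` would then point at bytes that the second read has already overwritten. The exclusive borrow turns that mistake into a compile-time error, at the cost of forcing callers to copy values they want to keep.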

To Do

  • Modify the CI to test the heed3 crate.
  • Create and use the lmdb-master3-sys based on the mdb.master3 branch.
  • Introduce two EnvOpenOptions/EnvEntry?
  • Retrieve the work from "Expose the LMDB Encrypt/Decrypt feature" (#134) for encryption support (and update the dependencies).
  • Explain that there are three crates, and why:
    • In the README
    • In the documentation
  • Can we make the doc tests pass, or make them conditional on the branch?
  • Publish the lmdb-master3-sys and heed3 crates on crates.io.
  • Add support for checksumming too (optional).

@Kerollmops Kerollmops marked this pull request as draft August 20, 2024 12:48
@Kerollmops Kerollmops force-pushed the combined-lmdb-support branch 2 times, most recently from f16bd66 to 92104a4 Compare August 20, 2024 14:09