BLAKE2b gadget #1767
base: v2
Conversation
Co-authored-by: Martin Minkov <[email protected]>
Great job! I left some initial comments after a first pass
```ts
}

function testVectors() {
```
where are some of these test vectors from? we should use some official ones also (if they aren't the official ones!)
Official test vectors have key parameter which I didn't implement. Therefore, I had to create test vectors manually.
I would in any case include tests with longer preimages, to make sure the code works when two blocks are required (meaning t1 is nonzero). I believe the longer preimage in these tests was 43 bytes. Let's try with 128 for example.
> Therefore, I had to create test vectors manually.
How?
I hashed messages from other hash function unit tests using `nobleBlake2b`. Also, I added longer messages per @querolita's recommendation.
I changed the base branch, which reveals that you didn't merge that branch but added it as a single squashed commit. I highly recommend never doing that! It means we have the same changes in two different commits, which git can't resolve and instead will mark as conflicts (see "This branch has conflicts that must be resolved" on this PR now). The preferred way to use git at o1Labs is to basically always join branches using
a few quick comments on the gadgets additions!
```ts
if (quotientBits === 1) {
  quotient.assertBool();
} else {
  rangeCheckN(quotientBits, quotient);
}
```
for efficiency this should use `rangeCheck64()` instead of `rangeCheckN()` if `quotientBits === 64`!
`quotientBits` is not necessarily 64 since it's calculated from `nBits`. So, I think it's better to keep `rangeCheckN()`.
I suggested to use `rangeCheck64` if `quotientBits === 64`. So what I mean is: test for that condition, and use the efficient range check in that case.
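The suggested dispatch can be sketched out of circuit. This is a plain-bigint model (the helper name `pickRangeCheck` is illustrative, not from the PR) of which gadget would be selected and what range each one enforces; the real `assertBool`/`rangeCheck64`/`rangeCheckN` operate on Field elements in-circuit:

```typescript
// Hypothetical model of the suggested special-casing: pick the cheapest
// range check for the quotient based on its bit length.
function pickRangeCheck(quotientBits: number, quotient: bigint): string {
  if (quotientBits === 1) {
    // 1-bit quotient: a boolean constraint is cheapest
    if (quotient !== 0n && quotient !== 1n) throw new Error('not a bool');
    return 'assertBool';
  }
  if (quotientBits === 64) {
    // dedicated single-gate 64-bit range check in Kimchi
    if (quotient >= 1n << 64n) throw new Error('does not fit in 64 bits');
    return 'rangeCheck64';
  }
  // generic n-bit range check (costs more gates)
  if (quotient >= 1n << BigInt(quotientBits)) throw new Error('out of range');
  return 'rangeCheckN';
}
```

The point of the suggestion is only the middle branch: when the computed `quotientBits` happens to equal 64, the dedicated gate is strictly cheaper than the generic check.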
Okay, got it. This improvement is relevant for divMod32 too!
no it's not! there is no dedicated single-gate 32-bit range check, like there is for 64 bits
I didn't know that! I removed the divMod32 change with the latest commit. AFAIU from the kimchi page of the proof systems book, only the 64-bit range check has a dedicated single gate. Quoting from the book:

> The RangeCheck0 gate can also be used on its own to perform 64-bit range checks by constraining witness cells 1-2 to zero.
It is true that only range checks for 64 bits have a dedicated Kimchi gate. However, one can also perform 32-bit range checks, just less efficiently. Maybe let's not remove the o1js function for it, as it might be better from a UX point of view, instead of asking the zkApp developer to use a rc64 for x and another one for x * 2^32.
For 32 bits there are two different ways to do it in 2.5 gates, and one of them is already used -- the one that doesn't use the range check gate.
Using that one is better because the range check gate and lookup table are only optionally included in the proof system, and the proof gets slower by including them. So it only pays off if you need a lot of range checks.
Great PR 😄
For today I am leaving some comments regarding the arithmetic side of this PR. Tomorrow I will look deeper into the actual hash function and leave feedback there as well.
```ts
divMod64,

/**
 * Addition modulo 2^64. The operation adds two {@link Field} elements in the range [0, 2^128] and returns the result modulo 2^64.
```
```diff
- * Addition modulo 2^64. The operation adds two {@link Field} elements in the range [0, 2^128] and returns the result modulo 2^64.
+ * Addition modulo 2^64. The operation adds two {@link Field} elements in the range [0, 2^128) and returns the result modulo 2^64.
```
TBH, I am not sure which one is right. The ranges in the `addMod32` comment block don't match either. I took the comment for `addMod64` from `addMod32` since it's the exact same function but modulo 2^64.
first of all, all ranges are always of the form [a, b), i.e. open on the upper end.
second, 128 bits is wrong. it uses a 1-bit range check on the quotient; for that to work, the addition result has to be at most 65 bits.
also, you say further below that "the gadget assumes both inputs to be in the range [0, 2^64)". that will definitely work and is the intended use case, so stick with that here in the description as well
```ts
(x, y) => {
  if (x >= 2n ** 64n || y >= 2n ** 64n)
    throw Error('Does not fit into 64 bits');
  return x & y;
```
why do you return `&` in this case but `Bitwise.or` in the next?
Good catch! That should be `|` not `&`. Interestingly, it doesn't fail the unit test. Any ideas, @mitschabaude?
my only idea is that you use such a small number of runs that, with high likelihood, one of the two inputs is bigger than
try it with `uint(64)` as the random generator instead of `maybeField`!
I think you can actually remove this unit test that creates a proof, and stick with the ones above that only run a circuit and check constraints
The proof one will make CI slower and provides hardly any additional value
While developing smart contracts, sometimes it works without proofs but fails with them. Isn't it beneficial to test with proofs? Is it different for primitives?
It's different here yeah, for various reasons.
For one, we are already testing many extremely similar methods with proofs in this file, the additional surface covered is tiny
```ts
function divMod64(n: Field, nBits = 128) {
  assert(
    nBits >= 0 && nBits < 255,
    `nBits must be in the range [0, 255), got ${nBits}`
```
Is 255 not included because of Pasta curves?
I am not sure why we don't include 255. I followed the range-checking practice used in `divMod32`. Maybe @mitschabaude knows the reason?
Because the field size is less than 2^255, there are 255 bit integers that have two different representations in the field (in fact that's true for almost all 255 bit integers since the field size is just slightly over 2^254)
That non-uniqueness would make the divmod gadget unsound: the prover could pick between two different quotient/remainder splits, one of which is wrong.
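The non-uniqueness argument can be made concrete with plain bigints. The modulus below is the Pallas base field order as I recall it (an assumption, worth double-checking against the o1js constants); the point is only that a 255-bit integer and that integer minus p reduce to the same field element:

```typescript
// Assumed Pallas base field order: slightly above 2^254.
const p = (1n << 254n) + 0x224698fc094cf91b992d30ed00000001n;

// A 255-bit integer n and a second representative n - p: both are
// non-negative 255-bit-or-less values that map to the same field
// element, so a divmod witness over 255 bits would not be unique.
const n = p + 5n;
const alias = n - p; // = 5n; same residue mod p as n
```

Since the prover only commits to the field element, it could witness either decomposition, which is exactly why `nBits < 255` is required for soundness.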
```ts
}

function addMod64(x: Field, y: Field) {
  return divMod64(x.add(y), 128).remainder;
```
What's the reasoning behind `nBits` being 128 in this case (instead of a larger or smaller value, perhaps)? Is it to prevent the quotient from being larger than 64 bits upon reduction?
yeah, if both inputs are 64 bits then the result is 65 bits; that would allow an efficient (boolean) check for the quotient
so the question: can't we assume that the inputs are 64 bits and therefore use nBits=65?
yes, it should be 65. It used to be 128 at some point. I adapted the changes made in #1763 and forgot to update nBits.
Finished my first pass, left some comments and questions 👍🏻
src/lib/provable/gadgets/blake2b.ts (outdated)
```ts
for (let i = 0; i < input.length; i++) {
  if (state.buflen === 128) {
    state.t[0] = state.t[0].add(128);
    if (state.t[0].equals(UInt64.zero)) {
```
Can you explain in what situations this conditional would succeed? I am asking because it seems like the counter `t[0]` gets updated using a normal `add`, whereas `t[1]` uses a modular addition instead. Are you thinking of a `UInt64` overflow? If so, maybe it's better to use `addMod64` here as well.
btw @boray this is a bug. you can't use a `Bool` as a condition, it's an object, always truthy!
I highly recommend doing out-of-circuit logic on plain JS values, like bigint, to avoid mistakes like this, i.e. here I would use

```ts
if (state.t[0].toBigint() === 0n) {
```
ah but wait, is this supposed to be circuit code and `state.t[0]` is a variable? then you fundamentally can't use a JS if condition anyway
`state.t[0]` is a variable, but I am not sure if that logic should be in-circuit or not. I added the in-circuit version as a comment.
I implemented this part with reference to the snippet below from the BLAKE2b RFC:

```c
ctx->t[0] += ctx->c;      // mark last block offset
if (ctx->t[0] < ctx->c)   // carry overflow
    ctx->t[1]++;          // high word
```
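The RFC's carry logic can be restated over plain bigints: `t` is a 128-bit byte counter held as two 64-bit words, and any carry out of the low word is absorbed into the high word. A minimal sketch (helper name is illustrative):

```typescript
// Model of the RFC 7693 byte-counter update with an explicit carry.
const MASK64 = (1n << 64n) - 1n;

function incrementCounter(t: [bigint, bigint], nBytes: bigint): [bigint, bigint] {
  const sum = t[0] + nBytes;
  const t0 = sum & MASK64;            // low word, reduced mod 2^64
  const carry = sum >> 64n;           // 1 iff the low word overflowed
  const t1 = (t[1] + carry) & MASK64; // high word absorbs the carry
  return [t0, t1];
}
```

Computing the carry as `sum >> 64n` is equivalent to the RFC's `if (ctx->t[0] < ctx->c)` comparison, and it makes explicit why `t[1]` only ever changes on an overflow of `t[0]`.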
src/lib/provable/gadgets/blake2b.ts (outdated)
```ts
UInt64.from(
  buf[i * 8]
    .toUInt64()
    .or(buf[i * 8 + 1].toUInt64().leftShift(8))
    .or(buf[i * 8 + 2].toUInt64().leftShift(16))
    .or(buf[i * 8 + 3].toUInt64().leftShift(24))
    .or(buf[i * 8 + 4].toUInt64().leftShift(32))
    .or(buf[i * 8 + 5].toUInt64().leftShift(40))
    .or(buf[i * 8 + 6].toUInt64().leftShift(48))
    .or(buf[i * 8 + 7].toUInt64().leftShift(56))
```
this should be done with addition and multiplication instead of `or()` and `leftShift()`, e.g.

```diff
- .or(buf[i * 8 + 1].toUInt64().leftShift(8))
+ .add(buf[i * 8 + 1].value.mul(1n << 8n))
```

in a circuit, addition and multiplication are cheap while bitwise operations are expensive!
you can also leave it and we do a second pass for efficiency, and you focus on correctness for now?
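The arithmetic packing being suggested is just a weighted sum. Out of circuit it looks like this (plain bigints; the in-circuit version does the same with Field `add`/`mul` by constants):

```typescript
// Pack 8 little-endian bytes into one 64-bit word using only addition
// and multiplication by constants -- no bitwise operations.
function bytesToWordLE(bytes: bigint[]): bigint {
  let word = 0n;
  for (let i = 0; i < 8; i++) {
    word += bytes[i] * (1n << BigInt(8 * i)); // byte i weighted by 2^(8i)
  }
  return word;
}
```

Since each byte is already range-checked to [0, 2^8), the weighted sum uniquely encodes the word, which is why the `or`/`leftShift` chain can be replaced without changing the result.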
I wasn't aware that add/mul is cheaper than bitwise ops. Let's focus on correctness now and leave this improvement to the efficiency pass.
Basically, as a rule of thumb, binary operations which are cheap on a CPU are expensive in cryptography (where arithmetic operations in a field are the "cheap" operations instead). Modeling binary operations in a circuit becomes a big overhead, so any time binary ops can be replaced with arithmetic counterparts, that is preferred for efficiency reasons.
Either way, let's not forget that second pass for efficiency in a later PR.
@boray you can think of `add()`, `mul()` and `assertEquals()` as the primitive operations out of which everything else is built. nothing else is as efficient, and most things are far less efficient
in the end, all of this turns into polynomial equations, and polynomials are made up of addition and multiplication
I refactored in the last commit!
Leaving two more comments before acceptance. These are related to the actual constraints being created behind the scenes.
```ts
/*
state.t[1] = state.t[1].add(
  Provable.if(
    state.t[0].lessThan(UInt64.from(state.buflen)),
```
If this needs to go in the circuit, then the same lines of the `update` part should also be written in a circuit manner. In any case, would it make sense to have a final constraint checking `t[1]*2^64 + t[0] = total_length`? That, together with the modular additions, gives you a correct decomposition.
Now, going back to what @mitschabaude said about `state.t[0]` being a variable: one cannot decide the design of the circuit depending on the value of a witness cell. The circuit must have the same "universal shape", regardless of the concrete values of the variables.
So putting all the above together, it's not a matter of whether that logic needs to go into the circuit or not. It's about not being able to branch upon the concrete value a variable takes. Thus, I believe this part should be rewritten, perhaps with the use of `Provable.if` (but I might be missing the right o1js know-how, so maybe ask another team member for further insights).
At least in Kimchi, one would create a circuit that is already "bifurcated". Meaning, all the constraints hold and they are the same, no matter what concrete values are passed to the prover. So we play with multiplications to "select" one branch or another, but both sides should eventually evaluate to 0 to satisfy the constraint system.
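Conceptually, the "bifurcated" selection described above boils down to an arithmetic select: both branches are always computed, and a boolean picks one of them via multiplication. A plain-bigint sketch of that pattern (not the actual `Provable.if` implementation):

```typescript
// Arithmetic "select": with b constrained to {0, 1}, the expression
// b*x + (1-b)*y evaluates to x when b = 1 and to y when b = 0.
// The constraint shape is identical for every witness value.
function select(b: bigint, x: bigint, y: bigint): bigint {
  if (b !== 0n && b !== 1n) throw new Error('b must be 0 or 1');
  return b * x + (1n - b) * y;
}
```

This is why a JS `if` on a variable is unusable in-circuit: the circuit cannot change shape, so the branch condition has to become a constrained boolean that multiplies into the result.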
```ts
for (let { digest_length, preimage, hash } of testVectors()) {
  let actual = Gadgets.BLAKE2B.hash(Bytes.fromString(preimage), digest_length);
  expect(actual.toHex()).toEqual(hash);
}
```
these tests should run in a circuit!
I added in-circuit tests in the last commit.
I see it, but there were two misunderstandings:
- I didn't mean to run them in a ZkProgram with proving, but just in a circuit with checking of constraints. You can use `runAndCheckSync()` for that, like in other test files
- I meant run the same test vectors in a circuit, not a small number of random inputs like you did
```ts
type State = {
  h: UInt64[];
  t: UInt64[];
```
as far as I can tell, `t` never contains variables, just constants that change depending on the input size (which is also constant). therefore, it's better to change the type to bigint or number, to avoid confusion with variables (see other comments where we were confused). Also, nicer to use a fixed-size array since it's just two elements.

```diff
- t: UInt64[];
+ t: [bigint, bigint];
```
no, `t` is set to `[UInt64.zero, UInt64.zero]` at initialisation. `t[0]` is incremented by `buflen` during every `update()` and `final()`. However, `t[1]` is only used when the input size is larger than 2^64 bytes, which is unlikely. We can break `t` up into two `UInt64`s instead of an array, but would it improve efficiency?
`buflen` is not a variable though, it only depends on the input length. so I still claim you can and should use bigints for it. initialize them to `[0n, 0n]`.

> We can break t up into two UInt64s instead of an array, but would it improve efficiency?

the suggestion to change the type has nothing to do with efficiency, just with making the type more precise. but my main point was to change the UInt64 to bigint!
```ts
/*
state.t[1] = state.t[1].add(
  Provable.if(
    state.t[0].lessThan(UInt64.from(state.buflen)),
    UInt64.one,
    UInt64.zero
  )
);
*/
```
remove
I was confused because the non-provable version of this part didn’t raise any errors. But, I believe this logic needs to be provable.
I believe it doesn't!
```ts
await equivalentAsync({ from: [maybeField, maybeField], to: field }, { runs })(
  (x, y) => {
    if (x >= 2n ** 64n || y >= 2n ** 64n)
      throw Error('Does not fit into 64 bits');
    return x | y;
  },
  async (x, y) => {
    let proof = await Bitwise.or(x, y);
    return proof.publicOutput;
  }
);
```
I'd say remove this, the in-circuit equivalence tests above are much more thorough and creating a proof here just makes CI slower
removed!
Huh, I meant just this test but you removed all of them. I think it's not a bad idea to remove most of them, but can you at least keep 1-2 of the zkprogram tests?
```diff
@@ -204,6 +208,10 @@
   "Poseidon": {
     "rows": 208,
     "digest": "afa1f9920f1f657ab015c02f9b2f6c52"
   },
+  "BLAKE2B": {
+    "rows": 6012,
```
nice!
```ts
state.t[1] = state.t[1].addMod64(UInt64.one); // high word
}
state = compress(state, false); // compress (not last)
state.buflen = 0; // counter to zero
```
Shouldn't this also reset `state.buf` to an empty array? If you don't do this, you'll start overwriting `buf` at the start but the rest of it will still contain the old entries.
And if yes, wouldn't it be less confusing in general to always use `state.buf.length` instead of an extra parameter `buflen`, so that they can't get out of sync?
```ts
out[i] = UInt8.from(
  state.h[i >> 3].rightShift(8 * (i & 7)).and(UInt64.from(0xff))
```
if you aim for efficiency: here you can also use arithmetic instead of bitwise operations
I tried to replace this line with `state.h[i >> 3].div(2 ** (8 * (i & 7))).mod(0xff)` but it increased the row number a lot. I am not sure if I'm missing something.
You can check out the bytesToWord function that's used in the keccak and sha2 implementations!
Probably a good idea to use that function
I meant `wordToBytes()` btw 😅
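For reference, the value that a little-endian word-to-bytes split computes can be sketched out of circuit (helper name is illustrative; the in-circuit `wordToBytes` additionally witnesses the bytes and constrains their weighted sum back to the word):

```typescript
// Split a 64-bit word into 8 little-endian bytes.
function wordToBytesLE(word: bigint): bigint[] {
  const bytes: bigint[] = [];
  for (let i = 0; i < 8; i++) {
    bytes.push((word >> BigInt(8 * i)) & 0xffn); // byte i = bits [8i, 8i+8)
  }
  return bytes;
}
```

This is also why the naive `div`/`mod` rewrite above blows up the row count: each in-circuit division needs its own witness plus range checks, whereas witnessing all 8 bytes at once and constraining a single weighted sum is much cheaper.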
```ts
to: divModOutput,
})(
  (x) => {
    assert(x < 1n << 128n, `x needs to fit in 128bit, but got ${x}`);
```
This assertion fails because the random field is larger than 2^128. It works for divMod32 above, which is smaller, and that should fail first. Any ideas, @mitschabaude?
the test checks whether the two functions are equivalent, so the test succeeds if both functions throw an error for the same input. so, it's not necessarily the case that "divMod32 should fail first". Maybe `divMod32()` correctly raises an error if the input is larger than 2^64.
if this test fails, my idea would be to check why `divMod64()` doesn't throw an error on inputs larger than 2^128
Closes #1754

Ingredients:
- `divMod64` and `addMod64` needed for compression function
- `Gadgets.or()`
- `UInt64.or()`
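As a quick reference, the out-of-circuit semantics of the two arithmetic ingredients can be sketched with plain bigints (the real gadgets additionally range-check the quotient and remainder in-circuit):

```typescript
// divMod64 splits n as quotient * 2^64 + remainder;
// addMod64 adds and keeps only the remainder.
const MASK64 = (1n << 64n) - 1n;

function divMod64(n: bigint): { quotient: bigint; remainder: bigint } {
  return { quotient: n >> 64n, remainder: n & MASK64 };
}

function addMod64(x: bigint, y: bigint): bigint {
  return divMod64(x + y).remainder;
}
```

With both inputs below 2^64, the sum fits in 65 bits, so the in-circuit quotient is a single bit -- which is the `nBits = 65` point discussed in the review.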