Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to Parse Large NBT Arrays #85

Open
bdkopen opened this issue Dec 31, 2023 · 9 comments
Open

Unable to Parse Large NBT Arrays #85

bdkopen opened this issue Dec 31, 2023 · 9 comments

Comments

@bdkopen
Copy link
Contributor

bdkopen commented Dec 31, 2023

I am attempting to parse NBT files that contain very large arrays. Whenever I do this, an error is thrown because the underlying node-protodef library caps arrays at a max size of 0xFFFFFF (16,777,215). According to the commit history, this was originally added to prevent OOM crashes.

Based on this NBT format specification, a byte array should be able to have a size of 2^31.

Support should be added to allow larger NBT array sizes to be parsed.

Example Code

const arrBuffer = await file.arrayBuffer();
const buffer = new Buffer(arrBuffer);

// Error happens here
const parsed = (await parse(buffer)).parsed;

Error

 ⨯ node_modules/protodef/src/compiler.js (70:11) @ readFn
 ⨯ Error: Read error for undefined : array size is abnormally large, not reading: 18571392
    at async uploadFile (./src/app/load/_actions/uploadFile.ts:22:21)

Example NBT File

This is an example NBT file that includes a very large byte array (size of ~18 million). Parsing will throw an error if executed on this file. No error is thrown when a NBT with smaller array sizes is parsed.
LargeNBTFile.schem.gz

@rom1504
Copy link
Member

rom1504 commented Dec 31, 2023 via email

@bdkopen
Copy link
Contributor Author

bdkopen commented Dec 31, 2023

We could add an option to that node protodef type to allow large arrays. Keep it off by default And then propagate that option to pnbt

This seems like a reasonable solution to me.

How come your file has such big arrays ?

I am working with Minecraft schematic files. Every single block has it's own index in the byte array, so a schematic with dimensions of 500 * 500 * 75 gets to a size of about 18 million, surpassing the limit. It is probably not an optimal format, but I am working with the standard that is already set.

@rom1504
Copy link
Member

rom1504 commented Dec 31, 2023

Ok makes sense, you may be interested by https://github.com/PrismarineJS/prismarine-schematic

@extremeheat
Copy link
Member

Some time ago (2020) there was problems with prismarine-nbt crashing V8 with OOM, so that check was meant to safely fail. I was thinking that check could be replaced with a simple if (arrayLength > (offset + buffer.length)) throw() check, but actually we can't do that since we don't know the size of the array items (which may be variable) before iterating. Since compiler is available to the type gen code it should not be hard to delete that line based on condition like if (compiler.skipChecks == true) skip.

@rom1504
Copy link
Member

rom1504 commented Jan 1, 2024 via email

@extremeheat
Copy link
Member

extremeheat commented Jan 2, 2024

Because then every instance in the protocol schema, nbt.jsons and all the protocol.json that uses an array type would need to be updated. Otherwise the check would not be consistent

@rom1504
Copy link
Member

rom1504 commented Jan 2, 2024 via email

@extremeheat
Copy link
Member

extremeheat commented Jan 2, 2024

Right, but then users would need to have a way to update the type in the relevant part of the schema to toggle the behavior

@bdkopen
Copy link
Contributor Author

bdkopen commented Apr 20, 2024

@extremeheat and @rom1504, I have an open PR in node-protodef to address the proposed skipChecks option. Once that PR is reviewed and published, I have a PR ready for this repository to use the new compiler option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants