Skip loading of the model during conversion to save on RAM usage #83
Helps address #69 somewhat, by preventing coremltools from loading the converted model and compiling it post-conversion.
I don't believe exporters does anything with either the compiled or the loaded model (other than using the spec), so it is probably fine to default to skipping the load. I've also seen a recent example published this month in swift-transformers that also skipped this step: sample code here.
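For context, this is the coremltools option in question, shown as a minimal standalone sketch (the toy module below is just an illustration; in exporters the traced Hugging Face model takes its place):

```python
import torch
import coremltools as ct

# Toy stand-in model; in exporters this would be the traced transformer.
class Tiny(torch.nn.Module):
    def forward(self, x):
        return x * 2.0

traced = torch.jit.trace(Tiny().eval(), torch.rand(1, 4))

# skip_model_load=True prevents coremltools from compiling and loading
# the converted model through the Core ML framework after conversion,
# which is where the extra RAM gets consumed. The protobuf spec is still
# fully populated, so inspecting and saving the model keep working.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="x", shape=(1, 4))],
    convert_to="mlprogram",
    skip_model_load=True,
)

mlmodel.save("Tiny.mlpackage")  # saving does not require a loaded model
```

The trade-off is that predict() is unavailable on the returned model until it is re-loaded, which, per the above, exporters doesn't rely on during conversion.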
However, if desired, I also considered adding a new argument that would let users control this themselves, in case we don't want to force this default behavior on everyone (see the sketch below).
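If we go that route, it could be as simple as a pass-through flag on the CLI. A hypothetical sketch (the flag name and wiring are my assumptions, not the current exporters API):

```python
import argparse

# Hypothetical flag; exporters does not expose this today.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--load_converted_model",
    action="store_true",
    help="Compile and load the Core ML model after conversion "
         "(uses significantly more RAM; off by default).",
)
args = parser.parse_args()

# The inverse would then be threaded through to coremltools:
#   ct.convert(..., skip_model_load=not args.load_converted_model)
```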
I can also add a section on this specific error to the troubleshooting docs if that would be helpful.
Test setup:
M3 Max MacBook Pro with 36 GB of RAM
Converting model: Undi95/MythoMax-L2-Kimiko-v2-13b (~26 GB)
Current behavior w/o this fix:
Getting a
zsh: killed python3 -m exporters.coreml ...
error due to running out of memory (see the issue for more details).

Behavior with this fix:
Able to successfully convert the provided model and other models of similar size (although models much larger than my system's RAM will still fail). This at least makes the tool more accessible for models that should fit within your available RAM.