Memory leak in the multimodal processor of Phi-3-vision #851

Open
hiro28844 opened this issue Aug 29, 2024 · 4 comments

@hiro28844

hiro28844 commented Aug 29, 2024

Describe the bug
Each call to the Phi-3-vision multimodal processor increases the process's memory usage, and the memory is never released, which suggests a memory leak.

To Reproduce
Run the following script:

from io import BytesIO
from tempfile import mkstemp

import requests
from PIL import Image

import onnxruntime_genai as og


print("Loading model...")
model = og.Model("/path/to/Phi-3-vision-128k-instruct-onnx-cpu/cpu-int4-rtn-block-32-acc-level-4")
processor = model.create_multimodal_processor()

response = requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", timeout=30)
_, image_path = mkstemp(suffix=".png")
Image.open(BytesIO(response.content)).convert("RGB").save(image_path)
image = og.Images.open(image_path)

while True:
    prompt = "<|user|>\n"
    prompt += "<|image_1|>\n"
    text = "Please describe the image in detail."
    prompt += f"{text}<|end|>\n<|assistant|>\n"
    print("Processing image and prompt...")
    r = processor(prompt, images=image)
    del r

Expected behavior
Memory usage remains constant no matter how many times the multimodal processor is called.
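
One way to check this expectation numerically is to sample the process's resident set size (RSS) while calling the processor in a loop. The following is a minimal sketch, not part of the original report: the helper names `current_rss_bytes` and `sample_rss` are hypothetical, and it assumes a Unix-like system (it reads `/proc/self/statm` on Linux and falls back to peak RSS from the `resource` module elsewhere).

```python
import gc
import os
import resource
import sys


def current_rss_bytes():
    """Return the resident set size of this process in bytes."""
    try:
        # Linux: the second field of /proc/self/statm is the resident page count.
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
        return resident_pages * os.sysconf("SC_PAGE_SIZE")
    except (OSError, ValueError, IndexError):
        # Fallback: peak RSS (kilobytes on Linux, bytes on macOS).
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        return peak if sys.platform == "darwin" else peak * 1024


def sample_rss(fn, iterations=200, sample_every=50):
    """Call fn repeatedly and record (iteration, rss_bytes) every sample_every calls."""
    samples = [(0, current_rss_bytes())]
    for i in range(1, iterations + 1):
        fn()
        if i % sample_every == 0:
            gc.collect()  # rule out Python garbage that is merely uncollected
            samples.append((i, current_rss_bytes()))
    return samples


# Hypothetical usage with the objects from the script above:
#   for i, rss in sample_rss(lambda: processor(prompt, images=image)):
#       print(f"iteration {i}: {rss / 2**20:.1f} MiB")
```

If the RSS values keep rising across samples instead of levelling off after the first few iterations, that matches the behaviour described in this report.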

Desktop (please complete the following information):

  • OS: Ubuntu 20.04.6 LTS
  • onnxruntime-genai==0.4.0
@natke
Contributor

natke commented Aug 31, 2024

Hi @hiro28844, does it remain constant or grow? We do pre-allocate memory for the KV-cache to improve performance.

@hiro28844
Author

Hi @natke ,

> does it remain constant or grow?

It continues to grow. I have attached a video recorded while running the script above.

memory_leak.mp4

@i-dubits

I can confirm the memory leak in Phi-3-vision. It is probably related to issue #590.
However, setting a fixed 'max_length' parameter, as suggested by @PatriceVignola, does not change anything for me. The image-text processing is probably somewhat different from text-only processing.
I am using the CUDA version of onnxruntime-genai. GPU memory remains constant, but CPU memory increases on every iteration.
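
To narrow down whether CPU-side growth like this comes from Python objects or from native allocations inside the library, one option is to compare it against what `tracemalloc` sees. The sketch below is only an illustration under that assumption, and `python_heap_growth` is a hypothetical helper name. `tracemalloc` tracks allocations made through Python's allocator only, so a near-zero result alongside rising process memory would point toward native code.

```python
import gc
import tracemalloc


def python_heap_growth(fn, iterations=100):
    """Return the change in Python-tracked allocated bytes across repeated calls to fn."""
    already_tracing = tracemalloc.is_tracing()
    if not already_tracing:
        tracemalloc.start()
    try:
        gc.collect()
        before, _ = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            fn()
        gc.collect()  # drop unreachable objects before taking the second reading
        after, _ = tracemalloc.get_traced_memory()
    finally:
        if not already_tracing:
            tracemalloc.stop()
    return after - before


# Hypothetical usage with the objects from the reproduction script:
#   growth = python_heap_growth(lambda: processor(prompt, images=image))
#   print(f"Python-tracked growth: {growth / 2**20:.2f} MiB")
```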

@hiro28844
Author

@natke
Any update on this?
