Streaming doesn't work when there are more than a few conversations #79

Open
avelican opened this issue Aug 16, 2023 · 5 comments

@avelican
Contributor

avelican commented Aug 16, 2023

Edit: The issue affects all browsers. I thought it was Brave-only because Brave is my main browser, so its database had the most messages and therefore the most lag.


Hello. This might be a bug with Brave, but I haven't had this issue with any other streaming LLM web interface, so I thought it was worth reporting.
Streaming doesn't work in Brave. By this I mean that it prints either the first token or nothing, freezes for the next 20 seconds, and then prints the whole result at once (while using 100% CPU the whole time).
Using latest Brave:
Version 1.57.47 Chromium: 116.0.5845.96 (Official Build) (64-bit)

Same issue occurs with Brave Shields down. (+All extensions disabled.)

I think I've seen streaming work in Brave a few times, so the issue may not occur every time.
(I installed it locally to see if that would help, and it apparently did, but then it went back to being extremely laggy.)

@avelican
Contributor Author

avelican commented Aug 17, 2023

Update: Streaming works when there are very few conversations. Clearing data fixes the problem. Will investigate further.

@avelican changed the title from "Streaming doesn't work in Brave Browser" to "Streaming doesn't work when there are more than a few conversations" on Aug 18, 2023
@avelican
Contributor Author

Update: Exported data and tested in other browsers. Issue is not specific to Brave but occurs everywhere.

@avelican
Contributor Author

Update: Issue seems to be due to IndexedDB being slow. (In the profiler, most of the CPU time is spent on "System")
I tested updating the DB only on every 3rd message, which "solves" the problem in the sense that it still lags horribly (updates once per paragraph) but at least it's "kind of" streaming again now.
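The "save only every 3rd update" workaround described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: it assumes "message" here means a streamed update (chunk), and `makeEveryNthSaver` and the `save` callback are hypothetical names standing in for the real IndexedDB write.

```typescript
// Hypothetical stand-in for the app's IndexedDB write; here it only records calls.
type SaveFn = (content: string) => void;

// Returns a handler that accumulates chunks and persists only every n-th one.
function makeEveryNthSaver(n: number, save: SaveFn) {
  let content = "";
  let count = 0;
  return {
    onChunk(chunk: string): void {
      content += chunk;
      count += 1;
      if (count % n === 0) save(content);
    },
    // Call once the stream ends so trailing chunks are not lost.
    finish(): void {
      if (count % n !== 0) save(content);
    },
  };
}

const writes: string[] = [];
const saver = makeEveryNthSaver(3, (c) => { writes.push(c); });
for (const chunk of ["a", "b", "c", "d", "e", "f", "g"]) saver.onChunk(chunk);
saver.finish();
console.log(writes); // 3 writes instead of 7: "abc", "abcdef", "abcdefg"
```

This reduces the number of writes but, as noted, also delays rendering if the UI reads the text back from the database.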

(What confuses me is that 50% of the CPU time is still Idle, so I don't understand why it already lags so hard.)

I don't see an easy fix here. These are the options that I see:

  1. Use a different DB backend. If IndexedDB is performing so poorly, perhaps a better option exists. (I don't know enough about this area to comment further, and it seems like a lot of trouble to change the whole DB backend...)
  2. Commit messages to DB less frequently?

However, as far as I can tell, messages are displayed by the message components loading them from the database.
Therefore, saving less frequently == rendering less frequently, i.e. creating artificial lag! Bad UX!

How often should we save to the DB? Even if we alter the app to use some local cache for the message and save only when a message finishes streaming, this complicates knowing when the message is "finished". The naive solution is to save when getStreamContent receives isFinal == true. But if network transmission fails, that final message will never arrive, and we still want to save what was received...
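One possible way around the "isFinal never arrives" problem is to save in a `finally` block, so the partial text is persisted whether the stream completes or fails. The sketch below is an assumption-laden illustration: the `StreamChunk` shape, `consumeAndSave`, and `brokenStream` are hypothetical and do not reflect the app's real streaming API.

```typescript
// Hypothetical chunk shape; `isFinal` mirrors the flag mentioned above.
interface StreamChunk {
  text: string;
  isFinal: boolean;
}

// Consumes a stream and always persists whatever arrived, even if the
// stream throws before a chunk with isFinal === true is seen.
async function consumeAndSave(
  stream: AsyncIterable<StreamChunk>,
  save: (content: string, complete: boolean) => Promise<void>,
): Promise<void> {
  let content = "";
  let complete = false;
  try {
    for await (const chunk of stream) {
      content += chunk.text;
      if (chunk.isFinal) complete = true;
    }
  } finally {
    // Runs on normal completion and on error alike.
    await save(content, complete);
  }
}

// Simulated stream that fails partway through (stand-in for a dropped connection).
async function* brokenStream(): AsyncGenerator<StreamChunk> {
  yield { text: "Hello", isFinal: false };
  yield { text: ", wor", isFinal: false };
  throw new Error("network error");
}

const saved: Array<{ content: string; complete: boolean }> = [];
const done = consumeAndSave(brokenStream(), async (content, complete) => {
  saved.push({ content, complete });
}).catch(() => {
  // The network error still propagates; the partial text was saved first.
});
```

The `complete` flag lets the UI distinguish a finished message from one that was cut off, without losing the tokens that did arrive.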

To my mind now (though it is late here), the best compromise seems to be:

  1. Use some local cache for rendering the message (so that saving less frequently does not break streaming).
  2. Attempt a save on every token, but keep a lastSave timestamp so that it actually saves no more than once every 2 seconds or so and doesn't overload IndexedDB? Edit: This would still suffer from a network interruption not saving the last few tokens...
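The two-part compromise above (local cache for rendering, time-throttled saves) might look something like the sketch below. It is only an illustration under assumptions: `makeThrottledSaver`, `persist`, and the injectable `now` clock are hypothetical names, and the explicit `flush()` is one possible answer to the "last few tokens" concern rather than something the app is known to do.

```typescript
// Keeps the latest text in memory for rendering and writes to storage at most
// once per minIntervalMs. `persist` is a hypothetical stand-in for the DB write.
function makeThrottledSaver(
  persist: (content: string) => void,
  minIntervalMs: number = 2000,
  now: () => number = () => Date.now(),
) {
  let content = ""; // local cache the UI would render from
  let lastSave = Number.NEGATIVE_INFINITY;
  let dirty = false;

  // Call when the stream ends or fails, so the last tokens are not lost.
  function flush(): void {
    if (!dirty) return;
    persist(content);
    lastSave = now();
    dirty = false;
  }

  // Call for every token: update the cache, save only if enough time has passed.
  function append(token: string): string {
    content += token;
    dirty = true;
    if (now() - lastSave >= minIntervalMs) flush();
    return content;
  }

  return { append, flush };
}

// Usage with a fake clock so the timing is deterministic.
let clock = 0;
const dbWrites: string[] = [];
const throttled = makeThrottledSaver((c) => { dbWrites.push(c); }, 2000, () => clock);
throttled.append("Hel");    // t=0: first token is saved immediately
clock = 500;
throttled.append("lo");     // within 2 s of the last save: cached only
clock = 2100;
throttled.append(", wor");  // 2 s elapsed: saved
clock = 2200;
const rendered = throttled.append("ld"); // cached only, but still renderable
throttled.flush();          // stream ended or failed: save the tail
console.log(rendered, dbWrites.length); // full text is available; 3 writes for 4 tokens
```

Because `append` returns the cached text, rendering no longer depends on how often the database is written.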

Again, this whole theory is based on the assumption that IndexedDB itself is the culprit. The browser being 50% Idle seems to disprove that, but maybe that figure covers all CPU cores while IndexedDB runs on a single thread and jams it up? I don't know.

Will take another look tomorrow...

@Belphemur

Belphemur commented Sep 6, 2023

The performance issues aren't related to IndexedDB, but to the way the app is built.
Every time you type something in the message box, every message is re-rendered.
Every time a message is updated (while streaming), the whole conversation is re-rendered.

I fixed all of that in my fork:
https://github.com/belphemur/chatpad
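The re-rendering cost described above can be illustrated with a framework-free sketch that caches each message's rendered output and redoes the work only for messages whose content changed. This is not taken from the fork; `renderMessage`, `makeMemoizedRenderer`, and the `Message` shape are hypothetical. In a React app, the analogous tool would typically be `React.memo` on the message component.

```typescript
interface Message {
  id: string;
  content: string;
}

let renderCount = 0;
// Stand-in for an expensive per-message render (e.g. Markdown to HTML).
function renderMessage(m: Message): string {
  renderCount += 1;
  return `<p>${m.content}</p>`;
}

// Caches the rendered output per message id and reuses it while content is unchanged.
function makeMemoizedRenderer() {
  const cache = new Map<string, { content: string; html: string }>();
  return (messages: Message[]): string[] =>
    messages.map((m) => {
      const hit = cache.get(m.id);
      if (hit && hit.content === m.content) return hit.html;
      const html = renderMessage(m);
      cache.set(m.id, { content: m.content, html });
      return html;
    });
}

const renderConversation = makeMemoizedRenderer();
const conversation: Message[] = [
  { id: "1", content: "Hi" },
  { id: "2", content: "Hello! How can I help?" },
  { id: "3", content: "" }, // message currently streaming
];
renderConversation(conversation); // first pass renders all 3 messages
for (const token of ["Sure", ", here", " you go"]) {
  conversation[2] = { ...conversation[2], content: conversation[2].content + token };
  renderConversation(conversation); // only message 3 is re-rendered
}
console.log(renderCount); // 6 renders rather than 12 without the cache
```

With many long conversations, the difference between re-rendering one message per token and re-rendering all of them grows with the number of messages on screen.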

@Supernova3339
Collaborator

Supernova3339 commented Sep 6, 2023 via email
