Performance is slow for tables with a lot of columns (and cells) #68
Comments
I wanted to see how the new kernel protocol over websocket improved the situation, but it is being implemented in JupyterLab 4.0, which is not compatible with this widget.
Hi @davidbrochart! This extension doesn't have a 4.x version yet; it's something we should work on soon. But I don't think the websocket changes will improve this, because you can save the notebook with all of its outputs in the document, open it without a kernel or any websocket traffic, and it will still be slow. This particular slowness is definitely a front-end phenomenon.
Thanks @afshin, I managed to install from source. I'll run some benchmarks anyway.
I accumulated all the websocket traffic in Jupyter Server, and I can see that the new protocol is 5.5 times faster on this example: `table = pd.DataFrame({k: range(1000) for k in range(1000)})`

Total websocket traffic:
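A rough sketch of where that kind of gain can come from, assuming the speedup is mostly about skipping JSON re-serialization on the server's relay path (this is not the benchmark code used above, and the numbers it prints are not the figures reported here):

```python
# Toy illustration, not the actual wire format or the benchmark from this
# thread: compare re-encoding a large output message as a JSON text frame
# with forwarding the already-serialized bytes untouched.
import json
import time

import pandas as pd

table = pd.DataFrame({k: range(1000) for k in range(1000)})
payload = table.to_json(orient="split")  # stand-in for one large output message

# Old-style relay: parse the payload and re-serialize the whole message as JSON.
start = time.perf_counter()
frame_text = json.dumps({"msg_type": "display_data", "content": json.loads(payload)})
t_json = time.perf_counter() - start

# New-style relay: pass the already-serialized bytes through without parsing.
start = time.perf_counter()
frame_binary = payload.encode()
t_binary = time.perf_counter() - start

print(f"JSON re-encode:      {t_json * 1e3:7.1f} ms, {len(frame_text):,} chars")
print(f"binary pass-through: {t_binary * 1e3:7.1f} ms, {len(frame_binary):,} bytes")
```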
That's excellent, and I'm glad to see this optimization! But even if the transfer were instantaneous, the front end chokes trying to render this table.
With a bigger table, the new protocol is now 18x faster: `table = pd.DataFrame({k: range(1000) for k in range(10_000)})`

Total websocket traffic:
In terms of data rate (from ZMQ to WebSocket), I got the following results on a consumer laptop (i7 @1.80 GHz, 16 GB of RAM):
Thanks for investigating this further, @davidbrochart! Just to clarify, do your results suggest that this slowness is a data transmission / parsing issue? My guess was that this is a client-side performance issue; is that an incorrect assumption? These figures might also be useful as a case study in your protocol alignment PR in JupyterLab.
No, this doesn't explain the slowness of the beakerx display, which, as you said, is a front-end issue. These benchmarks were actually done with a table of 10_000_000 elements, which never displays in the notebook, but I can see that the transfer over the websocket completes on the server side.
Bringing here a comparison of the same table using ipydatagrid, which also uses the Lumino DataGrid behind the scenes. This confirms that the Lumino data grid can achieve good performance. Next step: profiling the code to figure out the bottleneck.
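For reference, a minimal sketch of how the ipydatagrid side of such a comparison can be set up (the exact grid options used in the comparison above are not shown in this thread):

```python
# Hedged sketch: render the same wide table with ipydatagrid's DataGrid, which
# also draws through the Lumino DataGrid under the hood.
import pandas as pd
from ipydatagrid import DataGrid

table = pd.DataFrame({k: range(1000) for k in range(1000)})
DataGrid(table)  # leave as the last expression of a notebook cell so it displays
```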
beakerx_tabledisplay performance takes a big hit and can freeze the browser when there are a lot of cells. To reproduce this issue:
Set up
Execute
This is the code I ran to reproduce the issue. Here is a zipped notebook containing the code below: slow-tabledisplay.ipynb.zip
Cell 1
Cell 2
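(The contents of the two cells did not survive in this copy of the issue; below is a minimal reconstruction, assuming the 1000-column table quoted earlier in the thread and the TableDisplay widget from beakerx_tabledisplay.)

```python
# Reconstruction of the two reproduction cells (the originals are not preserved
# here); assumes beakerx_tabledisplay's TableDisplay widget and the table size
# quoted earlier in the thread.

# Cell 1: imports
import pandas as pd
from beakerx_tabledisplay import TableDisplay

# Cell 2: build a wide table and render it with the widget; this is where the
# browser slows down or can freeze.
table = pd.DataFrame({k: range(1000) for k in range(1000)})
TableDisplay(table)
```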
Note on JupyterLab 2 vs. 3
Interestingly, while this is slow in both JupyterLab 2.x and JupyterLab 3.x, it seems slightly slower in JupyterLab 3, which causes this warning to arise in Firefox: