Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

saving/sync broken #85

Open
anpc opened this issue Apr 28, 2015 · 8 comments
Open

saving/sync broken #85

anpc opened this issue Apr 28, 2015 · 8 comments

Comments

@anpc
Copy link

anpc commented Apr 28, 2015

UPDATE: It seems that whenever one page was opened in 2 places (by you or another person),
and you edited the first line, the other editor could become corrupt and silently fail to save subsequent edits.
THIS BUG HAS BEEN THERE FOR ABOUT HALF A YEAR. You may have lost a lot :-(

Email me [email protected] and I'll try to help. Unfortunately if loss occurred, I'm afraid nothing was sent to the servers, so not sure I'll be able to help.

I've found and fixed the bug; this issue will remain open until I add a second layer of protection (#56) from future similar bugs...


[written from Anat's name but actually by @cben]
In this case I believe I haven't lost edits but syncing did fail — silently — which is a scary data loss scenario!

  • I started writing on my computer.
  • Then I also opened same doc on Beni's computer and edited there.
  • JS console on my computer started showing lot's of Both operations have to have the same base length errors. It seems that changes from Beni's computer do appear on mine.
  • my computer's flaky internet connection went and came several times during this time...
  • Trying to edit on my computer added new kinds of errors, notably sendOperation() called with invalid operation. — and did not reflect the edit on Beni's.
  • Beni's JS console is still clean.
...
firebase.js:43 Uncaught Error: Both operations have to have the same base length
render-math.js:277 20491.784s change Object
firebase.js:43 Uncaught Error: Both operations have to have the same base length
render-math.js:277 20492.923s change Object
firebase.js:43 Uncaught Error: Both operations have to have the same base length
render-math.js:277 20498.173s change Objectfrom: CodeMirror.Posorigin: "cmrt-353"removed: Array[1]text: Array[2]to: CodeMirror.Pos__proto__: Object
firebase.js:43 Uncaught Error: Both operations have to have the same base length
3070 console messages are not shown.
render-math.js:277 20902.045s change Object {from: CodeMirror.Pos, to: CodeMirror.Pos, text: Array[1], origin: "cmrt-354", removed: Array[1]}
render-math.js:277 20902.292s change Object {from: CodeMirror.Pos, to: CodeMirror.Pos, text: Array[1], origin: "cmrt-355", removed: Array[1]}
2firebase.js:43 Uncaught Error: Both operations have to have the same base length
firebase.js:56 WebSocket connection to 'wss://s-dal5-nss-33.firebaseio.com/.ws?v=5&ns=mathdown' failed: Error in connection establishment: net::ERR_NETWORK_CHANGED
firebase.js:59 WebSocket connection to 'wss://s-dal5-nss-33.firebaseio.com/.ws?v=5&ns=mathdown&s=zqxI1NLKUeDK274eQUsNbwFOrkPHmNhW' failed: WebSocket is closed before the connection is established.
firebase.js:70 GET https://s-dal5-nss-33.firebaseio.com/.lp?id=1193443&pw=3EKOFbF5GQ&ser=98754…pudWxsLCJzbiI6IkIxYyJ9XSwiaCI6IjVFOHZTOEpaU1JObVIxUnRKbmlKUHNiajRDND0ifX19 net::ERR_CONNECTION_TIMED_OUT
.lp?dframe=t&id=1193443&pw=3EKOFbF5GQ&ns=mathdown:5 Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end user's experience. For more help, check http://xhr.spec.whatwg.org/.
.lp?dframe=t&id=1193443&pw=3EKOFbF5GQ&ns=mathdown:6 GET https://s-dal5-nss-33.firebaseio.com/.lp?disconn=t&id=1193443&pw=3EKOFbF5GQ 
firebase.js:70 GET https://s-dal5-nss-33.firebaseio.com/.lp?id=1193443&pw=3EKOFbF5GQ&ser=98754…JPZ1cvdXNlcnMvLUpvLTV4bjI2UDFOU2hDd0pFYmgvY29sb3IiLCJkIjoiI2ZmZDZiMyJ9fX0. net::ERR_CONNECTION_TIMED_OUT
firebase.js:70 GET https://s-dal5-nss-33.firebaseio.com/.lp?start=t&ser=90668494&cb=2000&v=5&ns=mathdown net::ERR_CONNECTION_TIMED_OUT
render-math.js:277 22782.053s change Object {from: CodeMirror.Pos, to: CodeMirror.Pos, text: Array[1], origin: "+input", removed: Array[1]}
firepad.js:1673 Uncaught Error: sendOperation() called with invalid operation.firepad.js:1673 assertfirepad.js:1430 firepad.FirebaseAdapter.FirebaseAdapter.sendOperationfirepad.js:2383 firepad.EditorClient.EditorClient.sendOperationfirepad.js:2087 firepad.Client.Synchronized.applyClientfirepad.js:2052 firepad.Client.Client.applyClientfirepad.js:2352 firepad.EditorClient.EditorClient.onChangefirepad.js:2275 EditorClient.editorAdapter.registerCallbacks.changefirepad.js:4293 firepad.RichTextCodeMirrorAdapter.RichTextCodeMirrorAdapter.triggerfirepad.js:4176 firepad.RichTextCodeMirrorAdapter.RichTextCodeMirrorAdapter.onChangefirepad.js:4370 obj.(anonymous function)firepad.js:48 clazz.triggerfirepad.js:3510 firepad.RichTextCodeMirror.RichTextCodeMirror.onCodeMirrorChange_firepad.js:4003 obj.(anonymous function)codemirror.js:7507 CodeMirror.signalcodemirror.js:2302 endOperation_finishcodemirror.js:2193 endOperationscodemirror.js:2176 endOperationcodemirror.js:2650 readInputcodemirror.js:2736 (anonymous function)
render-math.js:277 22785.437s change Object {from: CodeMirror.Pos, to: CodeMirror.Pos, text: Array[1], origin: "+delete", removed: Array[1]}
firepad.js:1673 Uncaught Error: sendOperation() called with invalid operation.
render-math.js:277 22786.235s change Object {from: CodeMirror.Pos, to: CodeMirror.Pos, text: Array[1], origin: "+input", removed: Array[1]}
firepad.js:1673 Uncaught Error: sendOperation() called with invalid operation.
render-math.js:277 22787.310s change Object {from: CodeMirror.Pos, to: CodeMirror.Pos, text: Array[1], origin: "+input", removed: Array[1]}
firepad.js:1673 Uncaught Error: sendOperation() called with invalid operation.
render-math.js:277 22787.911s change Object {from: CodeMirror.Pos, to: CodeMirror.Pos, text: Array[1], origin: "+input", removed: Array[1]}
firepad.js:1673 Uncaught Error: sendOperation() called with invalid operation.
render-math.js:277 22788.987s change Object {from: CodeMirror.Pos, to: CodeMirror.Pos, text: Array[1], origin: "+input", removed: Array[1]}
firepad.js:1673 Uncaught Error: sendOperation() called with invalid operation.
@anpc
Copy link
Author

anpc commented Apr 28, 2015

More important than fixing this particular sync problem would be (A) testing and/or (B) indication whenever edits don't get saved (for whatever reason) — #56

@cben cben added the bug label Apr 29, 2015
@cben
Copy link
Owner

cben commented Apr 29, 2015

Let's see the stack trace on the later error:

firepad.js:1673 Uncaught Error: sendOperation() called with invalid operation.
firepad.js:1673 assert
firepad.js:1430 firepad.FirebaseAdapter.FirebaseAdapter.sendOperation
firepad.js:2383 firepad.EditorClient.EditorClient.sendOperation
firepad.js:2087 firepad.Client.Synchronized.applyClient
firepad.js:2052 firepad.Client.Client.applyClient
firepad.js:2352 firepad.EditorClient.EditorClient.onChange
firepad.js:2275 EditorClient.editorAdapter.registerCallbacks.change
firepad.js:4293 firepad.RichTextCodeMirrorAdapter.RichTextCodeMirrorAdapter.trigger
firepad.js:4176 firepad.RichTextCodeMirrorAdapter.RichTextCodeMirrorAdapter.onChange
firepad.js:4370 obj.(anonymous function)
firepad.js:48 clazz.trigger
firepad.js:3510 firepad.RichTextCodeMirror.RichTextCodeMirror.onCodeMirrorChange_
firepad.js:4003 obj.(anonymous function)
codemirror.js:7507 CodeMirror.signal
codemirror.js:2302 endOperation_finish
codemirror.js:2193 endOperations
codemirror.js:2176 endOperation
codemirror.js:2650 readInput
codemirror.js:2736 (anonymous function)

This sounds somewhat like https://github.com/firebase/firepad/issues/101 [literal so GH won't spam that issue], which was fixed 1.5 years ago. It's probably not exactly it, and the root cause is probably the "same base length" error.
Anyway, if I have to debug Firepad, it's hard to defend my sticking with old firebase & firepad just because that firebase.js was accidentally open sourced (#4).
But I also don't want to upgrade stuff blindly. I need more tests.

@cben cben changed the title firepad errors, sync stopped working in one way saving/sync broken Apr 29, 2015
@cben
Copy link
Owner

cben commented Apr 29, 2015

This is horrific.
Opening a page from 2 tabs reliably breaks sync for me.

It's likely that silent edit loss has been going on for users for don't know how long :-(

firebase.js:43 Uncaught TypeError: Cannot read property 'line' of undefined
...
Uncaught Error: sendOperation() called with invalid operation.

@cben
Copy link
Owner

cben commented Apr 29, 2015

pushing prominent warning on top linking to this bug...

cben added a commit that referenced this issue Apr 29, 2015
@cben
Copy link
Owner

cben commented Apr 30, 2015

progress: Cannot read property 'line' of undefined is coming from my render-math.js and is probably the root cause.
(being my fault is much more comforting than being Firepad's fault :-))

Tip to future self: firebase-debug.js is really useful in that it dumps stack on "Uncaught" exceptions it catches.

cben added a commit to cben/CodeMirror-MathJax that referenced this issue Apr 30, 2015
Triggered on any change when line 0 contains a CM bookmark.

I wrote the lazy `found.line ?` without cosidering what if line == 0.
Very embarrasing.  I should be fired :-(

In Mathdown this caused cben/mathdown#85
because Firepad renders other users' cursors with a CM bookmark
=> editing the first line immediately broke other windows editing same line.

Bug originated in 475901c on 2014-10-21
cben referenced this issue in cben/CodeMirror-MathJax Apr 30, 2015
Don't remove foreign marked ranges (fixes #19).
cben added a commit to cben/CodeMirror-MathJax that referenced this issue Apr 30, 2015
…reak

other code handling CM events, such as Firepad as happened in
cben/mathdown#85.
cben added a commit that referenced this issue Apr 30, 2015
  > Fix `TypeError: Cannot read property 'line' of undefined` exception. Triggered on any change when line 0 contains a CM bookmark.
  > Catch all errors in our CM event handlers so (I hope) we can't ever break other code handling CM events, such as Firepad as happened in #85.
and another dumb bug from same commit:
  > Fix #19 correctly: clear only our markers, not only foreign ones!
@cben
Copy link
Owner

cben commented Apr 30, 2015

OK, conclusions:

This has been broken for half a year

caused cben/CodeMirror-MathJax@475901c where I made 2 drunk-idiot-level bugs

  • Editing first line of the document would AFAICT immediately break saving in any other browser instance having same page open simultaneously.
    • It seems that the window where you made the edit remains working, so if you didn't collaborate (nor had same page in multiple browser tabs) you were safe. But I don't entirely understand how my exception was breaking firepad and why it doesn't break the window where you made the edit — so can't be sure you were safe.
    • Just having your cursor in the first line when another browser instance is editing the page may have caused problems. I think not — the change has to include first line — but I'm not 100% sure.

@cben
Copy link
Owner

cben commented Apr 30, 2015

I've pushed a fix but I'm leaving this bug open (and the warning on top leading here) until I implement #56 to reliably warn any user whose specific saving doesn't work.

cben added a commit that referenced this issue May 19, 2015
Also initial "loading..." indication.

Not sure it's water-tight wrt bugs like #85.
cben added a commit that referenced this issue May 22, 2015
restore generic "Alpha quality" warning.
While #56 is not water-tight yet, the specific bug has been fixed.
This was referenced May 27, 2015
@cben
Copy link
Owner

cben commented Jun 3, 2015

Good news: reproducing the problem (for #56) proved quite hard. Firepad is much more stubborn than I expected in sending and receiving changes even when there are exceptions :-)
This means that not as much data has been lost.

  • Going back to the buggy CM-MJ wasn't enough; it seems firebase-debug.js not only reports but also tolerates exceptions better, and I only managed to reproduce with the non-debug firepad.js.
  • It's easy to start getting exceptions but that doesn't mean we're not saving!
  • The direct exception from clearOurMarksInRange -> rangesOverlap -> posCmp is TypeError: Cannot read property 'line' of undefined (Chrome) / a is undefined (Firefox). In itself, that's not a problem yet.
  • It takes some editing, apparently from both sides, to start getting Error: Both operations have to have the same base length from Firepad. This is in principle bad news but in practice, we're still saving!
  • Only when you see Error: sendOperation() called with invalid operation. firepad.js:1673:12 then you're really borked.
    • And even then, you're still probably receiving changes, only not sending yours.

cben added a commit that referenced this issue Jun 4, 2015
Periodically creates new Firepad.Headless, loads the text and compares
to current editor content.
Also checks Firebase's online status for better messages.

After a couple days of torturing, this seems to reliably detect
#85 borked firepad state and warn "Online but unsaved — BUG?".
Problems:
- False positives during typing (saw this) or incoming changes (theoretical).
  Needs debouncing after a period of quiet...
- Currently for easy testing runs the check *every second*!
  Should increase testSyncIntervalMs to 30-60sec before landing.
  - If only to avoid constant 1-5% CPU usage.
- Performance concerns: after a day of holding and torture-editing
  2 chrome tabs, things crawled to unusably slow stuck-after-every-press.
  Their combined memory usage grew to ~1.5GB! (plus ~1GB for their dev tools)
  And at some point CPU usage grew to 100%.
  - Near the end, I timed Headless getText to take >3 seconds!
    Initially it's ~1ms for existing Headless and ~230ms for new Headless!
    This explains 100% CPU - 1sec timer should be after prev timer finished
    execution, but alas getText() returns immediately.
  - AFAICT, snapshots were being taken so it's not op composition overhead.
  - It seems memory leak does not happen in current mathdown.net version
    (0f69d29).
@cben cben added this to the "beta" milestone Aug 3, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants