User manual #1

Open
bhardwahnitish19 opened this issue May 31, 2021 · 4 comments

Comments

@bhardwahnitish19

Hi,

Thanks for starting this awesome work. This will really help in using Raft with a Java server. I am curious to know how to use it. What's the best way to leverage this library in custom servers that require some state to be transferred and saved across the nodes? And how do I change the state/message that needs to be persisted on the nodes?

Could you please help me to understand this further based on the above queries or maybe we can work together to enhance this further or define all of the above?

@nicktindall
Owner

Hi!

Thanks for checking it out :) it's actually just a pet project that I chip away at when I've got some downtime, but I do intend for it to be useful at some point.

The plan is to get log compaction working then focus on the API and documentation. At this stage, with no log compaction I can't see it being useful for anything in production. Also I haven't written a socket communication layer for it yet, though I imagine I'll implement that with Netty or similar.

Keep an eye on it anyway, it is actively developed, but sporadically. I'll leave this issue open and update it when I make progress towards it.

@jocull

jocull commented Nov 12, 2022

I’ve started a similar project (Raft in Java) and have encountered way more pain along the way than I expected.

I guess all of your work here over multiple years shows that this is still a hard problem to code for even though the paper makes it seem so well described and simple?

Really early on I’ve had to fight distracting battles like tamping down garbage collection times to keep them from interfering with followers and triggering elections. It’s not what I expected at all just to get the basics stable.
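To illustrate the kind of interference described above, here is a minimal sketch of a follower election timer that tries to tell "the leader went quiet" apart from "this process was paused". It is not taken from either project in this thread: the class name `ElectionTimer`, the `maxTickGapNanos` threshold, and the reset-on-stall heuristic are all assumptions made for illustration. The caller passes in readings from a monotonic clock such as `System.nanoTime()`.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical follower-side election timer (illustrative only).
final class ElectionTimer {
    private final long minTimeoutNanos;
    private final long maxTimeoutNanos;
    private final long maxTickGapNanos; // assumed threshold for "we were stalled"
    private long lastTickNanos;
    private long deadlineNanos;

    ElectionTimer(long minTimeoutNanos, long maxTimeoutNanos,
                  long maxTickGapNanos, long nowNanos) {
        this.minTimeoutNanos = minTimeoutNanos;
        this.maxTimeoutNanos = maxTimeoutNanos;
        this.maxTickGapNanos = maxTickGapNanos;
        this.lastTickNanos = nowNanos;
        reset(nowNanos);
    }

    // Call when a valid heartbeat arrives from the leader.
    // Picks a fresh randomized timeout, as the Raft paper recommends.
    void reset(long nowNanos) {
        long timeout = ThreadLocalRandom.current()
                .nextLong(minTimeoutNanos, maxTimeoutNanos + 1);
        deadlineNanos = nowNanos + timeout;
    }

    // Call at a regular interval. Returns true if an election should start.
    boolean tick(long nowNanos) {
        long gap = nowNanos - lastTickNanos;
        lastTickNanos = nowNanos;
        if (gap > maxTickGapNanos) {
            // Our own ticks were delayed (for example by a GC pause), so
            // heartbeats may be queued but unprocessed. Give the leader
            // another full timeout instead of starting an election.
            reset(nowNanos);
            return false;
        }
        return nowNanos - deadlineNanos >= 0;
    }
}
```

This only helps with pauses on the follower side. A long pause on the leader still looks like a dead leader to everyone else, which is why keeping pause times down matters as well.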

@nicktindall
Owner

nicktindall commented Nov 15, 2022

Hey @jocull,

Thanks for dropping by. I think your assertion is correct: it's definitely not a trivial thing to implement, and there are many little implementation decisions that can break the protocol, or make it less robust, despite faithfulness to the spec.

Don't be too discouraged by the elapsed time of the work I have done on it. It's been very sporadic, and my desire to engage with it fluctuates wildly based on where my work and home life are at any given moment 😄

I recently got enthusiastic about it again and have a branch in progress implementing log compaction. This last chunk of work has been the biggest grind so far, because log compaction is only really touched upon in the paper (compared to how well everything else is specified), and you slowly realise how many places you baked in the assumption that the head of the log is immutable.
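As a rough illustration of why that assumption bites, here is a minimal sketch of a log whose head can be discarded. It is not the code from the branch mentioned above. The class `CompactableLog` and its method names are hypothetical; `lastIncludedIndex` and `lastIncludedTerm` follow the naming used for snapshots in the Raft paper. Every lookup has to translate a Raft log index into a list offset, and any caller that asks for a compacted entry has to be handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory log that supports discarding a prefix.
final class CompactableLog {
    static final class Entry {
        final long term;
        final String command;
        Entry(long term, String command) {
            this.term = term;
            this.command = command;
        }
    }

    // Index and term of the last entry covered by the snapshot (0 = none yet).
    private long lastIncludedIndex = 0;
    private long lastIncludedTerm = 0;
    private final List<Entry> entries = new ArrayList<>();

    long lastIndex() {
        return lastIncludedIndex + entries.size();
    }

    // Appends an entry and returns its (1-based) Raft log index.
    long append(long term, String command) {
        entries.add(new Entry(term, command));
        return lastIndex();
    }

    Entry get(long index) {
        if (index <= lastIncludedIndex) {
            throw new IllegalArgumentException("entry " + index + " was compacted");
        }
        if (index > lastIndex()) {
            throw new IllegalArgumentException("entry " + index + " does not exist");
        }
        // Translate the Raft index into an offset in the retained suffix.
        return entries.get((int) (index - lastIncludedIndex - 1));
    }

    // The term at the snapshot boundary is still needed for consistency checks.
    long termAt(long index) {
        return index == lastIncludedIndex ? lastIncludedTerm : get(index).term;
    }

    // Drops every entry up to and including the given index.
    void compactThrough(long index) {
        if (index <= lastIncludedIndex) {
            return; // already compacted
        }
        long term = termAt(index);
        entries.subList(0, (int) (index - lastIncludedIndex)).clear();
        lastIncludedIndex = index;
        lastIncludedTerm = term;
    }
}
```

Any code written against a plain list, where index 1 always lives at offset 0, has to be revisited once `lastIncludedIndex` can move.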

I still enjoy working on it and I still intend to finish it one day.

The wish list beyond log compaction:

  • API/documentation (this issue)
  • Screw down the object creation
    • I've had the same issue you mentioned, where you eventually get some big pauses that interfere with the cluster. I wanted to leave that till last and try and avoid it where I can because I like using all the language niceties like Optional, iterators etc. and I don't know how much of that is taken care of by escape analysis and how much of it is strictly off limits.
  • Look at the threading model
    • There are too many locks in my implementation, and I think many of them can go with a nicer thread model.
  • Networking layer
  • Test it in jepsen

Drop us a link to your implementation if it's public; I'd love to see it.

@jocull

jocull commented Nov 17, 2022

Here's the bits that I had 😄 https://github.com/jocull/distributed-cache

I had originally shopped around and looked at some of the top Java implementations on the Raft home page (SOFAJRaft, Apache Ratis) but wasn't very happy with them. The paper is so short, I figured why not try writing my own.

It was a lot harder than I expected, so I wound up here looking for additional Java implementations to compare with. Yours and mine are not so different and I've also discovered https://microraft.io/ ( https://github.com/MicroRaft/MicroRaft ) which is similar.

To deal with the threading and locking problems, a sort of Actor model per node seemed to work OK for me (and MicroRaft is similar). It was not trivial to retrofit that later though 😄
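A minimal sketch of that per-node actor idea, assuming a single-threaded executor acts as the node's mailbox. This is not code from distributed-cache or MicroRaft. The `NodeActor` class is hypothetical, and the vote rule is deliberately simplified: it omits the log up-to-date check that real Raft requires.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical Raft node whose state is confined to one mailbox thread.
final class NodeActor implements AutoCloseable {
    // All messages are queued here and processed one at a time.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();

    // Only ever touched from the mailbox thread, so no locks are needed.
    private long currentTerm = 0;
    private String votedFor = null;

    // Simplified RequestVote handler (no log comparison).
    CompletableFuture<Boolean> requestVote(long term, String candidateId) {
        return CompletableFuture.supplyAsync(() -> {
            if (term < currentTerm) {
                return false; // stale candidate
            }
            if (term > currentTerm) {
                currentTerm = term; // newer term: forget the previous vote
                votedFor = null;
            }
            if (votedFor == null || votedFor.equals(candidateId)) {
                votedFor = candidateId;
                return true;
            }
            return false; // already voted for someone else this term
        }, mailbox);
    }

    // Reads go through the mailbox too, so they see a consistent state.
    CompletableFuture<Long> term() {
        return CompletableFuture.supplyAsync(() -> currentTerm, mailbox);
    }

    @Override
    public void close() {
        mailbox.shutdown();
    }
}
```

Network threads and timers only enqueue work and receive futures back, so the Raft state machine itself never sees concurrent access. The trade-off is that every handler must stay short and non-blocking, or it delays everything queued behind it.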

Good luck in your development! I'm looking forward to following along!
