Finally multithreading! #3
Hi @Akababa, this looks great, I'm going over it now. In the meantime, I noticed your … @Zeta36, I'm wondering if this is the crucial bug that appeared since the DeepMind-style board representation. When you flip the board to orient the features to the perspective of the current player, then the final NN map onto the policy vector must be flipped back as well! Furthermore, it would be necessary to "preemptively" flip visit-count information before feeding it into the neural network, e.g. during the …
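The flip-the-policy-back point can be sketched like this. This is a minimal illustration with hypothetical shapes (a flat 64×64 from/to policy vector and 8×8 input planes, underpromotions omitted), not the repo's actual encoding:

```python
import numpy as np

def flip_planes(planes):
    """Mirror (C, 8, 8) input planes vertically, so the side to move
    always sees the board from its own perspective."""
    return planes[:, ::-1, :].copy()

def flip_square(sq):
    """Mirror a 0..63 square index vertically (a1 <-> a8, h2 <-> h7, ...)."""
    rank, file = divmod(sq, 8)
    return (7 - rank) * 8 + file

def flip_policy(policy):
    """Apply the same vertical mirror to a flat 64*64 from/to policy vector.
    If the input was flipped, the network's policy output must pass through
    this before being mapped back to real moves."""
    flipped = np.empty_like(policy)
    for frm in range(64):
        for to in range(64):
            flipped[flip_square(frm) * 64 + flip_square(to)] = policy[frm * 64 + to]
    return flipped

# Sanity check: flipping twice is the identity.
p = np.random.rand(64 * 64)
assert np.allclose(flip_policy(flip_policy(p)), p)
```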
Also, why did you remove the manual annealing of the learning rate?
Hey @benediamond, thanks for commenting! I didn't train for 100,000 steps anyway (the first point at which the lr changes), so it doesn't really matter; I was just experimenting with different optimizers. At the scale of our testing, Google's annealing isn't really applicable.
Yes, I think I did the flipping stuff correctly here, but I'd really appreciate it if you could take a quick look to see whether it checks out from your point of view. If you have code you'd like a sanity check on too, I'd be happy to help out :)
Well, this is really a problem we didn't take into account. Certainly it may be the cause of the convergence failing.
Yes. I can't believe we didn't think of this. @Akababa, kudos! I'll be looking through your code making sense of everything. I'll let you know how things work. |
Yeah, that's always a worry in the back of my mind (hence the paranoid asserts). I'm a little confused by the conversation, though: has a bug already been found in my code, or is it a previous one from before my implementation? @benediamond Thank you! Please feel free to write some test cases and sanity checks.
Doesn't the DeepMind input actually use an extra plane to encode the side to move? The main reason I did this was for fun, and also because it might make the network train faster: I believe it's strictly better than having the color plane and using this transformation to augment the training data.
Yes, they did. You can see my current approach here. Here is another quick question: it appears that you clear the move tables at the beginning of each call to … Also, what do you mean by "losing MCTS nodes...!"?
But why is there a need to flip the policy if you are feeding in the side to move? |
Hmm, I see what you're saying. But that'd be much harder for the network, no? The entire mapping from the convolved input stack to the policy vector would have to be re-learned from scratch for black, in a new way that is a totally arbitrary scrambling of the first. At that point, there is no reason to place the side-to-move on top of the stack, orient the board from the player's perspective, etc... Right?
That might be true, especially at the beginning, before the model has had the chance to learn the rules of chess. However, I think we are doing something similar with the flattened policy output layer anyway. Google's paper does mention that the final result (between the flattened policy and a huge stack of one-hot layers) was the same, but training was slightly slower with the "compressed" format, which for us, with our 0 TPUs, probably means we won't see significant results from scratch for a while. One thing I considered is having two 64-unit FC outputs for the to- and from-squares (and maybe ignoring underpromotions for now); it might be a little easier for the network to use. BUT I don't know whether this would output a sensible probability distribution with regard to softmax and ranking chess moves.
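The two-head worry can be made concrete with plain NumPy (hypothetical logits; none of this is the repo's code). Two independent softmaxes over from- and to-squares do yield a valid joint distribution over move pairs, but a factorized one, which is the catch:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
from_logits = rng.normal(size=64)  # hypothetical "from-square" head output
to_logits = rng.normal(size=64)    # hypothetical "to-square" head output

p_from = softmax(from_logits)
p_to = softmax(to_logits)

# The implied joint over (from, to) pairs is the outer product,
# which is a proper probability distribution...
joint = np.outer(p_from, p_to)
assert np.isclose(joint.sum(), 1.0)

# ...but it is rank-1: the preferred to-square cannot depend on which
# from-square was chosen, unlike a single flat softmax over all
# 64*64 move pairs.
assert np.linalg.matrix_rank(joint) == 1
```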
By the way, do you know what the "alternative" to the flat approach is? I can't figure out what the "non-flat" approach they're referring to is. |
Yeah I agree that's unclear. I don't even know how they came up with 4629 possible moves. |
4672 comes from their 73x8x8 move representation, as described in the arxiv paper. They also mention that they tried various approaches, all of which worked well.
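For reference, the 73 move planes in the arxiv paper break down as 56 queen-style moves (7 distances × 8 directions), 8 knight moves, and 9 underpromotions (3 pieces × 3 directions), one set per from-square:

```python
queen_moves = 7 * 8      # sliding moves: distances 1..7 in 8 directions
knight_moves = 8         # the 8 knight jumps
underpromotions = 3 * 3  # promote to knight/bishop/rook x 3 pawn directions
move_planes = queen_moves + knight_moves + underpromotions

assert move_planes == 73
assert 8 * 8 * move_planes == 4672  # planes x from-squares = policy size
```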
Yeah my impression is anything we understand won't matter anyway :) All we can do is ensure the inputs and outputs are correct and pray for the best. BTW are you able to access their nature paper? If not, I got it from my university and can send it to you if you want. |
On line 21 of your …
Try checking the new branch, I removed that part and optimized a lot of other stuff. |
Didn't see that, thanks. |
@benediamond sorry, I didn't see your other comment. I think Python passes the reference to the [] to the self.prediction queue, so it's all good. You can uncomment the …
Yes, indeed; I deleted it because I figured that out myself just after posting! Thanks.
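The reference-passing point above can be demonstrated in isolation (a toy sketch, not the repo's actual queue code): putting a list on a `queue.Queue` shares the reference, so a consumer's mutations are visible to the producer.

```python
import queue
import threading

q = queue.Queue()
buf = []      # a shared list; Python passes the reference, not a copy
q.put(buf)

def worker():
    item = q.get()
    item.append("prediction")  # mutates the very list the producer holds

t = threading.Thread(target=worker)
t.start()
t.join()
assert buf == ["prediction"]   # producer sees the consumer's mutation
```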
By the way, I'm brainstorming a list of ways to fix the draws by repetition thing, hopefully we can figure this one out. |
Hi @Akababa, the one thing that seemed to affect this most strongly for me was the … I've also experimented with a slowly (exponentially) decaying … Using either of these two, I could essentially eliminate draws by repetition.
Thanks, if that works it's a much nicer solution than the stuff I came up with. Did you let tau=e^{-0.01*turn} ? |
Yes, essentially. I replaced the parameter with: tau = np.power(tau_decay_rate, turn)
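A minimal sketch of how that decaying temperature plugs into move selection (hypothetical function and cutoff, not the repo's code; a `tau_decay_rate` of `0.99` is roughly the `tau = e^{-0.01*turn}` mentioned above, since e^-0.01 ≈ 0.99005):

```python
import numpy as np

def select_move(visit_counts, turn, tau_decay_rate=0.99):
    """Sample a move from MCTS visit counts with a temperature that
    decays exponentially in the turn number."""
    tau = np.power(tau_decay_rate, turn)
    if tau < 0.1:
        # Temperature effectively zero: play the most-visited move.
        return int(np.argmax(visit_counts))
    probs = np.power(visit_counts, 1.0 / tau)
    probs = probs / probs.sum()
    return int(np.random.choice(len(visit_counts), p=probs))
```

Early in the game this samples roughly in proportion to visit counts (exploration, fewer forced repetitions); late in the game it collapses to deterministic best-move play.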
My tensorflow was broken by the latest CUDA update, so it'll be a bit before I can get working again. |
What version? I'm on cuda 8 and cudnn 6 |
I've got CUDA 9.1 and cuDNN 7.0.5. Still no luck.
I just used the old versions on the TF site. Are the new ones faster? |
As my machine runs CUDA 9(.1), TF w/ GPU won't work out of the box. Rather than attempt to downgrade, I just built from source. That proved to be a good idea, until recently.
As for speed, I'm not sure, but using the new versions couldn't hurt, I think. |
Multithreading is not truly parallel in Python due to the GIL, is it?
No, unfortunately not :( But locks are still faster than asyncio with an event loop.
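The lock-based speedup is plausible because the GIL is released during NumPy/TensorFlow calls and blocking I/O, so only pure-Python bytecode is serialized. A minimal sketch of the batching pattern (hypothetical names, not the actual `player_chess.py`): search threads submit positions to a shared queue and block on a per-request result slot, while one worker evaluates them:

```python
import queue
import threading

prediction_queue = queue.Queue()

def search_thread(position):
    """A searcher submits a position and blocks until its result arrives."""
    result_slot = queue.Queue(maxsize=1)
    prediction_queue.put((position, result_slot))
    return result_slot.get()  # blocks; the GIL is released while waiting

def prediction_worker(batch_predict, stop):
    """Drain the queue, evaluate positions, and dispatch the results."""
    while not stop.is_set():
        try:
            position, slot = prediction_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        slot.put(batch_predict([position])[0])

# Demo with a fake batch predictor that doubles its inputs:
stop = threading.Event()
worker = threading.Thread(
    target=prediction_worker, args=(lambda ps: [p * 2 for p in ps], stop))
worker.start()
assert search_thread(21) == 42
stop.set()
worker.join()
```

A real version would collect several queued positions per `batch_predict` call to amortize the network forward pass; the single-item loop here just keeps the sketch short.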
@benediamond @Zeta36
After many hours of hopeless debugging I discovered locks which are amazing. The overall speedup on my machine is quite a lot, I would say at least 2x.
That being said, I haven't tested it fully and the code is almost completely rewritten/refactored by now, so please feel free to use it and tell me if I missed anything :)
https://github.com/Akababa/chess-alpha-zero/blob/opts/src/chess_zero/agent/player_chess.py
TODO: