Finally multithreading! #3

Open
1 of 3 tasks
Akababa opened this issue Dec 17, 2017 · 34 comments

@Akababa
Owner

Akababa commented Dec 17, 2017

@benediamond @Zeta36
After many hours of hopeless debugging I discovered locks, which are amazing. The overall speedup on my machine is substantial: at least 2x, I would say.
That being said, I haven't tested it fully, and the code is almost completely rewritten/refactored by now, so please feel free to use it and tell me if I missed anything :)
https://github.com/Akababa/chess-alpha-zero/blob/opts/src/chess_zero/agent/player_chess.py
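For anyone curious, here's a minimal sketch of the pattern, with hypothetical node fields (not the actual player_chess.py code): the lock only guards the shared counters, while the expensive NN evaluation stays outside the critical section.

```python
import threading

class Node:
    """Hypothetical MCTS node whose shared statistics are guarded by a lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.n = 0    # visit count
        self.w = 0.0  # total value

    def backup(self, value):
        # Hold the lock only while touching shared state; the NN prediction
        # for the leaf happens outside any critical section.
        with self.lock:
            self.n += 1
            self.w += value
```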

TODO:

  • Testing
  • Ctrl+C stop it
  • C++ implementation, looks lock-bound
@benediamond

benediamond commented Dec 17, 2017

Hi @Akababa, this looks great, I'm going over it now.

In the meantime, I noticed your flip_policy step. Could you say more about it?

@Zeta36 I'm wondering if this is the crucial bug that appeared with the DeepMind-style board representation. When you flip the board to orient the features to the perspective of the current player... then the final NN mapping onto the policy vector must be flipped back as well!? Furthermore, it would be necessary to "preemptively" flip the visit-count information before feeding it into the neural network, e.g. during the convert_to_training_data step.
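For illustration only, a sketch of the kind of flip being discussed, assuming a 4096-dim policy indexed by from_square * 64 + to_square over 0-63 squares (the repo's actual move encoding may differ):

```python
import numpy as np

def flip_square(sq):
    """Mirror a 0-63 square index vertically (a1 <-> a8), keeping the file."""
    return sq ^ 56

def flip_policy(policy):
    """Re-index a 4096-dim (from, to) policy so it matches the flipped board."""
    flipped = np.zeros_like(policy)
    for frm in range(64):
        for to in range(64):
            flipped[flip_square(frm) * 64 + flip_square(to)] = policy[frm * 64 + to]
    return flipped
```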

@benediamond

Also, why did you remove the manual annealing of the learning rate?

@Akababa
Owner Author

Akababa commented Dec 17, 2017

Hey @benediamond, thanks for commenting!
Btw, I'm pushing more optimizations to https://github.com/Akababa/chess-alpha-zero/blob/opts/src/chess_zero/agent/player_chess.py now. It looks like it's working well.

I didn't train for 100,000 steps anyway (the first point at which the lr changes), so it doesn't really matter; I was just experimenting with different optimizers. At the scale of our testing, Google's annealing schedule isn't really applicable.

@Akababa
Owner Author

Akababa commented Dec 17, 2017

Yes, I think I did the flipping correctly here, but I would really appreciate it if you could take a quick look to see whether it checks out from your point of view.

If you have code you'd like a sanity check on too I'd be happy to help out :)

@Zeta36

Zeta36 commented Dec 17, 2017

@benediamond:

When you flip the board to orient the features to the perspective of the current player... then the final NN mapping onto the policy vector must be flipped back as well!?

Well, this is really a problem we didn't take into account. Certainly it may be the cause of the convergence failure.

@benediamond

Yes. I can't believe we didn't think of this. @Akababa, kudos! I'll be looking through your code and making sense of everything. I'll let you know how things work.

@Akababa
Owner Author

Akababa commented Dec 17, 2017

Yeah, that's always a worry in the back of my mind (hence the paranoid asserts). I'm a little confused by the conversation though: has a bug already been found in my code, or is it a previous one from before my implementation?

@benediamond Thank you! Please feel free to write some test cases and sanity checks.

@benediamond

@Akababa The point is that I had developed a "DeepMind-style" feature plane input on my own, but I hadn't realized (as you did) that the policy vector needed to be flipped for black. @Zeta36 and I were wondering why it didn't converge. I'll be updating it accordingly as soon as possible.

@Akababa
Owner Author

Akababa commented Dec 17, 2017

Doesn't the DeepMind input actually use an extra plane to encode the side to move?

The main reason I did this was for fun, but it might also make the network train faster, as I believe it's strictly better than keeping the color plane and using this transformation to augment the training data.

@benediamond

benediamond commented Dec 17, 2017

Yes, they did. You can see my current approach here.

Here is another quick question. It appears that you clear the move tables at the beginning of each call to action. Yet isn't this contrary to the DeepMind approach, where, as they say, the non-chosen portion of the tree is discarded after each move but the chosen portion is kept? Here, we will have to build visit counts from scratch each time a new move is chosen. Previously, memory was released only at the end of the game (when self.white and self.black are reassigned in self_play.py).

Also, what do you mean by "losing MCTS nodes...!"?
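(For reference, here's a minimal sketch of the subtree-reuse idea under discussion, assuming a hypothetical tree where each node keeps its children in a dict keyed by move; this is not the repo's actual data structure.)

```python
def advance_root(root, chosen_move):
    """Keep the subtree under the chosen move and discard its siblings.

    Returns the new root with its accumulated visit counts intact,
    or None if the chosen move was never expanded during search.
    """
    new_root = root.children.get(chosen_move)
    root.children.clear()  # drop references to the discarded siblings
    return new_root
```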

@Akababa
Owner Author

Akababa commented Dec 17, 2017

But why is there a need to flip the policy if you are feeding in the side to move?
Yes, that was before I read that part of the paper, but even then I'm not sure how move counts from previous transpositions would affect the table. In any case, I'm mostly doing this as a "functional" approach to make results and bugs reproducible and to make things easier to reason about for now.

@benediamond

Hmm, I see what you're saying. But that would be much harder for the network, no? The entire mapping from the convolved input stack to the policy vector would have to be re-learned from scratch for black, in a way that is a totally arbitrary scrambling of the first. At that point, there is no reason to place the side to move on top of the stack, orient the board from the player's perspective, etc... Right?

@Akababa
Owner Author

Akababa commented Dec 17, 2017

That might be true, especially at the beginning, before the model has had a chance to learn the rules of chess.

However, I think we are doing something similar with the flattened policy output layer anyway. Google's paper does mention that the final result (between the flattened policy and the huge stack of one-hot layers) was the same, but training was slightly slower with the "compressed" format, which for us, with our 0 TPUs, probably means we won't see significant results from scratch for a while.

One thing I considered doing is having two 64-unit FC outputs for the from-square and to-square (and maybe ignoring underpromotions for now); it might be a little easier for the network to use. BUT I don't know whether this would output a sensible probability distribution with respect to softmax and the ranking of chess moves.
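To make that idea concrete, here's a rough Keras sketch (toy body and hypothetical names, not part of the repo). Note that each head is an independent softmax over 64 squares, so the joint move distribution is only a product of two marginals, which is exactly the ranking concern above.

```python
from keras.layers import Conv2D, Dense, Flatten, Input
from keras.models import Model

board_input = Input(shape=(8, 8, 18))  # e.g. 18 feature planes
x = Conv2D(64, 3, padding="same", activation="relu")(board_input)
x = Flatten()(x)

# Separate softmax heads for the from-square and the to-square.
from_head = Dense(64, activation="softmax", name="from_square")(x)
to_head = Dense(64, activation="softmax", name="to_square")(x)
value_head = Dense(1, activation="tanh", name="value")(x)

model = Model(inputs=board_input, outputs=[from_head, to_head, value_head])
```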

@benediamond

By the way, do you know what the "alternative" to the flat approach is? I can't figure out what the "non-flat" approach they're referring to is.

@Akababa
Owner Author

Akababa commented Dec 17, 2017

Yeah I agree that's unclear. I don't even know how they came up with 4629 possible moves.

@benediamond

4672 comes from their 73×8×8 move representation, as described in the arXiv paper. They also mention that they tried various approaches, which all worked well.
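For reference, the plane count breaks down like this (per the arXiv paper):

```python
# 73 move-type planes, stacked over the 8x8 grid of from-squares.
queen_moves = 8 * 7       # 8 directions, up to 7 squares each
knight_moves = 8
underpromotions = 3 * 3   # knight/bishop/rook x {capture-left, push, capture-right}
planes = queen_moves + knight_moves + underpromotions  # 73
print(8 * 8 * planes)     # 4672
```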

@Akababa
Owner Author

Akababa commented Dec 17, 2017

Yeah, my impression is that anything we understand won't matter anyway :) All we can do is ensure the inputs and outputs are correct and pray for the best.

BTW, are you able to access their Nature paper? If not, I got it from my university and can send it to you if you want.

@benediamond

On line 21 of your player_chess.py, you reference asyncio despite having deleted the import. Is this intentional?

@Akababa
Owner Author

Akababa commented Dec 17, 2017

Try checking the new branch, I removed that part and optimized a lot of other stuff.

@benediamond

Didn't see that, thanks.

@Akababa
Owner Author

Akababa commented Dec 17, 2017

@benediamond sorry, I didn't see your other comment. I think Python passes a reference to the [] into the prediction queue, so it's all good. You can uncomment the
#logger.debug(f"predicting {len(item_list)} items")
line to verify for yourself.
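(A tiny stand-alone illustration of the reference semantics being relied on here; toy names, not the repo's actual queue.)

```python
from queue import Queue

prediction_queue = Queue()
item_list = []

prediction_queue.put(item_list)    # the queue stores a reference, not a copy
item_list.append("leaf position")  # visible to whoever drains the queue later

assert prediction_queue.get() is item_list
assert len(item_list) == 1
```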

@benediamond

Yes, indeed! I deleted it because I figured that out myself just after posting. Thanks.

@Akababa
Owner Author

Akababa commented Dec 17, 2017

By the way, I'm brainstorming a list of ways to fix the draws-by-repetition problem; hopefully we can figure this one out.

@benediamond

Hi @Akababa, the one thing that seemed to affect this most strongly for me was change_tau_turn. I would first try setting this value to a very large number (1000, etc.), so that tau never drops.

I've also experimented with a slowly (exponentially) decaying tau.

Using either of these two, I could essentially eliminate draws by repetition.

@Akababa
Owner Author

Akababa commented Dec 17, 2017

Thanks, if that works it's a much nicer solution than the stuff I came up with. Did you let tau=e^{-0.01*turn} ?

@benediamond

benediamond commented Dec 17, 2017

Yes, essentially. I replaced the parameter change_tau_turn with tau_decay_rate; 0.99 was a good value (very close to e^{-0.01}, lol). Then set

tau = np.power(tau_decay_rate, turn)
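For context, here's a sketch of how such a tau would typically enter the root move selection, with pi(a) proportional to N(a)^(1/tau); names are hypothetical, not the repo's exact code.

```python
import numpy as np

def move_probabilities(visit_counts, turn, tau_decay_rate=0.99):
    """Turn root visit counts into move-selection probabilities with a decaying tau."""
    tau = np.power(tau_decay_rate, turn)
    counts = np.asarray(visit_counts, dtype=np.float64)
    # pi(a) ~ N(a)^(1/tau); once tau gets tiny, switch to a plain argmax
    # to avoid numerical overflow.
    pi = counts ** (1.0 / tau)
    return pi / pi.sum()
```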

@benediamond

My TensorFlow install was broken by the latest CUDA update, so it'll be a bit before I can get it working again.

@Akababa
Owner Author

Akababa commented Dec 18, 2017

What version? I'm on CUDA 8 and cuDNN 6.

@benediamond

I've got CUDA 9.1 and cuDNN 7.0.5. Still no luck.

@Akababa
Owner Author

Akababa commented Dec 18, 2017

I just used the old versions from the TF site. Are the new ones faster?

@benediamond

Since my machine runs CUDA 9(.1), TF with GPU won't work out of the box. Rather than attempt to downgrade, I just built from source. That proved to be a good idea, until recently.

@benediamond

As for speed, I'm not sure, but using the new versions couldn't hurt, I think.

@apollo-time

Multithreading isn't truly parallel in Python due to the GIL, is it?

@Akababa
Owner Author

Akababa commented Dec 25, 2017

No, unfortunately not :( But locks are still faster than asyncio with an event loop.
