In recent years, the most powerful family of methods in adversarial search has been the one that combines tree search with deep reinforcement learning. A deep neural network can be trained to predict the value of a state: roughly speaking, the chances of winning the game from that state, the expected final score, and so on.
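
To make this concrete, below is a minimal sketch of such a value network in PyTorch. The architecture is purely illustrative and not the one used by AlphaGo or its successors (those use deep convolutional and residual networks): it maps an encoded board position to a single number in (-1, 1), interpreted as the predicted outcome from the perspective of the player to move.

```python
import torch
import torch.nn as nn

# Illustrative value network (hypothetical architecture): maps a board
# encoding with several feature planes to a scalar in (-1, 1), read as
# the expected outcome for the player to move (+1 = win, -1 = loss).
class ValueNetwork(nn.Module):
    def __init__(self, board_size: int = 19, channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(channels * board_size * board_size, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
            nn.Tanh(),  # squash the output to (-1, 1)
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

# Usage: evaluate a batch of (randomly generated, stand-in) positions.
net = ValueNetwork()
states = torch.randn(8, 3, 19, 19)   # 8 positions, 3 feature planes each
values = net(states)                 # shape (8, 1), each value in (-1, 1)
```

In practice such a network is trained on positions labelled with game outcomes, whether taken from human games or generated by self-play.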

Although this had been attempted on a smaller scale before, the method that truly demonstrated the power of this approach was AlphaGo [alphago] from DeepMind: the agent that defeated the Go grandmaster Lee Sedol. Other variants of the approach have been developed since. AlphaZero requires no pre-training on a dataset of human games and has been applied beyond Go, to chess and shogi [alphazero]; MuZero does not even need to be told the rules of the game, but learns a model of them from experience [muzero]. A sketch of how a learned value function plugs into search follows below.
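
The sketch below shows where the learned evaluation enters the search: a depth-limited negamax that consults the value network at leaf positions instead of a handcrafted evaluation function. Note the assumptions: the `Game` interface (`legal_moves`, `apply`, `is_terminal`, `outcome`, `encode`) is hypothetical, `ValueNetwork` is the sketch above, and the actual systems use Monte Carlo tree search rather than plain negamax.

```python
import torch

# Depth-limited negamax with a learned leaf evaluation (illustrative only).
# Assumes `game` follows a hypothetical interface: legal_moves(), apply(),
# is_terminal(), outcome(), encode(); outcome() and the network's value are
# both from the perspective of the player to move, as negamax requires.
def negamax(game, net, depth: int) -> float:
    if game.is_terminal():
        return game.outcome()          # +1 win, -1 loss, 0 draw
    if depth == 0:
        with torch.no_grad():          # leaf: ask the network, don't search deeper
            return net(game.encode().unsqueeze(0)).item()
    best = -float("inf")
    for move in game.legal_moves():
        # the opponent's best value is the negation of ours
        best = max(best, -negamax(game.apply(move), net, depth - 1))
    return best
```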

Literature

  1. [alphago] Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M. and Dieleman, S., 2016. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), pp.484-489.
  2. [alphazero] Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., Graepel, T. and Lillicrap, T., 2017. Mastering chess and shogi by self-play with a general reinforcement learning algorithm. arXiv preprint arXiv:1712.01815.
  3. [muzero] Schrittwieser, J., Antonoglou, I., Hubert, T., Simonyan, K., Sifre, L., Schmitt, S., Guez, A., Lockhart, E., Hassabis, D., Graepel, T. and Lillicrap, T., 2020. Mastering Atari, Go, chess and shogi by planning with a learned model. Nature, 588(7839), pp.604-609.