Planning is one of the earliest application domains of artificial intelligence. A great deal of work has gone into representing planning problems in a standard form, so that the same representation of a problem can be used with multiple planners. There are a number of planning languages – an early example is the representation used in STRIPS, a planning system developed in the early seventies [strips]. More modern examples include the following (a short illustrative sketch of this style of representation is given after the list):

  • PDDL (Planning Domain Definition Language) [pddl, pddl3.1];
  • PPDDL (Probabilistic PDDL) for stochastic planning problems [ppddl];
  • RDDL (Relational Dynamic Influence Diagram Language), a more flexible language for stochastic planning problems, related to dynamic Bayesian networks and Markov decision processes [rddl];
  • HDDL (Hierarchical Domain Definition Language) for hierarchical planning problems [hddl].

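As a rough illustration of what these languages capture, the sketch below encodes a tiny planning problem in plain Python using the STRIPS-style idea of actions with preconditions, add effects and delete effects. The robot-and-rooms example, and all names in it, are invented for illustration; this is not the syntax of any of the languages above.

```python
# A minimal STRIPS-style encoding of a planning problem in plain Python.
# Hypothetical example: a robot that can move between two rooms. States
# are frozensets of ground facts; each action has preconditions, an add
# list and a delete list, which is the core idea behind STRIPS and PDDL.

from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset

    def applicable(self, state):
        # An action can be taken when all its preconditions hold.
        return self.preconditions <= state

    def apply(self, state):
        # Successor state: remove deleted facts, then add new ones.
        return (state - self.del_effects) | self.add_effects

initial_state = frozenset({("robot-at", "room1")})
goal = frozenset({("robot-at", "room2")})

actions = [
    Action(
        name="move(room1, room2)",
        preconditions=frozenset({("robot-at", "room1")}),
        add_effects=frozenset({("robot-at", "room2")}),
        del_effects=frozenset({("robot-at", "room1")}),
    ),
]
```
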
A number of planners are available for some of these languages (often submitted as entries to planning competitions), while other languages are not as well supported.

Approaches to planning fall into several different families [auto_planning]:

  • State-space search;
  • Plan-space search and hierarchical task network (HTN) planning;
  • Graphplan.

While state-space planning is probably the conceptually simplest approach, other methods often support more advanced techniques, such as the creation of parametric plans, hierarchical planning, and more.
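
To make the state-space view concrete, here is a minimal forward (progression) breadth-first search over the STRIPS-style representation sketched above. It is a conceptual sketch only; real planners rely on heuristics, pruning and much more efficient data structures.

```python
from collections import deque

def forward_search(initial_state, goal, actions):
    """Breadth-first forward (progression) search in the space of states.

    Returns a list of action names reaching a state that satisfies the
    goal, or None if no plan exists. Purely illustrative: real planners
    use heuristics, pruning and far more compact data structures.
    """
    frontier = deque([(initial_state, [])])
    visited = {initial_state}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:
            return plan
        for action in actions:
            if action.applicable(state):
                next_state = action.apply(state)
                if next_state not in visited:
                    visited.add(next_state)
                    frontier.append((next_state, plan + [action.name]))
    return None

# On the toy problem above: forward_search(initial_state, goal, actions)
# returns ["move(room1, room2)"].
```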

Furthermore, there are connections between planning and many other fields, such as control (e.g. model-predictive control) and various areas of machine learning. One such area is reinforcement learning, where model-based agents use a model and planning in various ways to obtain better estimates of the effects that taking a particular action is likely to have in a particular state. Conversely, reinforcement learning can also be viewed as a component of planners: deep reinforcement learning can automatically construct value functions from observed data, and these can then be used in place of a heuristic function to estimate how good particular actions or states are, without having to plan all the way to a terminal state. This approach has been used, for example, in DeepMind’s game-playing agents AlphaGo, AlphaZero and others [alphago, alphazero].
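
As a rough sketch of that second direction (a learned value function standing in for a heuristic), the function below picks the applicable action whose successor state looks best according to an estimated value, rather than planning all the way to a terminal state. The value_estimate argument is a stand-in for whatever learned model supplies the estimates, e.g. a trained network; it is not an actual AlphaGo or AlphaZero component, and the Action interface is the toy one sketched earlier.

```python
def choose_action(state, actions, value_estimate):
    """Pick the applicable action whose successor state looks best.

    `value_estimate` maps a state to a scalar estimate of how good that
    state is (e.g. the output of a trained network). It replaces both a
    hand-crafted heuristic and a full search to terminal states.
    Illustrative only; uses the toy Action interface sketched earlier.
    """
    applicable = [a for a in actions if a.applicable(state)]
    if not applicable:
        return None
    return max(applicable, key=lambda a: value_estimate(a.apply(state)))
```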

Literature

  1. [strips] Fikes, R.E. and Nilsson, N.J., 1971. STRIPS: A new approach to the application of theorem proving to problem solving. Artificial intelligence, 2(3-4), pp.189-208.
  2. [pddl] Ghallab, M., Howe, A., Knoblock, C., McDermott, D., Ram, A., Veloso, M., Weld, D. and Wilkins, D., 1998. PDDL – The Planning Domain Definition Language. Technical Report CVC TR-98-003/DCS TR-1165, Yale Center for Computational Vision and Control.
  3. [pddl3.1] Kovacs, D.L., 2011. BNF definition of PDDL 3.1. Unpublished manuscript from the IPC-2011 website.
  4. [ppddl] Younes, H.L. and Littman, M.L., 2004. PPDDL1.0: The language for the probabilistic part of IPC-4. In Proc. International Planning Competition.
  5. [rddl] Sanner, S., 2010. Relational Dynamic Influence Diagram Language (RDDL): Language description. Unpublished manuscript, Australian National University.
  6. [hddl] Höller, D., Behnke, G., Bercher, P., Biundo, S., Fiorino, H., Pellier, D. and Alford, R., 2019. HDDL – A Language to Describe Hierarchical Planning Problems. arXiv preprint arXiv:1911.05499.
  7. [auto_planning] Ghallab, M., Nau, D. and Traverso, P., 2004. Automated Planning: theory and practice. Elsevier.
  8. [alphago] Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M. and Dieleman, S., 2016. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), pp.484-489.
  9. [alphazero] Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., Graepel, T. and Lillicrap, T., 2017. Mastering chess and shogi by self-play with a general reinforcement learning algorithm. arXiv preprint arXiv:1712.01815.