Sequential Decision Making

Sequential decision making is a class of decision-making problems in which a sequence of decisions is made, each of which influences the state of the system. As in one-time decision making, the goal is to maximize some measure of utility or reward. In the sequential case, however, the aim is to maximize the long-term utility (reward) rather than the utility of any single decision. If we think of decisions as discrete actions, the problem of sequential decision making is equivalent to the problem of discrete control. If we allow actions to be continuous, the problem is equivalent to continuous control.

Long-term vs. Immediate Utility

Maximizing the long-term utility is usually not equivalent to maximizing the immediate utility at each time step. The agent will generally need to forgo the highest-utility decision at some steps in order to maximize the overall long-term utility. For instance, one might need to sacrifice a piece in a chess game in order to win.
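
The gap between immediate and long-term utility can be illustrated with a minimal sketch. The two-step problem below, including its state names, action names and reward values, is entirely hypothetical and chosen only for illustration. A greedy agent that always takes the highest immediate reward is compared with an exhaustive search over all action sequences.

```python
# Hypothetical two-step problem (all names and numbers are illustrative).
# TRANSITIONS[state][action] = (immediate_reward, next_state); None ends the episode.
TRANSITIONS = {
    "start": {"grab": (1.0, "poor"), "sacrifice": (-1.0, "rich")},
    "poor": {"finish": (1.0, None)},
    "rich": {"finish": (10.0, None)},
}


def greedy_return(state):
    """Total reward when always picking the action with the highest immediate reward."""
    total = 0.0
    while state is not None:
        options = TRANSITIONS[state]
        action = max(options, key=lambda a: options[a][0])
        reward, state = options[action]
        total += reward
    return total


def best_return(state):
    """Total reward of the best action sequence, found by exhaustive search."""
    if state is None:
        return 0.0
    return max(r + best_return(nxt) for r, nxt in TRANSITIONS[state].values())


print(greedy_return("start"), best_return("start"))
```

In this toy problem the greedy agent takes "grab" and ends with a lower total than the agent that accepts the immediate loss of "sacrifice", which mirrors the chess example.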

Machine Learning in Sequential Decision Making

There are a number of approaches to the problem that rely in one way or another on artificial intelligence and machine learning methods. Apart from approaches that rely on explicit knowledge representation, machine learning is often used to train models of the system, which are then combined with optimization in approaches such as model predictive control (MPC).
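
A minimal sketch of the idea of combining a learned model with optimization is given below. It rests on several assumptions that are not part of this article: a hypothetical one-dimensional system, a least-squares fit standing in for the machine learning step, a quadratic cost that drives the state to zero, and a brute-force search over short action sequences standing in for the optimizer. Practical MPC implementations use far richer models and solvers.

```python
from itertools import product


def true_dynamics(x, u):
    # Hypothetical system, unknown to the controller: x' = x + 0.5 * u
    return x + 0.5 * u


def fit_gain(samples):
    """Least-squares estimate of b in the assumed model x' = x + b * u."""
    num = sum(u * (x_next - x) for x, u, x_next in samples)
    den = sum(u * u for _, u, _ in samples)
    return num / den


def mpc_action(x, b, horizon=3, actions=(-1.0, 0.0, 1.0)):
    """Return the first action of the sequence with the lowest predicted cost."""
    def cost(seq):
        total, state = 0.0, x
        for u in seq:
            state = state + b * u   # roll the learned model forward
            total += state ** 2     # penalize distance from the target state 0
        return total
    return min(product(actions, repeat=horizon), key=cost)[0]


# Learn the model from a few observed transitions, then control in closed loop.
data = [(x, u, true_dynamics(x, u)) for x, u in [(0.0, 1.0), (1.0, -1.0), (2.0, 1.0)]]
b_hat = fit_gain(data)

x = 2.0
for _ in range(6):
    # Re-plan at every step and apply only the first planned action.
    x = true_dynamics(x, mpc_action(x, b_hat))
```

Re-planning at every step and applying only the first action is what makes the scheme a receding-horizon controller. The planner looks several steps ahead, so it optimizes a longer-term cost instead of a single step.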

Importantly, there is a separate class of machine learning methods that focuses specifically on the problem of sequential decision making: reinforcement learning. The aim of reinforcement learning is to maximize long-term rewards by selecting actions (discrete or continuous) over time. The area of reinforcement learning is closely related to that of bandit problems in that reinforcement learning methods also need to perform exploration and to account for the exploration vs. exploitation trade-off in some way. In that context, one can think of reinforcement learning simply as a bandit problem augmented with state.
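
As one concrete instance, the sketch below uses tabular Q-learning with epsilon-greedy exploration, a standard method described in [suttonRL]. The toy chain environment, the hyperparameter values and all function names are assumptions made for illustration only. The value table is indexed by state as well as action, which is the sense in which the problem resembles a bandit augmented with state.

```python
import random

N_STATES = 5          # hypothetical chain of states 0..4; state 4 is terminal
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Toy environment: reward 1 on reaching the last state, 0 otherwise."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)            # explore: random action
            else:
                best = max(q[state])            # exploit, breaking ties at random
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # The target bootstraps from the best estimated value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [ACTIONS[row.index(max(row))] for row in q[:-1]]


print(greedy_policy(q_learning()))
```

The epsilon parameter controls the exploration vs. exploitation trade-off: with probability epsilon the agent tries a random action, otherwise it exploits its current value estimates.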

Literature

[suttonRL] Sutton, R.S. and Barto, A.G., 2018. Reinforcement Learning: An Introduction. 2nd ed. MIT Press.
[aima] Russell, S.J. and Norvig, P., 2010. Artificial Intelligence: A Modern Approach. 3rd international ed.
