Karl Tuyls, Francesco Delle Fave, Joris Mooij, Reyhan Aydoğan, and many others.
Diederik M. Roijers and Shimon Whiteson
April 2017
Table of Abbreviations
Abbreviation | Full Name | Location |
AOLS | approximate optimistic linear support | Algorithm 5.10, Section 5.5 |
CCS | convex coverage set | Definition 3.7, Section 3.2.2 |
CH | convex hull | Definition 3.6, Section 3.2.2 |
CHVI | convex hull value iteration | Section 4.3.2 |
CLS | Cheng’s linear support | Section 5.3 |
CMOVE | convex multi-objective variable elimination | Section 4.2.3 |
CoG | coordination graph | Definition 2.4, Section 2.2.1 |
CS | coverage set | Definition 3.5, Section 3.2 |
f | scalarization function | Definition 1.1, Section 1.1 |
MDP | Markov decision process | Definition 2.6, Section 2.3.1 |
MO-CoG | multi-objective coordination graph | Definition 2.5, Section 2.2.2 |
MODP | multi-objective decision problem | Definition 2.2, Section 2.1 |
MOMDP | multi-objective Markov decision process | Definition 2.8, Section 2.3.2 |
MORL | multi-objective reinforcement learning | Chapter 6 |
MOVE | multi-objective variable elimination | Algorithm 4.5, Section 4.2.3 |
MOVI | multi-objective value iteration | Section 4.3.2 |
OLS | optimistic linear support | Algorithm 5.8, Section 5.3 |
OLS-R | optimistic linear support with reuse | Algorithm 5.11, Section 5.6 |
PMOVI | Pareto multi-objective value iteration | Section 4.3.2 |
PCS | Pareto coverage set | Definition 3.11, Section 3.2.4 |
PMOVE | Pareto multi-objective variable elimination | Section 4.2.3 |
POMDP | partially observable Markov decision process | Section 5.2.1 |
PF | Pareto front | Definition 3.10, Section 3.2.4 |
SODP | single-objective decision problem | Definition 2.1, Section 2.1 |
U | undominated set | Definition 3.4, Section 3.2 |
VE | variable elimination | Algorithm 4.4, Section 4.2.1 |
VELS | variable elimination linear support | Section 5.7 |
VI | value iteration | Section 4.3.1 |
Vπ | value vector of a policy π | Definition 2.2, Section 2.1 |
Π | a set of allowed policies | Definition 2.1, Section 2.1 |
≻P | Pareto dominance relation | Definition 3.3, Section 3.1.2 |
CHAPTER 1
Introduction
Many real-world decision problems are so complex that they cannot be solved by hand. In such cases, autonomous agents that reason about these problems automatically can provide the necessary support for human decision makers. An agent is “anything that can be viewed as perceiving its environment through sensors and acting upon that environment through effectors” [Russell et al., 1995]. An artificial agent is typically a computer program, possibly embedded in specific hardware, that takes actions in an environment that changes as a result of these actions. Autonomous agents can act without human control or intervention, on a user’s behalf [Franklin and Graesser, 1997].
Artificial autonomous agents can assist us in many ways. For example, agents can control manufacturing machines to produce products for a company [Monostori et al., 2006, Van Moergestel, 2014], drive a car in place of a human [Guizzo, 2011], trade goods or services on markets [Ketter et al., 2013, Pardoe, 2011], and help ensure security [Tambe, 2011]. As such,