Diederik M. Roijers
University of Oxford
Vrije Universiteit Brussel
Shimon Whiteson
University of Oxford
SYNTHESIS LECTURES ON ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING #34
ABSTRACT
Many real-world decision problems have multiple objectives. For example, when choosing a medical treatment plan, we want to maximize the efficacy of the treatment, but also minimize the side effects. These objectives typically conflict, e.g., we can often increase the efficacy of the treatment, but at the cost of more severe side effects. In this book, we outline how to deal with multiple objectives in decision-theoretic planning and reinforcement learning algorithms. To illustrate this, we employ the popular problem classes of multi-objective Markov decision processes (MOMDPs) and multi-objective coordination graphs (MO-CoGs).
First, we discuss different use cases for multi-objective decision making, and why they often necessitate explicitly multi-objective algorithms. We advocate a utility-based approach to multi-objective decision making, i.e., that what constitutes an optimal solution to a multi-objective decision problem should be derived from the available information about user utility. We show how different assumptions about user utility, and about which types of policies are allowed, lead to different solution concepts, which we outline in a taxonomy of multi-objective decision problems.
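To make the utility-based approach concrete, the following is a minimal sketch (with illustrative notation, not necessarily the book's): a policy \(\pi\) has a value vector \(\mathbf{V}^\pi\) with one expected return per objective, and a scalarization function \(u\) maps it to a scalar utility,

\[
V^\pi_u = u(\mathbf{V}^\pi), \qquad \text{e.g., in the linear case}\quad u(\mathbf{V}^\pi) = \mathbf{w}^\top \mathbf{V}^\pi, \quad w_i \ge 0,\ \textstyle\sum_i w_i = 1.
\]

The taxonomy then turns on how much is known about \(u\), e.g., whether it is linear or only monotonically increasing in each objective, and on which policies are admissible: single versus multiple, deterministic versus stochastic.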
Second, we show how to create new methods for multi-objective decision making using existing single-objective methods as a basis. Focusing on planning, we describe two ways of creating multi-objective algorithms: in the inner loop approach, the inner workings of a single-objective method are adapted to work with multi-objective solution concepts; in the outer loop approach, a wrapper is created around a single-objective method that solves the multi-objective problem as a series of single-objective problems. After discussing the creation of such methods for the planning setting, we discuss how these approaches apply to the learning setting.
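The sketch below illustrates the outer loop idea under the linear-scalarization assumption; the solver name and interface are hypothetical, and a fixed grid of weights stands in for the principled weight-selection scheme (optimistic linear support) developed in Chapter 5:

```python
import numpy as np

def outer_loop(solve_scalarized, weights):
    """Outer-loop wrapper (sketch): solve a series of single-objective
    problems, one per candidate weight vector, and collect the resulting
    (policy, value-vector) pairs as an approximate convex coverage set.

    solve_scalarized(w) -> (policy, value): a hypothetical existing
    single-objective solver applied to the problem after scalarizing the
    reward vectors with weights w; it is assumed to also report the
    multi-objective value vector of the policy it returns.
    """
    solutions = []
    for w in weights:
        policy, value = solve_scalarized(w)
        # Skip exact duplicates; a full implementation would also prune
        # value vectors that are optimal for no weight vector.
        if not any(np.allclose(value, v) for _, v in solutions):
            solutions.append((policy, value))
    return solutions

# Usage sketch for a 2-objective problem: a fixed grid of weight vectors
# on the simplex (optimistic linear support would pick weights adaptively).
weights = [np.array([a, 1.0 - a]) for a in np.linspace(0.0, 1.0, 11)]
# ccs = outer_loop(my_single_objective_solver, weights)  # hypothetical solver
```

The inner loop approach, by contrast, changes the solver itself, e.g., replacing scalar maxima inside the algorithm with operations over sets of value vectors, as Chapter 4 describes.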
Next, we discuss three promising application domains for multi-objective decision making algorithms: energy, health, and infrastructure and transportation. Finally, we conclude by outlining important open problems and promising future directions.
KEYWORDS
artificial intelligence, decision theory, decision support systems, probabilistic planning, multi-agent systems, multi-objective optimization, machine learning
Contents
1 Introduction
2 Multi-Objective Decision Problems
2.2 Multi-Objective Coordination
2.2.1 Single-Objective Coordination Graphs
2.2.2 Multi-Objective Coordination Graphs
2.3 Multi-Objective Markov Decision Processes
2.3.1 Single-Objective Markov Decision Processes
2.3.2 Multi-Objective Markov Decision Processes
3 Taxonomy of Multi-Objective Decision Problems
3.1 Critical Factors
3.1.1 Single vs. Multiple Policies
3.1.2 Linear vs. Monotonically Increasing Scalarization Functions
3.1.3 Deterministic vs. Stochastic Policies
3.2 Solution Concepts
3.2.1 Case #1: Linear Scalarization and a Single Policy
3.2.2 Case #2: Linear Scalarization and Multiple Policies
3.2.3 Case #3: Monotonically Increasing Scalarization and a Single Deterministic Policy
3.2.4 Case #4: Monotonically Increasing Scalarization and a Single Stochastic Policy
3.2.5 Case #5: Monotonically Increasing Scalarization and Multiple Deterministic Policies
3.2.6 Case #6: Monotonically Increasing Scalarization and Multiple Stochastic Policies
3.3 Implications for MO-CoGs
3.4 Approximate Solution Concepts
3.5 Beyond the Taxonomy
4 Inner Loop Planning
4.1 Inner Loop Approach
4.1.1 A Simple MO-CoG
4.1.2 Finding a PCS
4.1.3 Finding a CCS
4.1.4 Design Considerations
4.2 Inner Loop Planning for MO-CoGs
4.2.1 Variable Elimination
4.2.2 Transforming the MO-CoG
4.2.3 Multi-Objective Variable Elimination
4.2.4 Comparing PMOVE and CMOVE
4.3 Inner Loop Planning for MOMDPs
4.3.1 Value Iteration
4.3.2 Multi-Objective Value Iteration
4.3.3 Pareto vs. Convex Value Iteration
5 Outer Loop Planning
5.1 Outer Loop Approach
5.2 Scalarized Value Functions
5.2.1 The Relationship with POMDPs
5.3 Optimistic Linear Support
5.4 Analysis
5.5 Approximate Single-Objective Solvers
5.6