AI 24/25 Project Software
Documentation for the AI 24/25 course programming project software
|
#include "probfd/preprocessing/end_component_decomposition.h"
A builder class that implements end component decomposition.
An (non-trivial) end component is a sub-MDP \(M = (S, E)\) where \(S\) is a subset of states and \(E : S \to 2^A \) is a mapping from states to a non-empty set of enabled actions. \(M\) must satisfy:
A zero EC is an end component which only contains actions that have a cost of zero. Furthermore, a maximal end component (MEC) is an end component that is maximal with respect to set inclusion.
This algorithm computes all maximal zero ECs of the MDP in time O(m*n), where $m$ is the number of edges and $n$ is the number of nodes in the induced graph of the MDP. The returned quotient collapses all zero ECs into a single state.
The existence of zero ECs prevents ordinary algorithms based on value iteration from converging against the optimal value function, unless a pessimistic heuristic is used to initialize the value function. However, the end component decomposition quotient has no zero ECs, therefore standard value iteration algorithms converge with any initialization of the value function. Since every state in a zero EC has the same optimal state value, it easy to extendt the final potimal value function for the quotient to the optimal value function for the original MDP.
State | - The state type of the underlying state space. |
Action | - The action type of the underlying state space. |
Public Member Functions | |
std::unique_ptr< QSystem > | build_quotient_system (MDPType &mdp, const EvaluatorType *pruning_function, param_type< State > initial_state, double max_time=std::numeric_limits< double >::infinity()) |
Build the quotient of the MDP with respect to the maximal end components. | |
auto probfd::preprocessing::EndComponentDecomposition< State, Action >::build_quotient_system | ( | MDPType & | mdp, |
const EvaluatorType * | pruning_function, | ||
param_type< State > | initial_state, | ||
double | max_time = std::numeric_limits<double>::infinity() ) |