#include "probfd/preprocessing/end_component_decomposition.h"

Description

template<typename State, typename Action>
class probfd::preprocessing::EndComponentDecomposition< State, Action >

A builder class that implements end component decomposition.

An (non-trivial) end component is a sub-MDP $M = (S, E)$ where $S$ is a subset of states and $E : S \to 2^A $ is a mapping from states to a non-empty set of enabled actions. $M$ must satisfy:

The directed graph induced by $M$ is strongly connected.
$E(s) \subseteq A(s)$ for every state $s \in S$.
$M$ is closed under probabilistic branching. If $s \in S$, $a \in E(s)$ and $\Pr(t \mid s, a) > 0$, then $t \in S$.

A zero EC is an end component which only contains actions that have a cost of zero. Furthermore, a maximal end component (MEC) is an end component that is maximal with respect to set inclusion.

This algorithm computes all maximal zero ECs of the MDP in time O(m*n), where $m$ is the number of edges and $n$ is the number of nodes in the induced graph of the MDP. The returned quotient collapses all zero ECs into a single state.

The existence of zero ECs prevents ordinary algorithms based on value iteration from converging against the optimal value function, unless a pessimistic heuristic is used to initialize the value function. However, the end component decomposition quotient has no zero ECs, therefore standard value iteration algorithms converge with any initialization of the value function. Since every state in a zero EC has the same optimal state value, it easy to extendt the final potimal value function for the quotient to the optimal value function for the original MDP.

See also: algorithms::interval_iteration::IntervalIteration

Template Parameters

State	- The state type of the underlying state space.
Action	- The action type of the underlying state space.

Public Member Functions
std::unique_ptr< QSystem >	build_quotient_system (MDPType &mdp, const EvaluatorType *pruning_function, param_type< State > initial_state, double max_time=std::numeric_limits< double >::infinity())
	Build the quotient of the MDP with respect to the maximal end components.

Member Function Documentation

◆ build_quotient_system()

template<typename State , typename Action >

auto probfd::preprocessing::EndComponentDecomposition< State, Action >::build_quotient_system	(	MDPType &	mdp,
		const EvaluatorType *	pruning_function,
		param_type< State >	initial_state,
		double	max_time = std::numeric_limits<double>::infinity() )

Build the quotient of the MDP with respect to the maximal end components.

Only the fragment of the MDP that is reachable from the given initial state is considered.

Description

Public Member Functions

Member Function Documentation

◆ build_quotient_system()