AI 24/25 Project Software
Documentation for the AI 24/25 course programming project software
Loading...
Searching...
No Matches
epsilon_greedy_open_list.h
1#ifndef OPEN_LISTS_EPSILON_GREEDY_OPEN_LIST_H
2#define OPEN_LISTS_EPSILON_GREEDY_OPEN_LIST_H
3
4#include "downward/open_list_factory.h"
5
6/*
7 Epsilon-greedy open list based on Valenzano et al. (ICAPS 2014).
8
9 With probability epsilon the next entry is selected uniformly
10 randomly, otherwise the minimum entry is chosen. While the original
11 implementation by Valenzano et al. is based on buckets (personal
12 communication with the authors), this implementation stores entries
13 in a heap. It is usually desirable to let open lists break ties in
14 FIFO order. When using a heap, this can be achieved without using
15 significantly more time by assigning increasing IDs to new entries
16 and using the IDs as tiebreakers for entries with identical values.
17 On the other hand, FIFO tiebreaking induces a substantial worst-case
18 runtime penalty for bucket-based implementations. In the table below
19 we list the worst-case time complexities for the discussed
20 implementation strategies.
21
22 n: number of entries
23 m: number of buckets
24
25 Buckets Buckets (no FIFO) Heap
26 Insert entry O(log(m)) O(log(m)) O(log(n))
27 Remove random entry O(m + n) O(m) O(log(n))
28 Remove minimum entry O(log(m)) O(log(m)) O(log(n))
29
30 These results assume that the buckets are implemented as deques and
31 are stored in a sorted dictionary, mapping from evaluator values to
32 buckets. For inserting a new entry and removing the minimum entry the
33 bucket-based implementations need to find the correct bucket
34 (O(log(m))) and can then push or pop from one end of the deque
35 (O(1)). For returning a random entry, bucket-based implementations
36 need to loop over all buckets (O(m)) to find the one that contains
37 the randomly selected entry. If FIFO ordering is ignored, one can use
38 swap-and-pop to remove the entry in constant time. Otherwise, the
39 removal is linear in the number of entries in the bucket (O(n), since
40 there could be only one bucket).
41*/
42
43namespace epsilon_greedy_open_list {
44class EpsilonGreedyOpenListFactory : public OpenListFactory {
45 std::shared_ptr<Evaluator> eval;
46 double epsilon;
47 int random_seed;
48 bool pref_only;
49
50public:
51 EpsilonGreedyOpenListFactory(
52 const std::shared_ptr<Evaluator>& eval,
53 double epsilon,
54 int random_seed,
55 bool pref_only);
56
57 virtual std::unique_ptr<StateOpenList> create_state_open_list() override;
58 virtual std::unique_ptr<EdgeOpenList> create_edge_open_list() override;
59};
60} // namespace epsilon_greedy_open_list
61
62#endif