Foundations of Artificial Intelligence (FAI) Group
divider line


Seminar: Heuristic Search for MDPs


Basics. Seminar, 7 graded ECTS points.

The seminar will be run in a block format. There will be an initial meeting on Wednesday, October 22, 16:15--17:45. All student presentations will be given on a single day after the end of term. A detailed schedule is given below.

All meetings will take place in room 3.06, Building E1 1. The seminar language is English throughout.

The seminar supervisors are Prof. Dr. Jörg Hoffmann, Dr. Peter Kissmann, and Álvaro Torralba.

Your task will be to read and understand a piece of research, to write a summary paper in your own words, to give a presentation, and to provide detailed feedback for the paper and presentation of a fellow student.

All email interaction must be pre-fixed with "[MDP14]" in the email subject.

No plagiarism. It is Ok (and encouraged!) to use web resources to further your understanding of your assigned topic. However, it is inadmissible to use pieces of such material for your summary paper or presentation. Any plagiarism will result in disqualification from the seminar.


Content. Heuristic search is a state of the art method for finding solution paths in large deterministic transition systems. But what about probabilistic transition systems? In a Markov decision process (MDP), every action may have several possible outcomes, each occuring with a given probability. A solution then is no longer a path, but rather a policy, i.e. a state--action mapping, providing an action for any state that may be visited while executing the policy. The seminar covers (1) the basic mechanisms that have been developed for applying heuristic search to this much more general setting, (2) the generation of heuristic functions via determinization, and (3) the use of deterministic approximations for action choice in online probabilistic planning.


Prerequisites. Participants should have successfully completed an introductory course in Artificial Intelligence, and should be familiar with the area of planning to the extent of the material covered in the Artificial Intelligence course.

It is not a necessary prerequisite to have completed the Automatic Planning course, although that is of course an advantage.


Registration. Seminar, 7 graded ECTS points. The seminar has 8 participation slots for students. Registration for the seminar will be open from October 1 until October 15 (midnight). Please do not try to register ahead of time; we will only consider applications reaching us within the given time window!

To apply for registration, send an email to Peter Kissmann and Álvaro Torralba. In the email, give a brief description of your relevant background. In particular, say whether you got a BSc and from which university, and describe previous lectures/seminars you completed in the areas of Artificial Intelligence, Planning, and MDPs. Say a few words regarding why you are interested in participating in the seminar.

You will be notified by email on October 22, informing you whether or not you are registered.


Grading. The final grading will be based on:


Summary Paper. For the summary paper, you must use this tex template. Note in particular that you are required to read at least 2 related papers, for the related work section.The seminar paper should be about 4 pages long (not counting the literature list, and in the double-column format of the template). This is a rough guideline, not a strict rule. If you need, say, 5-6 pages to do your paper justice then definitely do so.


Schedule and Deadlines.


Topics. Each participant will be assigned one topic, each of which may consists of one paper (sometimes a part thereof). The overall amount and difficulty of the material associated with each topic is roughly balanced. The topics are distributed across the three different areas mentioned above.

Area 1: Heuristic Search Algorithms for MDPs. (Supervisor: )

  1. LAO*: Heuristic search for MPDs. Paper: Area 1 Topic 1 (excluding section 4)
  2. Labeled Real-Time Dynamic Programming. Paper: Area 1 Topic 2
  3. Bounded Real-Time Dynamic Programming. Paper: Area 1 Topic 3 (excluding section 3)

Area 2: Classical Planning Heuristics for MDPs. (Supervisor: )

  1. Mini-GPT. Paper: Area 2 Topic 1
  2. Basis functions. Paper: Area 2 Topic 2 (excluding sections 4 and 5)
  3. Dead-end detection. Paper: Area 2 Topic 3 (excluding sections 3 and 4)

Area 3: Online Planning. (Supervisor: )

  1. Limited Replanning. Paper: Area 1 Topic 1
  2. Determinization in Hindsight. Paper: Area 1 Topic 2
  3. Finite-horizon lookahead. Paper: Area 1 Topic 3