FAI Group Teaching Winter 14-15: Heuristic Search for MDPs

Foundations of Artificial Intelligence (FAI) Group

Seminar: Heuristic Search for MDPs

Basics. Seminar, 7 graded ECTS points.

The seminar will be run in a block format. There will be an initial meeting on Wednesday, October 22, 16:15--17:45. All student presentations will be given on a single day after the end of term. A detailed schedule is given below.

All meetings will take place in room 3.06, Building E1 1. The seminar language is English throughout.

The seminar supervisors are Prof. Dr. Jörg Hoffmann, Dr. Peter Kissmann, and Álvaro Torralba.

Your task will be to read and understand a piece of research, to write a summary paper in your own words, to give a presentation, and to provide detailed feedback for the paper and presentation of a fellow student.

All email interaction must be pre-fixed with "[MDP14]" in the email subject.

No plagiarism. It is Ok (and encouraged!) to use web resources to further your understanding of your assigned topic. However, it is inadmissible to use pieces of such material for your summary paper or presentation. Any plagiarism will result in disqualification from the seminar.

Content. Heuristic search is a state of the art method for finding solution paths in large deterministic transition systems. But what about probabilistic transition systems? In a Markov decision process (MDP), every action may have several possible outcomes, each occuring with a given probability. A solution then is no longer a path, but rather a policy, i.e. a state--action mapping, providing an action for any state that may be visited while executing the policy. The seminar covers (1) the basic mechanisms that have been developed for applying heuristic search to this much more general setting, (2) the generation of heuristic functions via determinization, and (3) the use of deterministic approximations for action choice in online probabilistic planning.

Prerequisites. Participants should have successfully completed an introductory course in Artificial Intelligence, and should be familiar with the area of planning to the extent of the material covered in the Artificial Intelligence course.

It is not a necessary prerequisite to have completed the Automatic Planning course, although that is of course an advantage.

Registration. Seminar, 7 graded ECTS points. The seminar has 8 participation slots for students. Registration for the seminar will be open from October 1 until October 15 (midnight). Please do not try to register ahead of time; we will only consider applications reaching us within the given time window!

To apply for registration, send an email to Peter Kissmann and Álvaro Torralba. In the email, give a brief description of your relevant background. In particular, say whether you got a BSc and from which university, and describe previous lectures/seminars you completed in the areas of Artificial Intelligence, Planning, and MDPs. Say a few words regarding why you are interested in participating in the seminar.

You will be notified by email on October 22, informing you whether or not you are registered.

Grading. The final grading will be based on:

The quality of the feedback you provide to other students.
The quality of your summary write-up.
The ability to stick to deadlines.
The quality of your final presentation.
The interaction during the final presenations.

Summary Paper. For the summary paper, you must use this tex template. Note in particular that you are required to read at least 2 related papers, for the related work section.The seminar paper should be about 4 pages long (not counting the literature list, and in the double-column format of the template). This is a rough guideline, not a strict rule. If you need, say, 5-6 pages to do your paper justice then definitely do so.

Schedule and Deadlines.

October 1--20: Apply for registration.
October 21: Registration notification.
October 22, 16:15--17:45: Initial meeting; brief explanation of the 8 topics; opportunity for you to ask questions.
October 27: Send a ranked top-5 list of the topics you would like to take, by email to Peter Kissmann and Álvaro Torralba. That is, send something like "7,3,8,5,1".
October 29: Receive your assignment: A topic (each of which comes with fixed supervisor respectively supervisors, see the topic list below); a mentee student (to whom you will provide feedback, see the following deadlines); and a mentor student (who will provide feedback to you, see the following deadlines). The mentee/mentor assignment will be a "cycle" through the respective 3 topics.
Read the material associated with your topic carefully, and prepare an initial version of your summary paper, using the tex template given above.
November 15--30: Make an appointment with your supervisors to discuss your paper: Give a brief oral summary of the paper (15--20 minutes); ask questions; do not bring the summary paper (yet).
December 8: Send your summary paper to your mentor student (cc supervisors).
December 22: Send feedback regarding the summary paper to your mentee student (cc supervisors).
January 15: Send revised summary paper to your mentor student (cc supervisors).
January 20, 16:15--17:45: Meeting with everyone: Hints on how to give a good presentation. You can download the slides here.
January 21: Send feedback regarding the revised summary paper to your mentee student (cc supervisors).
January 27: Send presentation slides to mentor student (cc supervisors).
February 3: Send feedback regarding the presentation slides to your mentee student (cc supervisors).
February 10: Send revised presentation slides to mentor student (cc supervisors).
February 15: Send feedback regarding the revised presentation slides to your mentee student (cc supervisors).
February 17: Send summary paper and presentation slides by email to Bärbel Zitzmann, (last name) (at) cs (dot) uni-saarland (dot) de.
February 18: Give a presentation (25 minutes talk, plus 15 minutes discussion) in the block seminar. Attendance to all talks is required. For each talk we will build a panel of "discussants": your mentee and mentor, and one additional student. Each discussant is supposed to ask at least one non-trivial question.

Topics. Each participant will be assigned one topic, each of which may consists of one paper (sometimes a part thereof). The overall amount and difficulty of the material associated with each topic is roughly balanced. The topics are distributed across the three different areas mentioned above.

Area 1: Heuristic Search Algorithms for MDPs. (Supervisor: )

LAO*: Heuristic search for MPDs. Paper: Area 1 Topic 1 (excluding section 4)
Labeled Real-Time Dynamic Programming. Paper: Area 1 Topic 2
Bounded Real-Time Dynamic Programming. Paper: Area 1 Topic 3 (excluding section 3)

Area 2: Classical Planning Heuristics for MDPs. (Supervisor: )

Mini-GPT. Paper: Area 2 Topic 1
Basis functions. Paper: Area 2 Topic 2 (excluding sections 4 and 5)
Dead-end detection. Paper: Area 2 Topic 3 (excluding sections 3 and 4)

Area 3: Online Planning. (Supervisor: )

Limited Replanning. Paper: Area 1 Topic 1
Determinization in Hindsight. Paper: Area 1 Topic 2
Finite-horizon lookahead. Paper: Area 1 Topic 3