Foundations of Artificial Intelligence (FAI) Group


Seminar: Trusted AI Planning (TAIP)


Basics. Seminar, 7 graded ECTS points.

The seminar will be run in a block format. There will be an initial meeting on Wednesday, October 22, 2025, 16:00-17:30. All student presentations will be given on Thursday, February 19, 2026.

All meetings will take place in room 3.06, Building E1 1. The seminar language is English throughout.

Supervisors for the seminar are Jörg Hoffmann and Daniel Höller. Additional feedback will be provided by Jan Eisenhut, Chaahat Jain, Songtuan Lin, and Marcel Vinzent.

Your task will be to read and understand a piece of research, to write a summary paper on it in your own words, to give a presentation, and to provide detailed feedback on the paper and presentation of a fellow student.

All email interaction must be prefixed with "[TAIP-25]" in the email subject.

No plagiarism. It is OK (and encouraged!) to use web resources to further your understanding of your assigned topic. However, it is inadmissible to copy pieces of such material into your summary paper or presentation. Any plagiarism will result in disqualification from the seminar. You are allowed to include pieces (like formal definitions, tables of empirical results, or figures) from the paper you are summarizing; however, you must clearly and explicitly mark such material as being taken from the paper.


Content. AI Planning is the sub-area of AI concerned with complex action-choice problems. The seminar covers methods supporting trust in algorithmic solutions to such problems. This field of research is very recent, and we cover research conducted in the FAI group. A major concern is neural action policies, i.e., neural networks that map states to actions. While such policies can be very performant, they are fundamentally opaque and come without any guarantees. We cover methods for verifying, testing, and re-training such policies.
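
For illustration, here is a minimal sketch of such a neural action policy (Python/PyTorch; the state encoding and number of actions are invented for this example, and the seminar papers each define their own setups):

    import torch
    import torch.nn as nn

    # A neural action policy: a network mapping a state (here an invented
    # 8-dimensional feature vector) to a score for each of 4 actions.
    policy = nn.Sequential(
        nn.Linear(8, 32),
        nn.ReLU(),
        nn.Linear(32, 4),
    )

    state = torch.rand(8)                    # some encoded state
    action = policy(state).argmax().item()   # the policy picks the highest-scoring action
    print("chosen action:", action)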


Prerequisites. Participants must have successfully completed an edition of either the Artificial Intelligence core course or the AI Planning specialized course.


Registration. Registration is via the central seminar registration system.


Grading. The final grading will be based, in this order of importance, on:

  1. the quality of your summary paper,
  2. the quality of your presentation,
  3. the quality of the feedback you provide to your fellow student.


Summary Paper. For the summary paper, you must use this TeX template. You are required to read at least two related papers for the related work section. You are allowed to modify the section structure given in the template if, for whatever reason, that is more appropriate for the work you are summarizing.

The seminar paper should be about 4 pages long (in the double-column format of the template, not counting the literature list). This is a rough guideline, not a strict rule. If you need, say, 5-6 pages to do your paper justice, then definitely do so.


Schedule and Deadlines (tentative!).

  - Initial meeting: Wednesday, October 22, 2025, 16:00-17:30.
  - Student presentations: Thursday, February 19, 2026.


Topics. Each participant will be assigned one topic. Most topics consist of a single paper; one consists of two short papers. The overall amount and difficulty of the material associated with each topic are roughly balanced.

Note that each topic is associated with a mentee student (to whom you will provide feedback) and a mentor student (who will provide feedback to you); see the schedule and deadlines above. The mentee/mentor assignment is a "cycle" through the topics as listed below: let k be the number of topics; then the mentor -> mentee relation is i -> i+1 for i < k, and k -> 1 (see the sketch below). If you want to team up with someone specific, please do state that in your email.
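
For illustration, a minimal Python sketch of this cyclic assignment (the student names are placeholders; the actual assignment is made by the supervisors):

    # Minimal sketch of the cyclic mentor -> mentee assignment described above.
    # students[i] is the student assigned to topic i+1; names are invented.
    def mentor_mentee_pairs(students):
        k = len(students)
        # Student on topic i mentors the student on topic i+1; topic k wraps to topic 1.
        return [(students[i], students[(i + 1) % k]) for i in range(k)]

    for mentor, mentee in mentor_mentee_pairs(["A", "B", "C", "D"]):
        print(mentor, "->", mentee)   # A->B, B->C, C->D, D->A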

Area 1: Open-Loop Verification (supervisor: Jörg Hoffmann)

  1. Guy Katz, Clark W. Barrett, David L. Dill, Kyle Julian, Mykel J. Kochenderfer: Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks. CAV (1) 2017: 97-117. Paper: Area 1 Topic 1. Feedback giver: Marcel Vinzent.
  2. Haoze Wu, Aleksandar Zeljic, Guy Katz, Clark W. Barrett: Efficient Neural Network Analysis with Sum-of-Infeasibilities. TACAS (1) 2022: 143-163. Paper: Area 1 Topic 2. Feedback giver: Marcel Vinzent.
  3. Gagandeep Singh, Timon Gehr, Markus Püschel, Martin T. Vechev: An abstract domain for certifying neural networks. Proc. ACM Program. Lang. 3(POPL): 41:1-41:30 (2019). Paper: Area 1 Topic 3. Feedback giver: Marcel Vinzent.
  4. Laurens Devos, Lorenzo Cascioli, Jesse Davis: Robustness Verification of Multi-Class Tree Ensembles. AAAI 2024: 21019-21028. Paper: Area 1 Topic 4. Feedback giver: Marcel Vinzent.
Area 2: Closed-Loop Verification (supervisor: Daniel Höller)
  1. Marcel Vinzent, Marcel Steinmetz, Jörg Hoffmann: Neural Network Action Policy Verification via Predicate Abstraction. ICAPS 2022: 371-379. Paper: Area 2 Topic 1. Feedback giver: Chaahat Jain.
  2. Marcel Vinzent, Siddhant Sharma, Jörg Hoffmann: Neural Policy Safety Verification via Predicate Abstraction: CEGAR. AAAI 2023: 15188-15196. Paper: Area 2 Topic 2. Feedback giver: Chaahat Jain.
  3. Chaahat Jain, Lorenzo Cascioli, Laurens Devos, Marcel Vinzent, Marcel Steinmetz, Jesse Davis, Jörg Hoffmann: Safety Verification of Tree-Ensemble Policies via Predicate Abstraction. ECAI 2024: 1189-1197. Paper: Area 2 Topic 3. Feedback giver: Chaahat Jain.
  4. Edoardo Bacci, Mirco Giacobbe, David Parker: Verifying Reinforcement Learning up to Infinity. IJCAI 2021: 2154-2160. Paper: Area 2 Topic 4. Feedback giver: Chaahat Jain.
Area 3: Testing (supervisor: Daniel Höller)
  1. Hasan Ferit Eniser, Timo P. Gros, Valentin Wüstholz, Jörg Hoffmann, Maria Christakis: Metamorphic relations via relaxations: an approach to obtain oracles for action-policy testing. ISSTA 2022: 52-63. Paper: Area 3 Topic 1. Feedback giver: Jan Eisenhut.
  2. Jan Eisenhut, Álvaro Torralba, Maria Christakis, Jörg Hoffmann: Automatic Metamorphic Test Oracles for Action-Policy Testing. ICAPS 2023: 109-117. Paper: Area 3 Topic 2. Feedback giver: Jan Eisenhut.
  3. Maria Christakis, Hasan Ferit Eniser, Jörg Hoffmann, Adish Singla, Valentin Wüstholz: Specifying and Testing k-Safety Properties for Machine-Learning Models. IJCAI 2023: 4748-4757. Paper: Area 3 Topic 3. Feedback giver: Jan Eisenhut.
  4. C. Jain, D. Sherbakov, M. Vinzent, M. Steinmetz, J. Davis, J. Hoffmann: Policy Safety Testing in Non-Deterministic Planning: Fuzzing, Test Oracles, Fault Analysis. ECAI 2025. Feedback giver: Jan Eisenhut.
Area 4: DSMC, Re-Training, Repair (supervisor: Jörg Hoffmann)
  1. Timo P. Gros, Holger Hermanns, Jörg Hoffmann, Michaela Klauck, Marcel Steinmetz: Analyzing neural network behavior through deep statistical model checking. Int. J. Softw. Tools Technol. Transf. 25(3): 407-426 (2023). Paper: Area 4 Topic 1. Feedback giver: Songtuan Lin.
  2. Timo P. Gros, Daniel Höller, Jörg Hoffmann, Michaela Klauck, Hendrik Meerkamp, Verena Wolf: DSMC Evaluation Stages: Fostering Robust and Safe Behavior in Deep Reinforcement Learning. QEST 2021: 197-216. Paper: Area 4 Topic 2. Feedback giver: Songtuan Lin.
  3. J. Eisenhut, D. Fišer, I. Valera, J. Hoffmann: On Picking Good Policies: Leveraging Action-Policy Testing in Policy Training. ICAPS 2025; and
     H. Eniser, S. Lin, A. Isychev, N. Müller, V. Wüstholz, I. Valera, J. Hoffmann, M. Christakis: Using Action-Policy Testing in RL to Reduce the Number of Bugs. SoCS 2025. Feedback giver: Songtuan Lin.
  4. L. Cascioli, C. Jain, M. Steinmetz, J. Davis, J. Hoffmann: Safety Debugging of Tree-Ensemble Action Policies in AI Planning: From Fault Detection to Fault Fixing. Under review for AAAI 2026. Feedback giver: Songtuan Lin.