Citations of my papers can be found on Google Scholar.


  • M. Roehrl, V. Brandstetter, M. Tokic, T. Runkler, S. Obermayer: Modeling System Dynamics with Physics-Informed Neural Networks Based on Lagrangian Mechanics. To appear in Proceedings of the 21st IFAC World Congress (IFAC 2020), 2020.
  • M. Tokic, A. von Beuningen, C. Tietz, and H.-G. Zimmermann: Handling missing data in recurrent neural networks for air quality forecasting. To appear in Proceedings of the 28th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2020), 2020.
  • M. Bischoff, M. Tokic. Method and device for the computer-aided determination of control parameters for favourable handling of a technical system. Patent: [WO2020002072, EP3587046]


  • M. Tokic, C. Tietz, S. G. Schnitzer and H.-G. Zimmermann: Air Quality Forecast with Recurrent Neural Networks. In Proceedings of the 39th International Symposium on Forecasting, Thessaloniki, Greece, 16/6/2019. [ pdf ]
  • H. Busch, H. Essafi, M. Geipel, O. Heyer, T. Kloss, M. Tokic. Event-based temporal synchronization. Patent: EP3521792
  • S. Depeweg, H. Frank, R. Grothmann, F. Rudolph, V. Sterzing, M. Tokic, S. Vogl. Method for predicting a switching time of a set of signals of signalling facility. Patent: EP3438946


  • D. Hein, S. Depeweg, M. Tokic, S. Udluft, A. Hentschel, T. A. Runkler, and V. Sterzing. A Benchmark Environment Motivated by Industrial Control Problems. IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2017. [ http ]
  • D. Hein, S. Udluft, M. Tokic, A. Hentschel, T. Runkler, and V. Sterzing. Batch Reinforcement Learning on the Industrial Benchmark: First Experiences. In Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN 2017), pages 4214-4221. IEEE Press. [ http ]


  • D. Hein, A. Hentschel, V. Sterzing, M. Tokic, and S. Udluft. Introduction to the “Industrial Benchmark”. CoRR, arXiv:1610.03793 [cs.LG], pages 1-11. 2016. [ pdf sourcecode ]


  • W. Hauptmann, A. Hentschel, C. Otte, V. Sterzing, M. Tokic, S. Udluft, and H.-G. Zimmermann. ALICE: Autonomes Lernen in komplexen Umgebungen. Siemens AG, Munich, 2015. [ http ]


  • M. Tokic. Reinforcement Learning mit adaptiver Steuerung von Exploration und Exploitation. PhD thesis, Universität Ulm, Institut für Neuroinformatik, 2013. [ http ]
  • M. Tokic. Reinforcement Learning: Psychologische und neurobiologische Aspekte. Künstliche Intelligenz, 27(3):213-219, 2013. [ pdf ]
  • M. Tokic, F. Schwenker, and G. Palm. Meta-learning of exploration/exploitation parameters with replacing eligibility traces. In Partially Supervised Learning, volume 8183 of Lecture Notes in Artificial Intelligence, pages 68-79. Springer Berlin / Heidelberg, 2013. [ pdf ]


  • P. Ertle, M. Tokic, R. Cubek, H. Voos, and D. Söffker. Towards learning of safety knowledge from human demonstrations. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2012), pages 5394-5399, Vilamoura, Algarve, Portugal, 2012. IEEE Press. Nominated (1 of 4) for the “New Technology Foundation Award for Entertainment Robots and Systems”. [ pdf ]
  • M. Tokic, P. Ertle, G. Palm, D. Söffker, and H. Voos. Robust exploration/exploitation trade-offs in safety-critical applications. In Proceedings of the 8th International Symposium on Fault Detection, Supervision and Safety of Technical Processes, pages 660-665, Mexico City, Mexico, Aug. 2012. IFAC. [ pdf ]
  • P. Ertle, M. Tokic, B. Tobias, M. Ebel, H. Voos, and D. Söffker. Conceptual design of a dynamic risk-assessment server for autonomous robots. In Proceedings of the 7th German Conference on Robotics, pages 250-254. VDE Verlag, May 2012. [ pdf ]
  • M. Tokic and G. Palm. Adaptive exploration using stochastic neurons. In A. Villa, W. Duch, P. Érdi, F. Masulli, and G. Palm, editors, Artificial Neural Networks and Machine Learning – ICANN 2012, volume 7553 of Lecture Notes in Computer Science, pages 42-49. Springer Berlin / Heidelberg, 2012. [ pdf ]
  • M. Tokic and G. Palm. Gradient algorithms for exploration/exploitation trade-offs: Global and local variants. In N. Mana, F. Schwenker, and E. Trentin, editors, Artificial Neural Networks in Pattern Recognition, volume 7477 of Lecture Notes in Computer Science, pages 60-71. Springer Berlin / Heidelberg, 2012. [ pdf ]
  • M. Tokic and H. Bou Ammar. Teaching reinforcement learning using a physical robot. In Proceedings of the Workshop on Teaching Machine Learning at the 29th International Conference on Machine Learning, pages 1-4, Edinburgh, UK, 2012. [ pdf ]


  • M. Tokic and G. Palm. Value-difference based exploration: Adaptive control between epsilon-greedy and softmax. In J. Bach and S. Edelkamp, editors, KI 2011: Advances in Artificial Intelligence, volume 7006 of Lecture Notes in Artificial Intelligence, pages 335-346. Springer Berlin / Heidelberg, 2011. The original publication is available at [ pdf ]
  • S. Montresor, J. Kay, M. Tokic, and J. Summerton. Work in progress: Programming in a confined space – a case study in porting modern robot software to an antique platform. In Proceedings of the 41st ASEE/IEEE Frontiers in Education Conference, pages T3H-1-T3H-3, Rapid City, SD, USA, 2011. IEEE Press. [ pdf ]


  • M. Tokic, A. Usadel, J. Fessler, and W. Ertel. On an educational approach to behavior learning for robots. AT&P Journal Plus, 2010(2):103-108, 2010.
  • M. Tokic, A. Usadel, J. Fessler, and W. Ertel. On an educational approach to behavior learning for robots. In Proceedings of the 1st International Conference on Robotics in Education, pages 171-176, Bratislava, Slovak Republic, 2010. Slovak University of Technology in Bratislava. [ pdf ]
  • M. Tokic. Adaptive ε-greedy exploration in reinforcement learning based on value differences. In R. Dillmann, J. Beyerer, U. Hanebeck, and T. Schultz, editors, KI 2010: Advances in Artificial Intelligence, volume 6359 of Lecture Notes in Artificial Intelligence, pages 203-210. Springer Berlin / Heidelberg, 2010. [ pdf ]


  • M. Tokic, J. Fessler, and W. Ertel. The crawler, a class room demonstrator for reinforcement learning. In C. Lane and H. Guesgen, editors, Proceedings of the 22nd International Florida Artificial Intelligence Research Society Conference FLAIRS’09, pages 160-165, Menlo Park, California, USA, 2009. AAAI Press. [ pdf ]
  • W. Ertel, M. Schneider, R. Cubek, and M. Tokic. The Teaching-Box: A universal robot learning framework. In Proceedings of the 14th International Conference on Advanced Robotics ICAR’09, pages 1-6, 2009. [ pdf ]


  • M. Tokic, W. Ertel, H. Radtke, J. Akmal, and W. Krökel. Reinforcement learning on a simple real walking robot. In Proceedings of the 29th Annual German Conference on Artificial Intelligence, pages 1-3, Bremen, Germany, 2006.