I'm a computer scientist passionate about building thoughtful, well-crafted software at the intersection of web development, machine learning, and data visualization. I enjoy turning complex ideas into clear, usable systems, whether that's an interactive interface, a model that explains itself, or a visualization that makes patterns visible at a glance.
Currently, I research uncertainty quantification in large language models, studying how these models and methods behave and benchmarking different approaches. My focus is on understanding, evaluating, and improving the reliability and interpretability of machine learning systems.
Previously, I studied computer science alongside teaching and education, which shaped how I approach both research and development. I've worked on a range of projects spanning web applications, ML pipelines, and educational tools, always with a strong emphasis on clarity, accessibility, and knowledge sharing.
Outside of work, I'm usually hiking through forests or mountains, building things by hand, experimenting with code and visualizations, or writing poetry. I'm queer, naturally curious, and happiest when learning something new or creating something tangible.
Curriculum Vitae
2024–Present
Researcher (Artificial Intelligence)
Helmholtz-Zentrum Dresden-Rossendorf
Researching uncertainty quantification in large language models. Prepared and taught an LLM workshop at the HZDR HPC Fall School 2025.
2021–2025
M.Sc. in Computer Science
Technische Universität Dresden
Master’s thesis: “Uncertainty Estimation of Large Language Model Replies in Natural Sciences”
2019–2025
State Examination in Education
Technische Universität Dresden
Specialization: Upper secondary education (Mathematics and Computer Science)
2015–2019
B.Sc. in Computer Science
Westsächsische Hochschule Zwickau
Bachelor’s thesis: “Comparison of Angular and Vue.js for introducing component-based frontend frameworks in an academic course”
Posters
EurIPS 2025
Reproducibility by Design: A Modular Framework for Benchmarking Evolving Probabilistic AI Systems
Müller, P., Steinbach, P.
This poster presents a modular benchmarking framework designed to ensure reproducibility and transparency when evaluating probabilistic and resource-intensive AI systems, with a focus on uncertainty estimation for large language models. The framework decouples model execution from evaluation logic, enabling consistent, reusable, and longitudinal benchmarking as models and APIs evolve.
HAICON 2025
Uncertainty Estimation of Large Language Model Replies in Natural Sciences
Müller, P., Popovič, N., Färber, M., Steinbach, P.
This poster presents a qualitative and quantitative evaluation of uncertainty metrics for large language models in scientific question answering. It assesses a range of theoretical approaches and benchmarks a subset of metrics for reliability, highlighting the strengths and limitations of token-level, verbalized, and consistency-based measures. The poster was awarded second place for Best Poster in an audience vote among 200 entries.
Tech & Tools
Python
Jupyter
vLLM
JavaScript
TypeScript
Node.js
Electron
Vue.js
Svelte
Raspberry Pi
Git
Java
C++
3D Printing