I'm a computer scientist passionate about building thoughtful, well-crafted software at the intersection of web development, machine learning, and data visualization. I enjoy turning complex ideas into clear, usable systems, whether that's an interactive interface, a model that explains itself, or a visualization that makes patterns visible at a glance.
Currently, I research uncertainty quantification in large language models, studying how these models and methods behave and benchmarking different approaches. My focus is on understanding, evaluating, and improving the reliability and interpretability of machine learning systems.
Previously, I studied computer science alongside teaching and education, which shaped how I approach both research and development. I've worked on a range of projects spanning web applications, ML pipelines, and educational tools, always with a strong emphasis on clarity, accessibility, and knowledge sharing.
Outside of work, I'm usually hiking through forests or mountains, building things by hand, experimenting with code and visualizations, or writing poetry. I'm queer, naturally curious, and happiest when learning something new or creating something tangible.
Curriculum Vitae
2024–Present
Researcher (Artificial Intelligence)
Helmholtz-Zentrum Dresden-Rossendorf
Researching uncertainty quantification in large language models. Prepared and taught an LLM workshop at the HZDR HPC Fall School 2025.
2021–2025
M.Sc. in Computer Science
Technische Universität Dresden
Master’s thesis: “Uncertainty Estimation of Large Language Model Replies in Natural Sciences”
2019–2025
State Examination in Education
Technische Universität Dresden
Specialization: Upper secondary education (Mathematics and Computer Science)
2015–2019
B.Sc. in Computer Science
Westsächsische Hochschule Zwickau
Bachelor’s thesis: “Comparison of Angular and Vue.js for introducing component-based frontend frameworks in an academic course”
Posters
EurIPS 2025
Reproducibility by Design: A Modular Framework for Benchmarking Evolving Probabilistic AI Systems
Müller, P., Steinbach, P.
This poster presents a modular benchmarking framework designed to ensure reproducibility and transparency when evaluating probabilistic and resource-intensive AI systems, with a focus on uncertainty estimation for large language models. The framework decouples model execution from evaluation logic, enabling consistent, reusable, and longitudinal benchmarking as models and APIs evolve.
HAICON 2025
Uncertainty Estimation of Large Language Model Replies in Natural Sciences
Müller, P., Popovič, N., Färber, M., Steinbach, P.
This poster presents a qualitative and quantitative evaluation of uncertainty metrics for large language models in scientific question answering. It assesses a range of theoretical approaches and benchmarks a subset of metrics for reliability, highlighting the strengths and limitations of token-level, verbalized, and consistency-based measures. The poster was awarded second place for Best Poster in an audience vote among 200 entries.
Tech & Tools
Python
Jupyter
vLLM
JavaScript
TypeScript
Node.js
Electron
Vue.js
Svelte
Raspberry Pi
Git
Java
C++
3D Printing