Martin Pawelczyk

📩 martin.pawelczyk.at.univie.ac.at

📍 Währinger Str. 29, Office 4.37
Vienna, Austria

👋 Short bio: I am an Assistant Professor for Responsible AI at the University of Vienna. Previously, I was a postdoctoral scholar at Harvard University, advised by Himabindu Lakkaraju and Seth Neel. I received my PhD from the University of Tübingen, advised by Gjergji Kasneci. During that time, I was also a research associate at Harvard University and the Max Planck Institute for Security & Privacy, and an intern at JP Morgan AI Research, where I worked with the Explainable AI team on explaining automated trading models.

🧭 Research: My research interests lie within the area of AI safety and Data-centric AI. Specifically, I develop machine learning techniques as well as evaluation frameworks to improve the safety, interpretability, privacy, and reasoning capabilities of predictive and generative models.

My work addresses fundamental questions such as:

  1. Privacy and Unlearning: How do we build privacy and unlearning frameworks that achieve a better privacy-utility trade-off?
  2. Data Curation: What influence does data have on the safety of AI models?
  3. Interpretability: How can we build interpretable and accurate models to assist in human decision-making?
  4. Evaluation: How can we devise large-scale evaluations that test state-of-the-art ML methods effectively?

💡 Motivation: Richard Sutton’s Bitter Lesson posits that scaling data and compute is the most reliable path to advanced artificial intelligence. While this accurately describes the current trajectory of foundation models, it obscures critical challenges in how we curate and utilize data. First, many real-world applications, such as edge computing, require smaller, highly efficient models; here, the challenge is not scaling, but selecting the right data. Second, even when model size is unconstrained, scaling hits a roadblock due to the limited amount of semantically diverse, high-quality data in web-scale corpora. Finally, training indiscriminately on the entire web introduces severe risks, such as the regurgitation of private information and the reproduction of harmful content. Our overarching goal is to fundamentally understand how data drives model behavior at scale, and to develop novel methodologies to safely and effectively steer AI systems through principled data curation.


Recruiting & Collaborations

I am constantly seeking highly motivated and curious students from diverse backgrounds who are passionate about building, breaking, and understanding machine learning systems. My group fosters a collaborative and supportive environment where you can conduct cutting-edge research and make a significant impact.

PhD Positions

We welcome applications from prospective PhD students who possess:

  • A completed or nearly completed Master's degree (or equivalent), ideally in CS, Math, Statistics, or ML.
  • A strong foundational understanding of mathematics.
  • Proficient coding skills or a keen ambition to cultivate them.
  • Fluency in English (written and spoken).

Internships & Collaborations

Grad Students: Year-round internship positions available. Email with CV and subject "Internship Position (Graduate Student)".

Undergrad/Masters: Email with CV, transcripts, and subject "Interested in Collaboration (Student)".

Researchers: Open to collaboration on Data-centric AI and AI safety. Subject "Interested in Collaboration".

Interested in working with me? Please fill out this brief form.

Recent News

🤝 Mar'26: Happy to serve as an Area Chair for NeurIPS'26.

🚀 Feb'26: Started as an Assistant Professor in Responsible AI at the University of Vienna.

🤝 Jan'26: Happy to serve as an Area Chair for KDD'26.

🎤 Jan'26: Keynote on Machine Unlearning at the Huawei Tech Summit in Helsinki.

🎤 Sep'25: Talk on 'Verifiable Data Attributions without Breaking the Bank' at the Sorbonne Winter School on Causal AI and XAI in Paris.

📖 Sep'25: Paper accepted at NeurIPS'25: Efficiently Verifiable Proofs of Data Attribution.

🏆 June'25: Spotlight Talks at ICML'25 CFAgentic & MOSS workshops: Weak-to-Strong Trustworthiness in LLMs.

📖 Jan'25: Paper accepted at ICLR'25: Machine Unlearning Fails to Remove Data Poisoning Attacks.

🛠️ Dec'24: Workshop accepted at ICLR'25: Trust in LLMs & LLM Applications.

🎤 July'24: Presenting In-Context Unlearning at ICML'24 in Vienna.

🏆 June'24: Spotlight Talk at ICML'24 GenLaw workshop: Machine Unlearning Fails to Remove Data Poisoning Attacks.

📖 May'24: Paper accepted at ICML'24: In-Context Unlearning: Language Models as Few Shot Unlearners.

Selected Publications

For the full list of publications, please refer to my Google Scholar page. (*) denotes equal contribution; authors listed in alphabetical order.