About me
I am a Machine Learning PhD student at TU Berlin and ATB Potsdam, working on Explainable AI under the supervision of Marina Höhne and Klaus-Robert Müller. I work at the Understandable Machine Intelligence Lab and am part of the BIFOLD graduate school. My work focuses on global Explainable AI methods: I develop methods to comprehend which abstractions and concepts Deep Neural Networks learn, what the purpose and function of individual neurons within the networks is, and how these neurons pass information to each other, forming more complex circuits.
News
- 07.02.24 Gave a talk “Introduction to Explainable AI: how do we explain Deep Neural Networks” at the BIFOLD Graduate School
- 23.01.24 New preprint: “Manipulating Feature Visualizations with Gradient Slingshots”
- 16.01.24 Our “Concept-Based global Explainability” track was accepted at XAI-2024
- 10.01.24 Invited talk at BLISS Berlin about our NeurIPS 2023 paper “Labeling Neural Representations with Inverse Recognition”
- 14.12.23 “Labeling Neural Representations with Inverse Recognition” was presented at NeurIPS 2023
- 14.11.23 “Visualizing the Diversity of Representations Learned by Bayesian Neural Networks” was accepted at TMLR
- 03.10.23 Invited talk at MunichNLP (Video)
- 30.09.23 Presented “Mark My Words: Dangers of Watermarked Images in ImageNet” at the XI-ML workshop at ECAI 2023 in Kraków, Poland
- 22.09.23 “Labeling Neural Representations with Inverse Recognition” was accepted at NeurIPS 2023
- 11.09.23 Participated in the Weizenbaum BIFOLD Summer School 2023
- 26.08.23 Sailed around the island of Majorca
- 26.07.23 Presented “Finding Spurious Correlations with Function-Semantic Contrast Analysis” at xAI 2023 in Lisbon, Portugal
- 05.07.23 Invited talk “Explainable AI: from local to global” at the Max Delbrück Center
- 03.07.23 “DORA: Exploring Outlier Representations in Deep Neural Networks” accepted at TMLR
- 24.05.23 Published my first blog post, On Mechanical Consciousness
- 01.05.23 Presented two papers, “DORA: Exploring Outlier Representations in Deep Neural Networks” and “Mark My Words: Dangers of Watermarked Images in ImageNet”, at the TrustML-(Un)Limited workshop at ICLR 2023 in Kigali, Rwanda
- 28.03.23 Two papers accepted at the TrustML-(Un)Limited workshop at ICLR 2023
- 23.06.22 Participated in the Weizenbaum BIFOLD Summer School 2022
- 02.06.22 Panel discussion on Fair and Trustworthy AI at the HelmholtzAI 2022 conference
- 25.02.22 Presented “NoiseGrad: Enhancing Explanations by Introducing Stochasticity to Model Weights” at the AAAI 2022 conference
- 02.12.21 “NoiseGrad: Enhancing Explanations by Introducing Stochasticity to Model Weights” accepted at the AAAI 2022 conference
- 18.10.21 Invited lecture on Explainable AI at the Saint-Petersburg State University Graduate School of Management (Video in Russian)
- 23.05.21 Invited talk on “Explaining hidden representations” at ODS DataFest 2021 (Video in Russian)
- 01.01.21 Started my PhD
Research