I'm an AI Research Scientist at the ETH AI Center and co-leader of the LLM effort of the Swiss AI Initiative. With a large team of engineers, researchers, and students across multiple Swiss institutions, we are currently developing the largest open-source and responsibly trained large language model (LLM) in the world. I also teach a new course at ETHZ called Large-Scale AI Engineering, which gives MSc students hands-on, practical training in efficiently training large distributed neural networks on the Alps supercomputer.

Bio. I began my career with an apprenticeship in informatics at a Swiss bank, followed by military service. I then earned a BSc in Computer Science from FHNW and an MSc in Artificial Intelligence with distinction from the University of St Andrews, Scotland. In 2023, I completed my PhD with distinction at USI/IDSIA under Prof. Jürgen Schmidhuber, focusing on the systematic generalisation of neural networks and on fast weight programmers, scalable self-modifying neural architectures (thesis). Along the way, I interned at Meta FAIR, Google Research, and Microsoft Research, exploring foundational questions in neural computation, scalable neural network architectures, and LLMs. After my PhD, I worked with Prof. Thomas Hofmann before moving to the ETH AI Center.

Opportunities

Students. We welcome motivated MSc students from ETHZ, EPFL, and other universities to join our research through a semester project, an MSc thesis, or a student assistant position. We offer research and engineering opportunities across topics such as LLM development, high-performance infrastructure, and responsible AI. Students can apply through our application form.

Engineers. We are actively hiring machine learning research engineers to join our team developing cutting-edge foundation models in collaboration with researchers across the Swiss AI Initiative. Open positions are available through ETHZ or EPFL.

Research Focus

My research centers on three interconnected areas that advance both the capabilities and the responsible development of large-scale AI systems.

First, I focus on developing state-of-the-art open-source LLMs that are transparent and compliant with current legal frameworks. This work provides a foundation for society to build trustworthy AI products and services while enabling researchers to better understand the benefits and risks of LLM-based systems.

Second, I advance neural architecture research through fast weight programmers such as DeltaNet, contributing to one of the most significant waves of architectural innovation since the rise of the Transformer. Like linear RNNs such as Mamba or RWKV, DeltaNet replaces the quadratic cost of attention with a constant-size recurrent state, offering greater efficiency and generality than attention-based architectures.
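
For intuition, here is a minimal sketch of the delta-rule fast weight update at the core of DeltaNet, written as a per-token recurrence for a single head. Names like `deltanet_step` are illustrative rather than taken from any released codebase, and practical implementations use chunkwise-parallel kernels over batches and heads:

```python
import torch

def deltanet_step(S, q, k, v, beta):
    """One recurrent step of the delta-rule fast weight update.

    S    : (d_v, d_k) fast weight matrix, the recurrent state
    q, k : (d_k,) query / key vectors (k assumed unit-norm)
    v    : (d_v,) value vector
    beta : scalar write strength in [0, 1] produced by the network
    """
    v_old = S @ k                              # value currently stored under key k
    S = S + beta * torch.outer(v - v_old, k)   # overwrite it toward the new value v
    y = S @ q                                  # read out with the query
    return S, y

# Toy usage for a single head over T tokens.
d_k, d_v, T = 4, 4, 8
S = torch.zeros(d_v, d_k)
qs, ks, vs = torch.randn(T, d_k), torch.randn(T, d_k), torch.randn(T, d_v)
ks = torch.nn.functional.normalize(ks, dim=-1)   # unit-norm keys keep the update stable
betas = torch.sigmoid(torch.randn(T))            # write strengths in (0, 1)
for t in range(T):
    S, y = deltanet_step(S, qs[t], ks[t], vs[t], betas[t])
```

Unlike vanilla linear attention, which only accumulates outer products, the beta-weighted delta rule can also weaken or overwrite previously written associations, which is what makes the fast weight memory programmable.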

Third, I investigate fundamental questions around LLM scaling and generalisation: in particular, how to train these systems more efficiently and how to enable them to generalise beyond their current limitations. This includes exploring self-modifying neural networks as a pathway toward more general AI systems.

Read more

Recent News

Jun 2025 — pan.talk keynote on the Swiss AI Initiative: The Path to AI Sovereignty

Jun 2025 — Grant accepted "A Swiss-Centric Foundation Model for Switzerland's Sovereign AI Future"

Jun 2025 — Grant accepted "Democratizing LLMs for Global Languages with Mixtures of Multilingual Experts"

Jun 2025 — Successfully taught the first iteration of our MSc course at ETHZ: Large-Scale AI Engineering

May 2025 — Presentation of the Swiss AI Initiative to the European Commission, with EU delegates from each member state

May 2025 — Invited talk at FH Graubünden AI event presenting the Swiss AI Initiative and our LLM effort

Mar 2025 — Keynote at the Swiss Legal Tech Conference

Mar 2025 — Keynote at the AI in Marketing conference (400+ people)

Mar 2025 — Invited talk at HPC-AI Conference on the Swiss AI Initiative and our LLM Effort

Mar 2025 — Invited talk at GenAI 360

Mar 2025 — Redefining AI Podcast appearance (Season 3, Ep. 17)

Mar 2025 — Expert input to SRF Echo der Zeit episode

Dec 2024 — Zürich NLP Meetup talk on "The Swiss AI LLM Effort: Building Transparent and Responsible AI for Switzerland and Beyond"

Dec 2024 — Contributed talk at Swiss Community Day on Data

Dec 2024 — Keynote and panel at EY National Trusted AI Conference with Marc Stampfli and Anne Scherer

Nov 2024 — Invited talk at DeepMind, London on Linear Transformers and DeltaNet

Nov 2024 — SRF KI Fachrunde appearance

Oct 2024 — Invited talk at the Swiss AI Initiative workshop at the AI+X conference

Sep 2024 — Invited talk at the ETH-wide AI Upskilling event «Die Magie der KI entschlüsseln» ("Deciphering the Magic of AI")

May 2024 — Invited talk at 2024 IEEE Switzerland Section General Assembly

May 2024 — Invited talk at the Swiss publishers' association (Verlegerverband) on Large Language Models and the Swiss AI Initiative

May 2024 — Marketing Booster Podcast appearance

Feb 2024 — Started a position as research scientist at the ETH AI Center

Oct 2023 — Started a postdoctoral position at ETHZ with Prof. Thomas Hofmann

Aug 2023 — Invited talk at IBM on Linear Transformers and DeltaNet

May 2023 — Defended my PhD, Fast Weight Programmers for Greater Systematic Generalisation in Language, with distinction

Selected Publications

Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs
D. Fan, V. Sabolčec, M. Ansaripour, A.K. Tarun, M. Jaggi, A. Bosselut, I. Schlag — Preprint 2025

INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
A. Romanou, N. Foroutan, A. Sotnikova, Z. Chen, et al. — ICLR 2025

On the Effect of (Near) Duplicate Subwords in Language Modelling
A. Schäfer, T. Hofmann, I. Schlag, T. Pimentel — ACL 2024

Large Language Model Programs
I. Schlag, S. Sukhbaatar, A. Celikyilmaz, W. Yih, J. Weston, J. Schmidhuber, X. Li — Preprint 2023

A Modern Self-Referential Weight Matrix That Learns to Modify Itself
K. Irie*, I. Schlag*, R. Csordás, J. Schmidhuber — ICML 2022

Linear Transformers are Secretly Fast Weight Programmers
I. Schlag*, K. Irie*, J. Schmidhuber — ICML 2021

View all publications