I'm an AI Research Scientist at the ETH AI Center and co-lead of Apertus, developed as part of the Swiss AI Initiative. Together with a large team of engineers, researchers, and students across multiple Swiss institutions, we are currently developing the largest open-source, responsibly trained, and compliant large language model (LLM) in the world. I also teach a novel course at ETHZ called Large-Scale AI Engineering, which gives MSc students hands-on, practical training in efficiently training large distributed neural networks on the Alps supercomputer at CSCS.
Bio. I began my career with an apprenticeship in informatics at a Swiss bank, followed by my military service. I then earned a BSc in Computer Science from FHNW and an MSc in Artificial Intelligence with distinction from the University of St Andrews, Scotland. In 2023 I completed my PhD with distinction at USI/IDSIA under Prof. Jürgen Schmidhuber, focusing on the systematic generalisation of neural networks and on fast weight programmers, a family of scalable self-modifying neural architectures (thesis). During my PhD I was invited to join Meta FAIR, Google Research, and Microsoft Research for research internships, where I investigated foundational questions in neural computation, scalable neural network architectures, and LLMs. After my defense I worked as a postdoctoral researcher with Prof. Thomas Hofmann before moving to the ETH AI Center.
Opportunities
Students. We welcome motivated MSc students from ETHZ, EPFL, and other universities to join our research efforts through a semester project, an MSc thesis, or a student assistant position. We offer research and engineering opportunities across topics such as LLM development, high-performance infrastructure, and responsible AI. Students can apply through our application form.
Engineers. We're actively hiring machine learning research engineers to join our team, which develops cutting-edge foundation models in collaboration with researchers across the Swiss AI Initiative. Open positions are available through ETHZ or EPFL.
Research Focus
My research centers on three interconnected areas that advance both the capabilities and responsibility of large-scale AI systems.
First, I focus on developing Apertus, a family of state-of-the-art open-source LLMs that are transparent and compliant with current legal frameworks. This work gives society a foundation for building trustworthy AI products and services while enabling researchers to better understand the benefits and risks of LLM-based systems.
Second, I advance neural architecture research through fast weight programmers such as DeltaNet, which contribute to one of the most significant architectural innovations since the rise of the Transformer. Like linear RNNs such as Mamba or RWKV, DeltaNet offers greater efficiency than attention-based architectures without sacrificing generality (a minimal sketch of its update rule follows at the end of this section). Recently, DeltaNet became a core component of Qwen3-Next, a major release from one of the world's leading AI labs.
Third, I investigate fundamental questions around LLM scaling and generalisation: in particular, how to train these systems more efficiently and how to enable them to generalise beyond their current limitations. This includes exploring self-modifying neural networks as a pathway toward more general AI systems.
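To make the fast weight programmer idea concrete, here is a minimal single-head sketch of the delta-rule update at the heart of DeltaNet, following Linear Transformers are Secretly Fast Weight Programmers (Schlag, Irie, Schmidhuber, ICML 2021). The softmax-normalised keys, random inputs, and per-step write strength are simplifying assumptions for illustration; they stand in for the learned projections, feature map, and gating of the full model.

```python
import torch

def delta_rule_step(W, k, v, q, beta):
    """One delta-rule fast weight update followed by a read.

    W:    (d_v, d_k) fast weight matrix acting as short-term memory
    k, q: (d_k,) normalised key and query vectors
    v:    (d_v,) value vector to store
    beta: scalar write strength in [0, 1]
    """
    v_old = W @ k                              # value currently associated with key k
    W = W + beta * torch.outer(v - v_old, k)   # partially overwrite it with v
    y = W @ q                                  # retrieve with the query
    return W, y

# Toy usage: recurrently process a short sequence of key/value/query triples.
torch.manual_seed(0)
d_k, d_v, seq_len = 8, 8, 5
W = torch.zeros(d_v, d_k)
for t in range(seq_len):
    k = torch.softmax(torch.randn(d_k), dim=0)  # stand-in for the paper's feature map
    q = torch.softmax(torch.randn(d_k), dim=0)
    v = torch.randn(d_v)
    beta = torch.sigmoid(torch.randn(()))       # stand-in for a learned gate
    W, y = delta_rule_step(W, k, v, q, beta)
```

Because each write is a rank-one outer product, the recurrence can be reorganised into a chunkwise-parallel form, which is what makes DeltaNet-style layers practical to train at scale.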
Recent News
Oct 2025 — Apertus featured in a 3sat and SRF nano TV report on open AI models for European independence
Oct 2025 — Hosted the Apertus workshop and Swiss AI SME workshop at the AI+X Summit, and presented to the general public on why Switzerland is building its own language models at the Public Night
Sep 2025 — Received an award for our work on Apertus at the Culture & Society AI Awards Night
Sep 2025 — Gave a keynote at the Zürich AI Safety Day
Sep 2025 — Two papers accepted at NeurIPS 2025 (1, 2) and one oral paper accepted at COLM 2025 (link)
Sep 2025 — Met with National Council members Gerhard Andrey and Benoit Gaillard (with Chris Beyeler, Judith Niederberger, and Alberto Pasquale Ferrara from KImpact) on AI legislation
Sep 2025 — Follow-up interview with 10vor10 on national TV
Sep 2025 — Inside AI Podcast appearance hosted by Marcel Salathé (EPFL AI Center)
Sep 2025 — Presented the Swiss AI Initiative and Apertus to the SRG SSR AI & Data Guild
Sep 2025 — Keynote at the eHealth Summit presenting Apertus and generative AI use cases in health
Sep 2025 — Keynote on Apertus at the Trustworthy AI in Practice event by LatticeFlow
Sep 2025 — The newly released Qwen3-Next model uses DeltaNet, which I developed with Kazuki Irie, to improve LLM efficiency at scale!
Sep 2025 — Presentation on Apertus to KImpact - Verband für künstliche Intelligenz
Sep 2025 — Keynote on the Swiss AI Initiative and Apertus at the EnhanceR Symposium
Sep 2025 — Interview on Apertus with 10vor10 on national TV
Sep 2025 — 🎉 Released the Apertus 8B and 70B LLMs, trained on 15T tokens, fully open and compliant with Swiss law and the EU AI Act
Aug 2025 — Presentation on the Swiss AI Initiative and our LLM effort at the AI Meetup for Business Leaders
Jul 2025 — Prompt Zero Podcast appearance by Blick (in Swiss German)
Jul 2025 — Keynote at the first International Open-Source Model Builder Summit before the AI for Good Summit in Geneva
Jun 2025 — pan.talk keynote on Swiss AI Initiative: The Path to AI Sovereignty
Jun 2025 — Grant accepted: "A Swiss-Centric Foundation Model for Switzerland's Sovereign AI Future"
Jun 2025 — Grant accepted: "Democratizing LLMs for Global Languages with Mixtures of Multilingual Experts"
Jun 2025 — Successfully taught the first iteration of our MSc course at ETHZ: Large-Scale AI Engineering
May 2025 — Presentation of the Swiss AI Initiative to the European Commission with EU delegations from each member state
May 2025 — Invited talk at FH Graubünden AI event presenting the Swiss AI Initiative and our LLM effort
Mar 2025 — Keynote at Swiss Legal Tech Conference
Mar 2025 — Keynote at the AI in Marketing conference (400+ people)
Mar 2025 — Invited talk at HPC-AI Conference on the Swiss AI Initiative and our LLM Effort
Mar 2025 — Invited talk at GenAI 360
Mar 2025 — Redefining AI Podcast appearance (Season 3, Ep. 17)
Mar 2025 — Expert input to SRF Echo der Zeit episode
Dec 2024 — Zürich NLP Meetup talk on "The Swiss AI LLM Effort: Building Transparent and Responsible AI for Switzerland and Beyond"
Dec 2024 — Contributed talk at Swiss Community Day on Data
Dec 2024 — Keynote and panel at EY National Trusted AI Conference with Marc Stampfli and Anne Scherer
Nov 2024 — Invited talk at DeepMind, London on Linear Transformers and DeltaNet
Nov 2024 — Appearance on the SRF KI Fachrunde (AI expert roundtable)
Oct 2024 — Invited talk at the Swiss AI Initiative workshop at the AI+X conference
Sep 2024 — Invited talk at the ETH-wide AI Upskilling event «Die Magie der KI entschlüsseln» ("Decoding the Magic of AI")
May 2024 — Invited talk at 2024 IEEE Switzerland Section General Assembly
May 2024 — Invited talk at the Swiss publishers' association (Verlegerverband) on Large Language Models and the Swiss AI Initiative
May 2024 — Marketing Booster Podcast appearance
Feb 2024 — Started a position as research scientist at the ETH AI Center
Oct 2023 — Started a postdoctoral position at ETHZ with Prof. Thomas Hofmann
Aug 2023 — Invited talk at IBM on Linear Transformers and DeltaNet
May 2023 — Defended my PhD thesis, Fast Weight Programmers for Greater Systematic Generalisation in Language, with distinction
Selected Publications
Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs
D. Fan, V. Sabolčec, M. Ansaripour, A.K. Tarun, M. Jaggi, A. Bosselut, I. Schlag — COLM 2025
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
A. Romanou, N. Foroutan, A. Sotnikova, Z. Chen, et al. — ICLR 2025
On the Effect of (Near) Duplicate Subwords in Language Modelling
A. Schäfer, T. Hofmann, I. Schlag, T. Pimentel — ACL 2024
Large Language Model Programs
I. Schlag, S. Sukhbaatar, A. Celikyilmaz, W. Yih, J. Weston, J. Schmidhuber, X. Li — Preprint 2023
A Modern Self-Referential Weight Matrix That Learns to Modify Itself
K. Irie*, I. Schlag*, R. Csordás, J. Schmidhuber — ICML 2022
Linear Transformers are Secretly Fast Weight Programmers
I. Schlag*, K. Irie*, J. Schmidhuber — ICML 2021