Thursday 11:05 in room 1.38 (ground floor)

Routing Strategies for Heterogeneous GenAI Systems: Lessons from Real-World Practice

Oliver Zeigermann

Modern Generative AI (GenAI) systems combine prompts, language models, inference servers, and specialized hardware into sophisticated stacks. As no single large GenAI system excels at all tasks, we at Techniker Krankenkasse are increasingly adopting a multi-system approach, employing different models tailored to specific tasks, domains, cost, or latency requirements. While this approach enhances robustness and efficiency, it introduces a critical operational challenge: effectively routing each incoming query to the most suitable GenAI system.

In this talk, we present our real-world experiences developing dynamic routing pipelines for selecting the optimal GenAI system based on input content and task specificity. We detail the evolution and refinement of our routing strategies.

We share insights and best practices from our real-world implementation experience.

Oliver Zeigermann

Oliver Zeigermann has been developing software for 40 years, progressing from assembly language to C, then Python, and ultimately to machine learning. He currently works as a machine learning engineer at Techniker Krankenkasse.