When developing an AI application, it's essential to thoroughly evaluate and test each large language model (LLM) you might use to ensure production-ready performance. This is hard in practice: LLMs are inherently non-deterministic, and the number of candidate models keeps growing. A promising solution is an LLM router, which learns to dynamically select the best LLM for each query. Routing can improve accuracy by up to 25% while cutting both inference costs and latency by as much as 10x, because it sidesteps the need to pick a single "best model overall": most queries can be served by a smaller, cheaper model without compromising quality. By running an LLM router from Not Diamond inside a Dagster pipeline, developers can build a "meta-model" that combines multiple models and intelligently decides when to call each one, outperforming any individual model while keeping costs and latency down. The result is a cost-effective AI workflow that reserves large, expensive models for the queries that actually need them.
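To make the routing idea concrete, here is a minimal sketch of dispatching a query across several candidate models. It assumes the `notdiamond` Python SDK's `NotDiamond` client and the `chat.completions.create` routing call shown in Not Diamond's quickstart; the model identifiers and the prompt are illustrative, and the exact strings may differ in your setup.

```python
# A minimal sketch of LLM routing with Not Diamond (not a definitive
# implementation). The client call and return signature follow Not
# Diamond's quickstart; model identifiers below are illustrative.
from notdiamond import NotDiamond

# Reads NOTDIAMOND_API_KEY (and the providers' API keys) from the environment.
client = NotDiamond()

# Candidate models the router may choose between: a large, expensive
# model and a smaller, cheaper one.
candidates = [
    "openai/gpt-4o",
    "openai/gpt-4o-mini",
]

# The router inspects the query and dispatches it to whichever candidate
# it predicts will answer best, preferring the cheaper model whenever
# quality is unlikely to suffer.
result, session_id, provider = client.chat.completions.create(
    messages=[{"role": "user", "content": "Summarize what an LLM router does in two sentences."}],
    model=candidates,
)

print(f"Routed to: {provider.model}")  # Which model handled this query
print(result.content)                  # The routed model's response
```

Wrapped in a Dagster asset or op, a call like this becomes the "meta-model" step of the pipeline: downstream assets consume `result.content` without needing to know which underlying model produced it.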