Baseten Chains is a framework and SDK for serving high-performance compound AI systems in production. AI builders can deploy ultra-low-latency compound AI with dedicated hardware and autoscaling for each step, which removes performance bottlenecks and model orchestration headaches while keeping inference cost-efficient. It targets the main difficulties of running such systems: model orchestration, latency, reliability, and cost. With performance and developer tooling improved since the beta, Baseten Chains is now generally available, so users can build any custom, multi-step, or multi-model AI workflow and get the model performance and fluid horizontal scaling that Baseten specializes in. Developer-facing features include output streaming, binary IO, subclassing, Chains Watch, and a linter.
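To make the idea of per-step hardware and autoscaling concrete, here is a minimal sketch in plain Python. It does not use the Chains SDK or reproduce its API: `StepResources`, `Step`, `run_pipeline`, the step names, and the replica numbers are all hypothetical, chosen only to show a multi-step workflow in which every step carries its own resource settings.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class StepResources:
    """Hypothetical per-step deployment settings (not the Chains SDK API)."""
    accelerator: str   # e.g. "gpu" or "cpu"
    min_replicas: int  # lower autoscaling bound for this step
    max_replicas: int  # upper autoscaling bound for this step


@dataclass
class Step:
    """One stage of a compound workflow with its own resources."""
    name: str
    resources: StepResources
    run: Callable[[str], str]


def run_pipeline(steps: List[Step], payload: str) -> str:
    """Pass the payload through each step in order (in-process stand-in)."""
    for step in steps:
        payload = step.run(payload)
    return payload


# Assumed example workflow: two model steps on GPUs, one cheap CPU step.
pipeline = [
    Step("transcribe", StepResources("gpu", 1, 8), lambda audio: f"text({audio})"),
    Step("summarize", StepResources("gpu", 0, 4), lambda text: f"summary({text})"),
    Step("format", StepResources("cpu", 1, 2), lambda s: s.upper()),
]

result = run_pipeline(pipeline, "clip.wav")
print(result)
```

In this toy version every step runs in the same process. The point of the per-step settings is that, in a real deployment, a lightweight step such as `format` would not need to occupy or scale with the same hardware as the model steps.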