Notion, a popular workspace platform, has developed world-class AI features by integrating generative AI models such as GPT-4 into its product. The company's commitment to customer experience led it to invest in AI development, resulting in Notion AI, which powers capabilities for searching workspaces, generating documents, analyzing PDFs, and answering questions. However, the team realized that its existing workflows for evaluating AI products were not up to the challenge, so it partnered with Braintrust to improve them, enabling faster iteration and higher-quality features in production. With Braintrust, Notion can now triage and fix 30 issues per day, compared to just 3 previously, a tenfold increase that sets a new standard for how AI product teams evaluate and improve their generative AI products.