This blog post discusses why monitoring large language models (LLMs) matters and how to get started with monitoring a LangChain application using LangKit and WhyLabs. It highlights metrics that can be tracked for LLM usage and performance, such as response relevance, sentiment, jailbreak similarity, topic, and toxicity, and walks through an example of using LangKit with LangChain and OpenAI, focusing on tracking sentiment changes between prompts and responses. The post closes by emphasizing the value of monitoring LLMs in production and pointing to other relevant signals that LangKit can capture.
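The full walkthrough in the post pairs LangKit with a LangChain and OpenAI chain; as a minimal sketch of the sentiment-tracking idea, the snippet below logs a single prompt/response pair through whylogs with LangKit's LLM metrics enabled. The hard-coded prompt and response strings are illustrative assumptions, and the exact metric names that appear in the profile (e.g. sentiment columns) may vary by LangKit version.

```python
import whylogs as why
from langkit import llm_metrics  # registers LangKit's LLM metrics (sentiment, toxicity, etc.)

# Initialize a whylogs schema that includes LangKit's out-of-the-box LLM metrics.
schema = llm_metrics.init()

# In the post, the response comes from a LangChain + OpenAI chain; here we use
# a hard-coded pair purely to illustrate the logging step.
prompt = "I'm having a terrible day, nothing is working."
response = "I'm sorry to hear that. Let's see what we can do to turn things around."

# Profile the prompt/response pair; LangKit computes its metrics as part of the profile.
results = why.log({"prompt": prompt, "response": response}, schema=schema)

# Inspect the profiled metrics locally, e.g. to compare prompt vs. response sentiment.
metrics_df = results.view().to_pandas()
print(metrics_df.filter(like="sentiment", axis=0))
```

In an application, the same `why.log` call would run on each prompt/response pair produced by the chain, and the resulting profiles can be written to WhyLabs for ongoing monitoring of sentiment drift between what users ask and what the model returns.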