The Together Rerank API is a new serverless endpoint for integrating reranker models into enterprise applications, with exclusive access to Salesforce's LlamaRank model, which outperforms leading competitors such as Cohere Rerank v3 and Mistral-7B QLM. The API aims to provide a seamless developer experience, letting users build and manage their entire generative AI lifecycle, from training and fine-tuning to inference, using both open and proprietary models. It supports documents of up to 8,000 tokens and can handle semi-structured data such as JSON, email, tables, and code. The API is compatible with Cohere Rerank, which makes it easy to experiment with different models in Retrieval Augmented Generation (RAG) applications. In a RAG pipeline, it improves search accuracy and reduces cost by filtering out irrelevant documents before they are passed to the language model.
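To make the filtering step concrete, here is a minimal sketch of how reranker output might be used to trim the context passed to a language model. It makes no network call. The request fields (`model`, `query`, `documents`, `top_n`) and the result fields (`index`, `relevance_score`) follow the Cohere Rerank convention and are assumed to carry over because of the stated compatibility. The model identifier, the helper names, the 0.5 threshold, and the scores are all illustrative assumptions, not values from the service.

```python
from typing import Dict, List


def build_rerank_request(query: str, documents: List[str], top_n: int,
                         model: str = "Salesforce/Llama-Rank-V1") -> Dict:
    """Assemble a Cohere-Rerank-style request body.

    Field names and the default model identifier are assumptions.
    """
    return {"model": model, "query": query, "documents": documents, "top_n": top_n}


def select_documents(documents: List[str], results: List[Dict],
                     min_score: float = 0.5) -> List[str]:
    """Keep documents whose relevance score clears the threshold, best first."""
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [documents[r["index"]] for r in ranked if r["relevance_score"] >= min_score]


# Candidate documents from a first-stage retriever, including a JSON snippet.
documents = [
    "Reset your password from the account settings page.",
    "Our office is closed on public holidays.",
    '{"topic": "password", "policy": "rotate every 90 days"}',
]
request_body = build_rerank_request("How do I reset my password?", documents, top_n=3)

# Stand-in for the service response; these scores are made up for illustration.
mock_results = [
    {"index": 0, "relevance_score": 0.92},
    {"index": 1, "relevance_score": 0.03},
    {"index": 2, "relevance_score": 0.61},
]

# Only the documents that clear the threshold are forwarded to the language model.
context = select_documents(documents, mock_results)
```

In a real pipeline, `request_body` would be sent to the rerank endpoint and `mock_results` replaced by the results it returns. Dropping low-scoring documents at this stage is what shrinks the prompt and, with it, the cost of the generation call.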