Company
Date Published
Author
Anyscale team
Word count
376
Language
English
Hacker News points
None

Summary

We are excited to announce that Anyscale's LLM API offering, including Private Endpoints for self-hosted LLMs, is now available as part of the Anyscale Platform. This is a significant milestone in our effort to democratize access to large language models. With the new endpoint offering, developers can integrate open-source embedding models into applications such as retrieval-augmented generation (RAG) at a price of $0.05 per million tokens for the gte-large model. We plan to add more models in the future and invite users to request newer embedding models through a Google form. We have also extended fine-tuning to our Llama-2 70B model, allowing developers to improve model quality while reducing costs and improving performance. The fine-tuned model can now be used for inference at $1 per million tokens. Finally, users can now get started with Anyscale Endpoints without a credit card and receive free credits, which can be added to their account later.
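To put the two quoted prices in perspective, the following is a minimal sketch of the per-token arithmetic. Only the rates ($0.05 and $1 per million tokens) come from the summary above; the helper name `token_cost` and the workload sizes are hypothetical and chosen purely for illustration.

```python
# Prices quoted in the summary, in US dollars per million tokens.
EMBEDDING_PRICE_PER_M = 0.05             # gte-large embeddings
FINE_TUNED_INFERENCE_PRICE_PER_M = 1.00  # fine-tuned Llama-2 70B inference


def token_cost(num_tokens: int, price_per_million: float) -> float:
    """Return the dollar cost of processing num_tokens at the given rate.

    Hypothetical helper for illustration; not part of any Anyscale API.
    """
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return num_tokens / 1_000_000 * price_per_million


# Assumed workload: 10,000 documents of roughly 500 tokens each to embed,
# plus 2 million tokens of inference on a fine-tuned model.
embedding_cost = token_cost(10_000 * 500, EMBEDDING_PRICE_PER_M)
inference_cost = token_cost(2_000_000, FINE_TUNED_INFERENCE_PRICE_PER_M)

print(f"Embedding cost: ${embedding_cost:.2f}")
print(f"Inference cost: ${inference_cost:.2f}")
```

Under these assumed volumes, embedding a 5-million-token corpus costs well under a dollar, so for a RAG application the recurring inference spend is likely to dominate the one-time indexing cost.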