DeepSeek-R1 is an open-weight competitor to proprietary reasoning models such as OpenAI's o1, delivering strong reasoning performance at a fraction of the cost. Together AI serves both the full R1 model and its distilled variants with opt-out privacy controls and serverless pay-per-token pricing, so developers can experiment freely without provisioning costly GPU deployments.

The main DeepSeek-R1 model rivals OpenAI's o1 on reasoning tasks while running roughly 9x cheaper, priced at $7 per 1 million tokens on high-performance serverless infrastructure. Serverless deployment matters for a model of R1's size: pay-per-token pricing, high-performance infrastructure, and the flexibility to scale deployments as needed remove the burden of standing up dedicated hardware. The distilled models are not the main DeepSeek-R1 model but smaller models fine-tuned on reasoning examples generated by DeepSeek-R1. Together AI prioritizes security, hosting all R1 models in its own data centers so that sensitive information remains secure. A free endpoint for the DeepSeek-R1 Distill Llama 70B model is also available, offering easy access to a capable reasoning model, though the free endpoint has reduced rate limits and performance compared to the paid Turbo endpoints.
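
Because the models are exposed through Together AI's serverless pay-per-token API, calling R1 takes only a few lines. The sketch below uses the Together Python SDK; the exact model identifiers (the full `deepseek-ai/DeepSeek-R1` name and the free distilled endpoint name) are assumptions and should be checked against Together's current model list.

```python
# pip install together
import os
from together import Together

# Assumes TOGETHER_API_KEY is set in the environment.
client = Together(api_key=os.environ["TOGETHER_API_KEY"])

# Model identifiers are illustrative; verify them against Together AI's
# model catalog before use.
MODEL = "deepseek-ai/DeepSeek-R1"
# FREE_MODEL = "deepseek-ai/DeepSeek-R1-Distill-Llama-70B-free"  # free distilled endpoint (assumed name)

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "user", "content": "How many prime numbers are there between 1 and 50?"}
    ],
    max_tokens=2048,  # reasoning traces can be long; leave generous headroom
)

# R1-style models typically emit their chain of thought inside <think>...</think>
# tags, followed by the final answer.
print(response.choices[0].message.content)
```

Swapping `MODEL` for the free distilled endpoint lets you prototype at no cost and move to the paid Turbo endpoints when you need higher rate limits and throughput.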