308 | Fine-Tuning Llama-2: A Comprehensive Case Study for Tailoring Custom Models | 2023-08-11 |
143 | Llama 2 is about as factually accurate as GPT-4 for summaries and is 30X cheaper | 2023-08-29 |
110 | Continuous batching to increase LLM inference throughput and reduce p50 latency | 2023-08-15 |
95 | Numbers every LLM Developer should know | 2023-08-12 |
78 | ThirdAI Uses Ray for Parallel Training of Billion-Parameter NN on Commodity CPUs | 2023-08-30 |
36 | Ray breaks the $1/TB barrier as the world’s most cost-efficient sorting system | 2023-01-24 |
24 | Anyscale's Aviary: Open-Source Multi-LLM Serving | 2023-05-31 |
22 | Fine-Tuning LLMs: LoRA or Full-Parameter? An In-Depth Analysis with Llama 2 | 2023-09-06 |
22 | LightGBM vs. XGBoost: Which distributed version is faster? | 2021-08-10 |
14 | Anyscale's Aviary is a dashboard for evaluating Open Source LLMs | 2023-05-31 |
11 | Production Guide for Building RAG-Based LLM Applications | 2023-09-13 |
9 | Multi-node distributed training for PyTorch Lightning made easy | 2021-08-20 |
7 | ByteDance Scales Offline Inference with Multi-Modal LLMs to 200 TB Data | 2023-08-15 |
5 | Loading LLM (Llama-2 70B) 20x faster with Anyscale Endpoints | 2023-10-13 |
5 | Lessons from training a Stable Diffusion model on 2B images | 2024-05-11 |
4 | Scaling data loading for ML training with Ray Data | 2023-09-15 |
4 | Cloud Infrastructure for LLM and Generative AI Applications | 2023-09-14 |
4 | Model Batch Inference in Ray: Actors, ActorPool, and Datasets | 2022-11-04 |
4 | Ben Lorica blog post on enterprise applications of reinforcement learning | 2020-03-25 |
3 | Ant Group – scaling to 1.37M QPS on Ray | 2022-12-13 |
3 | Ant Group Uses Ray to Build a Large-Scale Online Serverless Platform | 2022-12-12 |
3 | Anyscale Private Endpoints and Anyscale Endpoints Fine-Tuning | 2023-10-24 |
3 | How to build an LLM search engine using a self-hosted LLM | 2023-04-21 |
3 | An informal introduction to reinforcement learning | 2022-02-23 |
3 | Why Third Generation ML Platforms Are More Performant | 2021-10-06 |
3 | The Third Generation of Production ML Architectures | 2021-09-16 |
3 | Ray 1.0 | 2020-09-30 |
2 | Anyscale Appoints Keerti Melkote as CEO | 2024-07-31 |
2 | Canva Built a Modern AI Platform Using Anyscale | 2024-04-03 |
2 | Comparing LLM Performance: Introducing the Open Source Leaderboard for LLM APIs | 2023-12-21 |
2 | Anyscale Endpoints: JSON Mode and Function Calling Features | 2023-12-14 |
2 | Reproducible Performance Metrics for LLM Inference | 2023-11-02 |
2 | Fine Tuning is for form not facts | 2023-08-27 |
2 | Serving PyTorch Models with FastAPI and Ray Serve | 2022-12-17 |
2 | Ray Datasets for large-scale machine learning ingest and scoring | 2022-02-25 |
2 | Ray 1.10 Released | 2022-02-25 |
2 | Parallelizing Python Code | 2021-09-21 |
2 | Data Processing Support in Ray | 2021-03-08 |
2 | The Ideal Foundation for a General Purpose Serverless Platform | 2020-11-08 |
2 | Understanding the Ray Ecosystem and Community | 2020-04-25 |
1 | Building Highly Available and Scalable Online Applications on Ray at Ant Group | 2021-09-17 |
1 | Direct Preference Optimization with Synthetic Data on Anyscale | 2024-08-21 |
1 | Building an LLM Router for High-Quality and Cost-Effective Responses | 2024-07-02 |
1 | End-to-End LLM Workflows Guide | 2024-06-18 |
1 | Fine-tuning LLMs for longer context and better RAG systems | 2024-02-13 |
1 | RAG at Scale: 10x Cheaper Embedding Computations with Anyscale and Pinecone | 2024-01-16 |
1 | LLM summarization: A case study of human, Llama-2, & GPT-4 summarization quality | 2023-11-10 |
1 | Anyscale Endpoints: LLM inference and fine-tuning | 2023-10-25 |
1 | Ray solves common production challenges for generative AI infrastructure | 2023-03-28 |
1 | Training One Million Machine Learning Models in Record Time with Ray | 2022-12-18 |
1 | Gang Scheduling Ray Clusters on K8s with Multi-Cluster-App-Dispatcher (MCAD) | 2022-11-16 |
1 | Redis in Ray: Past and Future | 2022-03-18 |
1 | Ray 1.11 Released | 2022-03-11 |
1 | Introducing Anyscale: The Future Is Distributed | 2021-12-10 |
1 | Analyzing memory management and performance in Dask-on-Ray | 2021-09-29 |
1 | Ant Group's Resource Allocation System has scaled over 6000 cores | 2021-03-30 |