Anyscale

Founded in 2019. Privately held.

External links: homepage | docs | blog | jobs | youtube | twitter | github | linkedin

Large language model (LLM) infrastructure.

115 total blog posts published.

Blog content

Post title | Author | Published | Words | HN
Fine-Tuning Llama-2: A Comprehensive Case Study for Tailoring Models to Unique Applications | Kourosh Hakhamaneshi, Rehaan Ahmad | Aug. 11, 2023 | 5637 | 308
Machine Learning for Developers | Goku Mohandas | Jul. 26, 2023 | 688 | -
Processing 2 Billion Images for Stable Diffusion Model Training - Definitive Guides with Ray Series | Max Pumperla, Marwan Sarieddine | May. 14, 2024 | 4209 | -
Ray Summit 2022 stories - ML Platforms | Anyscale Ray Team | Mar. 03, 2023 | 628 | -
LiveEO supercharges their ML infrastructure and accelerates their geospatial workloads practice | Toby Rahloff, Phi Nguyen, Alex Streed | Apr. 14, 2023 | 658 | -
Anyscale Endpoints: Embedding endpoint, Llama-2 70B fine-tuning and improved sign-up experience | Anyscale team | Nov. 30, 2023 | 376 | -
Fine-Tuning LLMs: LoRA or Full-Parameter? An in-depth Analysis with Llama 2 | Artur Niederfahrenhorst, Kourosh Hakhamaneshi, Rehaan Ahmad | Sep. 06, 2023 | 3597 | -
Announcing Ray 2.3: performance improvements, new features and new platforms | Richard Liaw, Cade Daniel, Jules S. Damji, Zhe Zhang | Feb. 24, 2023 | 1329 | -
Building a Self Hosted Question Answering Service using LangChain + Ray in 20 minutes | Waleed Kadous | May. 08, 2023 | 1693 | -
How Spotify Built a Robust Ray Platform with a Frictionless Developer Experience | Anyscale Ray Team | Nov. 09, 2023 | 1259 | -
How continuous batching enables 23x throughput in LLM inference while reducing p50 latency | Cade Daniel, Chen Shen, Eric Liang, Richard Liaw | Jun. 22, 2023 | 3568 | -
Building an LLM open source search engine in 100 lines using LangChain and Ray | Waleed Kadous | Apr. 18, 2023 | 1780 | -
10 must-attend Ray Summit sessions: Generative AI, scalable ML workloads, and more | Jules S. Damji, Ben Lorica | May. 10, 2023 | 1078 | -
Improve Utilization and Simplify Cluster Management with Anyscale Job Queues | Dominic Catalano, Alexey Kudinkin | Jul. 23, 2024 | 735 | -
Ray Summit 2024 Call for Proposals is now open | Anyscale team | Apr. 19, 2024 | 264 | -
Anyscale and Meta Collaborate to Advance the Llama-2 Ecosystem | Robert Nishihara, Joe Spisak | Sep. 07, 2023 | 325 | -
Open Source LLMs: Viable for Production or a Low-Quality Toy? | Anyscale Ray Team | Nov. 20, 2023 | 855 | -
How ByteDance Scales Offline Inference with multi-modal LLMs to 200 TB Data | Amog Kamsetty, Hao Chen, Liguang Xie | Aug. 14, 2023 | 1872 | -
Ray 2.5 features training and serving for LLMs, Multi-GPU training in RLlib, and enhanced Ray Data support | Richard Liaw, Jules S. Damji | Jun. 13, 2023 | 1681 | -
Llama, Scaling Up LLMs in an Open Ecosystem | Anyscale Ray Team | Oct. 16, 2023 | 1246 | -
Build and Scale a Powerful Query Engine with LlamaIndex and Ray | Jerry Liu, Amog Kamsetty | Jun. 26, 2023 | 2524 | -
Deploy Ray Serve with up to 50% fewer nodes using Anyscale Replica Compaction | Matt Connor, Akshay Malik, Cindy Zhang | Jul. 15, 2024 | 883 | -
Training 175B Parameter Language Models at 1000 GPU scale with Alpa and Ray | Jiao Dong, Hao Zhang, Lianmin Zheng, Jun Gong, Jules S. Damji, Phi Nguyen | Mar. 22, 2023 | 2713 | -
Heterogeneous Training Cluster with Ray at Netflix | Anyscale Ray Team | Oct. 20, 2023 | 902 | -
Advances in Foundation Models — Technology, Society, and Applications | Anyscale Ray Team | Nov. 03, 2023 | 1460 | -
Ray 2.6 features streaming for Serve and Train and new Multi-GPU Learner API | Jules S. Damji, Richard Liaw | Jul. 25, 2023 | 1426 | -
Comparing LLM performance: Introducing the Open Source Leaderboard for LLM APIs | Anyscale team | Dec. 21, 2023 | 1202 | -
Ray Serve: Tackling the cost and complexity of serving AI in production | Akshay Malik, Edward Oakes, Phi Nguyen | Sep. 25, 2023 | 2392 | -
We Pre-Trained Stable Diffusion Models on 2 billion Images and Didn't Break the Bank - Definitive Guides with Ray Series | Max Pumperla, Marwan Sarieddine | May. 21, 2024 | 4553 | -
Now Available: The LLM Router Template | Amjad Almahairi | Jul. 19, 2024 | 256 | -
Simplify your ML Development Cycle with Anyscale and Weights & Biases | Phi Nguyen | Jan. 31, 2023 | 715 | -
Why I Joined Anyscale: Solving Cutting-Edge Problems in a Time of Enormous Change | Sidney Rabsatt | Apr. 19, 2023 | 260 | -
Announcing Ray 2.4.0: Infrastructure for LLM training, tuning, inference, and serving | Richard Liaw, Jules S. Damji, Jiajun Yao | Apr. 27, 2023 | 1692 | -
Anyscale Endpoints Preview: Fast, Cost-Efficient, and Scalable LLM APIs | Ameer Haj Ali, Robin Singh | Aug. 03, 2023 | 363 | -
Fine-tuning LLMs for longer context and better RAG systems | Artur Niederfahrenhorst, Kourosh Hakhamaneshi | Feb. 13, 2024 | 2847 | -
Building Production AI Applications with Ray Serve | Anyscale Ray Team | Oct. 24, 2023 | 1213 | -
How ThirdAI uses Ray for Parallel Training of Billion-Parameter Neural Networks on Commodity CPUs | Vihan Lakshman, Pratik Pranav, Siddharth Jain, Tharun Medini | Aug. 29, 2023 | 1643 | -
Ray 2.7 features major stability improvements to Ray AI Libraries and KubeRay and introduces RayLLM | Jules S. Damji, Richard Liaw | Sep. 18, 2023 | 1798 | -
Optimizing LLM Training with Airbnb's Next-Gen ML Platform | Anyscale Ray Team | Oct. 30, 2023 | 1048 | -
Accelerating AI: Harnessing Intel(R) Gaudi(R) 3 with Ray 2.10 | Ramit Hora | Apr. 09, 2024 | 596 | -
Introducing Anyscale’s Unified Log Viewer | Alan Guo, Gene Su | Jul. 18, 2024 | 405 | -
Cross-modal Search for E-commerce: Building and Scaling a Cross-Modal Image Retrieval App | Marwan Sarieddine, Natalia Czerep, Mateusz Kwasniak, Artur Zygadło | Jun. 04, 2024 | 3253 | -
Ray breaks the $1/TB barrier as the world’s most cost-efficient sorting system | Frank Sifei Luan, UC Berkeley | Jan. 25, 2023 | 1257 | -
Reinventing Multi-Modal Search with Anyscale and MongoDB | Marwan Sarieddine, Kamil Kaczmarek | Jul. 25, 2024 | 5145 | -
Practical Data Considerations for Building Production-Ready LLM Applications | Anyscale Ray Team | Oct. 19, 2023 | 1116 | -
Llama 2 is about as factually accurate as GPT-4 for summaries and is 30X cheaper | Waleed Kadous | Aug. 23, 2023 | 2933 | -
New on Anyscale: Debug and Optimize Ray Applications Faster with Structured Logging | Jiajun Yao, Kai-Hsun Chen | Jul. 16, 2024 | 449 | -
Easily Debug Ray Applications with Ray Distributed Debugger | Anyscale team | May. 15, 2024 | 624 | -
Inference Graphs at LinkedIn Using Ray-Serve | Anyscale Ray Team | Nov. 09, 2023 | 1267 | -
End-to-end LLM Workflows Guide | Goku Mohandas | Jun. 17, 2024 | 4910 | -
Building Context-Aware Reasoning Applications with LangChain and LangSmith | Anyscale Ray Team | Oct. 18, 2023 | 1214 | -
Ray 2.2: Improved developer experience, performance and stability | Richard Liaw | Jan. 23, 2023 | 789 | -
Building an LLM-powered GitHub bot to improve your pull requests | Max Pumperla | Nov. 15, 2023 | 3491 | -
Introducing RLlib Multi-GPU Stack for Cost Efficient, Scalable, Multi-GPU RL Agents Training | Avnish Narayan, Kourosh Hakhamaneshi | Jun. 26, 2023 | 1058 | -
Building an LLM Router for High-Quality and Cost-Effective Responses | Amjad Almahairi | Jul. 01, 2024 | 4430 | -
Don’t Miss: Hands-On Ray Training at Ray Summit 2024 | Kamil Kaczmarek | Aug. 13, 2024 | 788 | -
Low-latency Generative AI Model Serving with Ray, NVIDIA Triton Inference Server, and NVIDIA TensorRT-LLM | Neelay Shah, Akshay Malik | Mar. 13, 2024 | 642 | -
Many Models Batch Training at Scale with Ray Core | Jules S. Damji, Antoni Baum | Jan. 19, 2023 | 2178 | -
Fine tuning is for form, not facts | Waleed Kadous, Kourosh Hakhamaneshi | Jul. 05, 2023 | 1631 | -
Introducing the Anyscale Snowflake Connector | Eric Greene | Jul. 20, 2023 | 745 | -
Reducing the Cost of Pre-training Stable Diffusion by 3.7x with Anyscale | Yunxuan Xiao, Hao Chen | May. 09, 2024 | 2176 | -
How Ray solves common production challenges for generative AI infrastructure | Antoni Baum, Eric Liang, Jun Gong, Kai Fricke, Richard Liaw | Mar. 20, 2023 | 1494 | -
Streaming distributed execution across CPUs and GPUs | Eric Liang, Stephanie Wang, Cheng Su | May. 11, 2023 | 2067 | -
Ray Summit Series - Scaling Parallel Python Jobs | Anyscale Ray Team | Mar. 16, 2023 | 599 | -
Forecasting at Scale | Phi Nguyen, Max Mergenthaler | Feb. 02, 2023 | 683 | -
Introducing the Anyscale Databricks Connector | Eric Greene | Jun. 15, 2023 | 632 | -
Ray Summit 2023 Call for Proposals is now open | Jules S. Damji | Jan. 12, 2023 | 777 | -
Fast, flexible, and scalable data loading for ML training with Ray Data | Stephanie Wang, Scott Lee, Cheng Su, Hao Chen, Eric Liang | Sep. 15, 2023 | 3238 | -
Anyscale and Lambda - Addressing AI Scarcity with Engineering | Anyscale team | Nov. 21, 2023 | 585 | -
RAG at Scale: 10x Cheaper Embedding Computations with Anyscale and Pinecone | Scott Lee, Kyle Huang, Cheng Su, Hao Chen | Jan. 16, 2024 | 995 | -
Ray 2.8 features Ray Data extensions, AWS Neuron cores support, and Dashboard improvements | Jules S. Damji, Richard Liaw | Nov. 07, 2023 | 791 | -
Update on Ray CVE-2023-48022: New Verification Tooling Available | Anyscale team | Mar. 27, 2024 | 606 | -
Update on Ray CVEs CVE-2023-6019, CVE-2023-6020, CVE-2023-6021, CVE-2023-48022, CVE-2023-48023 | Anyscale team | Nov. 30, 2023 | 508 | -
Ray Spotlight Series: Multitenant Serve Applications with Runtime Envs as Containers | Sam Chan, Cindy Zhang | Jun. 13, 2024 | 800 | -
How to fine tune and serve LLMs simply, quickly and cost effectively using Ray + DeepSpeed + HuggingFace | Waleed Kadous, Jun Gong, Antoni Baum, Richard Liaw | Apr. 10, 2023 | 2055 | -
Turbocharge LangChain: guide to 20x faster embedding | Amog Kamsetty, Philipp Moritz | May. 03, 2023 | 1934 | -
Direct Preference Optimization with Synthetic Data on Anyscale | Franklin Wang, Sumanth Hegde, Kourosh Hakhamaneshi | Aug. 21, 2024 | 9249 | -
Anyscale Endpoints: JSON Mode, Function calling, New models: Llama Guard and Mistral-7B-OpenOrca | Endpoints Team | Dec. 12, 2023 | 186 | -
Loading Llama-2 70b 20x faster with Anyscale Endpoints | Yi Cheng, Cade Daniel, Chen Shen, Liguang Xie | Oct. 11, 2023 | 1961 | -
Portkey ♥️ Anyscale Endpoints | Endpoints Team | Dec. 12, 2023 | 564 | -
Scaling Model Batch Inference in Ray: Using Actors, ActorPool, and Ray Data | Eric Liang, Jules S. Damji, Zhe Zhang | May. 16, 2023 | 1856 | -
Numbers every LLM Developer should know | Waleed Kadous | May. 17, 2023 | 1423 | -
Automatic and optimistic memory scheduling for ML workloads in Ray | Clarence Ng, Jules S. Damji | Mar. 02, 2023 | 2423 | -
Ray Summit 2022 Stories - Large Language Models | Anyscale Ray Team | Feb. 16, 2023 | 680 | -
LLM-based summarization: A case study of human, Llama 2 70b and GPT-4 summarization quality | Justin Olsson, Waleed Kadous | Nov. 09, 2023 | 1195 | -
Welcome Keerti | Robert Nishihara | Jul. 31, 2024 | 743 | -
Offline Batch Inference: Comparing Ray, Apache Spark, and SageMaker | Amog Kamsetty, Eric Liang, Jules S. Damji | May. 04, 2023 | 2042 | -
Introducing Elastic Distributed Training on Anyscale | Matthew Deng, Justin Yu | Jul. 22, 2024 | 478 | -
Why I Joined Anyscale: Powering an Open Source AI Revolution | Lance Walter | Apr. 28, 2023 | 799 | -
Anyscale Endpoints: JSON Mode and Function calling Features | Endpoints Team | Dec. 12, 2023 | 2050 | -
Announcing Anyscale Private Endpoints and Anyscale Endpoints Fine-tuning | Matt Connor, Robin Singh | Oct. 24, 2023 | 467 | -
Cloud Infrastructure for LLM and Generative AI Applications | Yifei Feng, Sriram Sankar, Siddharth Venkatesh, Ameer Haj Ali | Sep. 14, 2023 | 1868 | -
Building RAG-based LLM Applications for Production | Goku Mohandas, Philipp Moritz | Oct. 25, 2023 | 10794 | -
Faster stable diffusion fine-tuning with Ray AIR | Kai Fricke | Mar. 28, 2023 | 1627 | -
Announcing Aviary: Open Source Multi-LLM Serving | Waleed Kadous | May. 31, 2023 | 743 | -
Reproducible Performance Metrics for LLM inference | Waleed Kadous, Kyle Huang, Wendi Ding, Liguang Xie, Avnish Narayan, Ricky Xu | Nov. 01, 2023 | 2495 | -
Ray Spotlight: How we delivered Ray weekly releases | Sam Chan | Jun. 25, 2024 | 629 | -
Inspecting Sewer Line Safety Using Thousands of Hours of Video | Lance Walter | May. 22, 2023 | 814 | -
Blue River Technology Developers Iterate 2.5X Faster with the Anyscale Fully-Managed Ray Platform | Uday Kanwar, Deb Daipayan | Feb. 27, 2023 | 608 | -
Scaling Embedding Generation Pipelines From Pandas to Ray Data | Marwan Sarieddine | Sep. 04, 2024 | 2154 | -
Fine-tuning Llama-3, Mistral and Mixtral with Anyscale | Marwan Sarieddine, Kamil Kaczmarek | Sep. 11, 2024 | 2256 | -
Building a RAG Batch Inference Pipeline with Anyscale and Union | Kevin Su, Kai-Hsun Chen | Sep. 12, 2024 | 1665 | -
Roblox Guest Blog: Fast and Efficient Online Model Serving | Younes Abouelnagah | Sep. 19, 2024 | 2925 | -
Accelerated Metadata Fetching in Ray Data up to 4.5x Faster on Anyscale | Balaji Veeramani, Hao Chen, Richard Liaw, Matthew Connor, Praveen Gorthy | Oct. 01, 2024 | 607 | -
Anyscale on Kubernetes: Simplifying AI Workloads on User-Managed Infrastructure | Dominic Catalano, Yifei Feng | Oct. 01, 2024 | 792 | -
Anyscale Now Available on AWS Marketplace and Achieves Generative AI Competency | The Anyscale Team | Oct. 01, 2024 | 510 | -
Batch LLM Inference on Anyscale slashes AWS Bedrock costs by up to 6x | Cody Yu, Scott Lee, Ricky Xu, William Lin, Praveen Gorthy, Richard Liaw | Oct. 01, 2024 | 1180 | -
Ray Data GA | Hao Chen, Richard Liaw, Praveen Gorthy | Oct. 01, 2024 | 1037 | -
Anyscale’s New User Experience: A Comprehensive Overview | The Anyscale Team | Oct. 01, 2024 | 1161 | -
Anyscale Now on GCP Marketplace | The Anyscale Team | Oct. 01, 2024 | 381 | -
Autoscaling Large AI Models up to 5.1x Faster on Anyscale | Christopher Chou, Austin Kuo, Richard Liaw, Edward Oakes, Chris Sivanich | Oct. 01, 2024 | 1260 | -
Enterprise Governance and Observability on Anyscale | The Anyscale Team | Oct. 01, 2024 | 479 | -
Announcing RayTurbo | Akshay Malik, Praveen Gorthy, Richard Liaw | Oct. 01, 2024 | 1453 | -
Ray Summit 2024: Breaking Through the AI Complexity Wall | The Anyscale Team | Oct. 03, 2024 | 1600 | -
Ray Compiled Graphs: Optimized AI Workloads with Native GPU Communication | Sang Cho, Sam Chan, Stephanie Wang | Oct. 07, 2024 | 1910 | -

By Matt Makai. 2021-2024.