113 | A guide to open-source LLM inference and performance | 2023-11-20
112 | Show HN: Baseten – Build ML-powered applications | 2022-04-26
51 | How we got Stable Diffusion XL inference to under 2 seconds | 2023-08-31
9 | Show HN: Baseten Chains – Framework and SDK for Multi-Model AI Products | 2024-06-27
5 | Serving four million Riffusion requests in two days | 2022-12-21
3 | SDXL inference in under 2 seconds | 2023-08-31
3 | Deploying Stable Diffusion in Production Using Truss | 2022-09-01
2 | Open Source Inference Engine Baseten Raises $40M from IVP, Spark and Greylock | 2024-03-14
2 | Faster Mixtral inference with TensorRT-LLM and quantization | 2023-12-27
2 | How to double tokens per second for Llama 3 with Medusa | 2024-08-20
2 | Show HN: Automatically Build Nvidia TRT-LLM Engines | 2024-08-01
2 | FP8: Efficient model inference with 8-bit floating point numbers | 2024-03-08
1 | How to build function calling and JSON mode for open-source and fine-tuned LLMs | 2024-09-12
1 | Show HN: 60% higher tokens per second for 70B custom LLMs | 2024-07-31
1 | Introduction to quantizing machine learning models | 2024-02-16
1 | Three techniques to adapt LLMs for any use case | 2023-06-15
1 | Accelerating model deployment: 100X faster dev loops with draft models | 2022-12-09
1 | Hotdog or Not Hotdog (Cutedog) | 2021-12-03
1 | Introducing BaseTen — build machine-learning powered applications | 2021-05-23
402 | Show HN: ChatLLaMA – A ChatGPT style chatbot for Facebook's LLaMA | 2023-03-22
52 | DALL-E Mini – Generate images from a text prompt | 2022-06-10
25 | Show HN: Free Stable Diffusion 2.0 hosted interface | 2022-11-24
20 | BaseTen: The fastest way to build ML-powered applications | 2021-05-20
16 | Show HN: Fine-tune generative models in 1 line of code | 2023-03-01
7 | Hosted Stable Diffusion Demo | 2022-08-24
5 | Try it yourself: Speech to text with Whisper | 2022-10-01
5 | How BaseTen is using “docs as code” | 2022-03-09
2 | Code generation interactive demo (Salesforce Codegen mono 2B) | 2022-07-01
2 | Working at an early-stage company as an early-stage engineer | 2021-11-30
1 | Demo – Text generation with EleutherAI's GPT-J-6B model | 2022-04-29
1 | GFP-GAN – Photo Restoration App | 2021-12-31
1 | Transcribing large audio files with wav2vec | 2021-12-15
1 | Deploying custom ComfyUI workflows as APIs | 2024-11-20