Content Deep Dive
Title: 33% faster LLM inference with FP8 quantization
Company: Baseten
Date published: March 14, 2024
Authors: Pankaj Gupta, Philip Kiely
Word count: 1,876
Language: English
Hacker News points: None
URL: www.baseten.co/blog/33-faster-llm-inference-with-fp8-quantization