Faster Mixtral inference with TensorRT-LLM and quantization

What's this blog post about?

Company
Baseten

Date published
Dec. 22, 2023

Author(s)
Pankaj Gupta, Timur Abishev, Philip Kiely

Word count
1467

Hacker News points
2

Language
English


By Matt Makai. 2021-2024.