Faster Mixtral inference with TensorRT-LLM and quantization
What's this blog post about?
Company: Baseten
Date published: Dec. 22, 2023
Author(s): Pankaj Gupta, Timur Abishev, Philip Kiely
Word count: 1467
Language: English
Hacker News points: 2