Faster Mixtral inference with TensorRT-LLM and quantization - Plushcap

Company:
Date Published:
Author: Pankaj Gupta, Timur Abishev, Philip Kiely
Word count: 1467
Language: English
Hacker News points: 2
URL: www.baseten.co/blog/faster-mixtral-inference-with-tensorrt-llm-and-quantization
