Introducing function calling and structured output for open-source and fine-tuned LLMs

What's this blog post about?

The TensorRT-LLM Engine Builder now supports structured output during LLM inference. This covers JSON mode, where model output matches a given JSON schema, and function calling, where the LLM selects from provided tools to accomplish a task. Neither feature has a marginal impact on tokens per second, and both are available for all LLMs deployed using the Engine Builder. Together they address the difficulty of integrating LLMs with structured data: developers can call an LLM with a guaranteed output structure while adding negligible latency.
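
To make the two modes concrete, here is a minimal client-side sketch in Python of what each guarantee gives you. It makes no network calls and does not show the Engine Builder's actual API. The schema, the `get_weather` tool, the sample model outputs, and the helper names (`conforms`, `run_tool_call`) are all hypothetical. The tool definition follows the widely used OpenAI-style layout, which is an assumption here rather than something the summary confirms.

```python
import json

# Hypothetical JSON schema a caller might pass in JSON mode.
PERSON_SCHEMA = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}

# Hypothetical tool definition (OpenAI-style layout, assumed for illustration).
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Only the small subset of JSON Schema types used in this sketch.
_JSON_TYPES = {"string": str, "integer": int, "object": dict}


def conforms(value, schema):
    """Check a value against the tiny schema subset supported above."""
    expected = _JSON_TYPES[schema["type"]]
    # bool is a subclass of int in Python and is not a supported type here,
    # so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, expected):
        return False
    if schema["type"] != "object":
        return True
    if any(key not in value for key in schema.get("required", [])):
        return False
    return all(
        conforms(value[key], sub)
        for key, sub in schema.get("properties", {}).items()
        if key in value
    )


def get_weather(city):
    # Stand-in implementation; a real tool would query a weather service.
    return f"Sunny in {city}"


def run_tool_call(tool_call, tools, registry):
    """Validate a model-produced tool call and run the matching local function."""
    spec = next(
        t["function"] for t in tools if t["function"]["name"] == tool_call["name"]
    )
    args = json.loads(tool_call["arguments"])
    if not conforms(args, spec["parameters"]):
        raise ValueError("arguments do not match the tool's schema")
    return registry[spec["name"]](**args)


# Sample outputs standing in for what a model might return in each mode.
json_mode_output = '{"name": "Ada", "age": 36}'
tool_call_output = {"name": "get_weather", "arguments": '{"city": "Paris"}'}

person = json.loads(json_mode_output)
result = run_tool_call(
    tool_call_output, [GET_WEATHER_TOOL], {"get_weather": get_weather}
)
```

When the server guarantees the output structure, checks like `conforms` should always pass. The sketch keeps them only to show which contract the caller is relying on.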

Company
Baseten

Date published
Sept. 12, 2024

Author(s)
Bryce Dubayah, Philip Kiely

Word count
604

Language
English

Hacker News points
None found.