Company
Date Published
Sept. 12, 2024
Author
Bryce Dubayah, Philip Kiely
Word count
604
Language
English

Summary

The TensorRT-LLM Engine Builder now supports structured output during LLM inference. This comes in two forms: JSON mode, where model output is guaranteed to match a provided JSON schema, and function calling, where the LLM selects from a set of provided tools to accomplish a task. Both capabilities add negligible latency, with no marginal impact on tokens per second, and are available for every LLM deployed with the Engine Builder. They address a long-standing challenge in integrating LLMs with structured data: developers can call an LLM and rely on the structure of its output. A sketch of what such requests might look like follows below.
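
To make the two capabilities concrete, here is a minimal sketch of request payloads for JSON mode and function calling. It assumes an OpenAI-style chat completions endpoint with `response_format` and `tools` fields and uses a pydantic model to produce the JSON schema; the endpoint URL, API key header, and exact field names are placeholders for illustration, not the Engine Builder's documented API.

```python
# Sketch of structured-output requests against a deployed LLM endpoint.
# The URL, auth header, and payload field names are assumptions modeled on
# OpenAI-style APIs; consult your deployment's docs for the actual shape.
import requests
from pydantic import BaseModel


class Person(BaseModel):
    """Target structure for JSON mode: output must match this schema."""
    name: str
    age: int


# JSON mode: constrain generation so the output is valid JSON matching the schema.
json_mode_payload = {
    "messages": [{"role": "user", "content": "Extract: Ada Lovelace, age 36."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "person", "schema": Person.model_json_schema()},
    },
    "max_tokens": 256,
}

# Function calling: describe available tools and let the model choose one
# and emit structured arguments for it.
function_calling_payload = {
    "messages": [{"role": "user", "content": "What's the weather in Berlin?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "max_tokens": 256,
}

if __name__ == "__main__":
    # MODEL_URL and API_KEY are placeholders for your own deployment.
    MODEL_URL = "https://example.com/v1/chat/completions"
    API_KEY = "YOUR_API_KEY"
    resp = requests.post(
        MODEL_URL,
        headers={"Authorization": f"Api-Key {API_KEY}"},
        json=json_mode_payload,
        timeout=60,
    )
    print(resp.json())
```

Swapping `json_mode_payload` for `function_calling_payload` in the request would exercise tool selection instead; in both cases the structure of the response is enforced at generation time rather than validated after the fact.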