By LangChain
July 24, 2024

Summary

The text discusses experiments on two datasets aimed at improving LLM tool-calling performance with few-shot prompting. Few-shot prompting means including example model inputs and desired outputs in the prompt, which can significantly boost performance across a range of tasks. Experiments were run on the Query Analysis and Multiverse Math datasets with several OpenAI and Anthropic models. Few-shotting with semantically similar examples, inserted as messages, generally outperformed both static examples and including all examples at once. Claude models benefited more from few-shotting than GPT models, and smaller models given few-shot examples could rival the zero-shot performance of larger models. Future work includes exploring negative few-shot examples, methods for semantic-search retrieval, the optimal number of few-shot examples, and comparing trajectories in agentic workloads.
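The best-performing setup described above, retrieving semantically similar examples and inserting them into the prompt as message pairs, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy embeddings, example pool, and function names are invented for the sketch, and a real system would use a learned embedding model and a vector store.

```python
from math import sqrt

# Hypothetical example pool: each entry pairs a toy embedding with an
# example user input and the desired tool-calling output.
EXAMPLE_POOL = [
    {"embedding": [1.0, 0.0], "input": "add 3 and 4", "output": "add(3, 4) -> 7"},
    {"embedding": [0.9, 0.1], "input": "sum 10 and 2", "output": "add(10, 2) -> 12"},
    {"embedding": [0.0, 1.0], "input": "multiply 5 by 6", "output": "multiply(5, 6) -> 30"},
]

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b))
    return dot / norm

def select_examples(query_embedding, pool, k=2):
    """Return the k examples most semantically similar to the query."""
    ranked = sorted(pool, key=lambda ex: cosine(query_embedding, ex["embedding"]),
                    reverse=True)
    return ranked[:k]

def build_messages(query, query_embedding, pool, k=2):
    """Assemble a chat prompt: retrieved few-shot examples as
    user/assistant message pairs, followed by the actual query."""
    messages = [{"role": "system",
                 "content": "You are a calculator that calls math tools."}]
    for ex in select_examples(query_embedding, pool, k):
        messages.append({"role": "user", "content": ex["input"]})
        messages.append({"role": "assistant", "content": ex["output"]})
    messages.append({"role": "user", "content": query})
    return messages

# A query about addition retrieves the two addition examples, not the
# multiplication one, before appending the query itself.
msgs = build_messages("add 7 and 8", [0.95, 0.05], EXAMPLE_POOL, k=2)
```

The same message-list shape is what chat-completion APIs accept, so the assembled prompt can be passed to the model directly.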