The LangSmith benchmark evaluates three approaches to retrieval augmented generation (RAG) over semi-structured data, i.e. documents that mix unstructured text with structured tables. Approach 1 passes documents, tables included, directly into a long-context LLM's context window. Approach 2 performs targeted table extraction with tools such as Unstructured or Docugami, so that tables can be indexed and retrieved as distinct units. Approach 3 splits documents into chunks at a specified token limit, with performance improving as chunk size increases. Finally, an ensemble retriever combines the rankings produced by the different retrievers, prioritizing table-derived text chunks and improving overall performance.
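The rank-combination step can be illustrated with weighted Reciprocal Rank Fusion (RRF), the scheme LangChain's `EnsembleRetriever` uses to merge ranked lists. The sketch below is a minimal, dependency-free illustration; the document IDs, weights, and rankings are hypothetical, not taken from the benchmark.

```python
def rrf_merge(rankings, weights=None, k=60):
    """Merge several ranked lists of doc IDs into one fused ranking
    using weighted Reciprocal Rank Fusion (k=60 is a common default)."""
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = {}
    for ranking, w in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking):
            # Earlier positions (lower ranks) contribute larger scores.
            scores[doc_id] = scores.get(doc_id, 0.0) + w / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical example: a table-focused retriever ranks "table_chunk"
# first, while a plain vector retriever ranks it second. Weighting the
# table retriever more heavily pushes table-derived chunks to the top.
vector_ranking = ["text_chunk_a", "table_chunk", "text_chunk_b"]
table_ranking = ["table_chunk", "text_chunk_b"]
fused = rrf_merge([vector_ranking, table_ranking], weights=[0.4, 0.6])
print(fused[0])  # the table-derived chunk wins the fused ranking
```

In practice the same effect is achieved by assigning a higher weight to the retriever built over extracted table summaries, so that table content is surfaced even when a plain similarity search would rank it lower.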