
Introducing LeMUR, our new framework for applying powerful LLMs to transcribed speech

What's this blog post about?

LeMUR is a new framework that enables the efficient application of Large Language Models (LLMs) to transcribed speech, overcoming the challenges posed by long audio files. With just one line of code using the Python SDK, it can process up to 10 hours' worth of audio content, or around 150k tokens. This makes LeMUR significantly more effective than off-the-shelf LLMs, which are typically limited to about 8k tokens, or roughly 45 minutes of audio.

The framework achieves this by wrapping a pipeline that includes intelligent segmentation, a fast vector database, and reasoning steps such as chain-of-thought prompting and self-evaluation. This architecture lets users send long transcripts, or multiple transcripts at once, to an LLM with a single API call.

LeMUR also aims to provide reliable and safe outputs: it includes safety measures and content filters that help prevent the generation of harmful or biased language, and it lets users supply additional context at inference time for more personalized and accurate results. Integration is modular and fast, with results consistently returned as structured JSON, so developers can customize output formats without building custom code to handle raw LLM outputs. The framework is kept continuously state-of-the-art by regularly incorporating the latest AI models and techniques. It is designed for use cases such as question answering, custom summaries, and AI coaching, and it is currently available on a rate-limited Early Access basis, with interested users able to join the waitlist.
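
Since the post emphasizes that a transcript can be handed to an LLM with a single call through the Python SDK, below is a minimal sketch of what that could look like. It assumes the current assemblyai package and a placeholder audio URL; the Transcriber, transcribe, and lemur.task names reflect today's SDK and may differ from the Early Access interface described in the post.

    # Minimal sketch using the AssemblyAI Python SDK; assumes the current
    # assemblyai package, which may differ from the Early Access interface.
    import assemblyai as aai

    aai.settings.api_key = "YOUR_API_KEY"  # placeholder key

    # Transcribe an audio file (URL or local path).
    transcriber = aai.Transcriber()
    transcript = transcriber.transcribe("https://example.com/meeting.mp3")  # hypothetical URL

    # Apply an LLM to the full transcript with a single LeMUR call.
    result = transcript.lemur.task(
        "Summarize the key decisions from this meeting as a bulleted list."
    )
    print(result.response)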

Company
AssemblyAI

Date published
May 9, 2023

Author(s)
-

Word count
963

Language
English

Hacker News points
None found.
