Transcribe Audio Files in an S3 Bucket with AssemblyAI

Company

AssemblyAI

Date Published

March 15, 2022

Author

Ryan O'Connor

Word count

982

Language

English

Hacker News points

None

URL

www.assemblyai.com/blog/transcribing-audio-files-in-an-s3-bucket-with-assemblyai

Summary

The article explains how to transcribe an audio file stored in an AWS S3 bucket using AssemblyAI's API. The approach is to generate a presigned URL for the audio file, which grants temporary access to it, and to pass that URL to AssemblyAI's API in a POST request. Once the transcription is complete, it can be fetched with a GET request. The article also covers setting up an AWS IAM user with Programmatic access and the AmazonS3ReadOnlyAccess permission, then cloning AssemblyAI's GitHub repo and running transcribe_from_s3.py to print the transcription of the S3 audio file to the console. It closes by walking through the underlying code so readers can see what happens under the hood.
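
The flow described in this summary (presign, POST, then poll with GET) can be sketched roughly as follows. This is a minimal illustration and not the contents of transcribe_from_s3.py: the function names, the polling interval, and the injectable call_api and sleep parameters are assumptions made for this example. It assumes boto3 is installed, that AWS credentials for the IAM user are configured locally, and that the AssemblyAI v2 transcript endpoint accepts an "audio_url" field and an "authorization" header.

```python
import json
import time
import urllib.request

# AssemblyAI v2 transcript endpoint (assumed here; check the current API docs).
ASSEMBLYAI_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def presign_s3_url(bucket, key, expires_in=3600):
    """Return a temporary, read-only URL for an S3 object."""
    import boto3  # third-party; imported lazily so the rest works without it

    s3 = boto3.client("s3")
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )


def _call_api(method, url, api_key, payload=None):
    """Send a JSON request to AssemblyAI and return the decoded JSON response."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"authorization": api_key, "content-type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def transcribe(audio_url, api_key, call_api=_call_api, poll_seconds=5, sleep=time.sleep):
    """Submit audio_url for transcription and block until the text is ready."""
    # POST: queue the transcription job.
    job = call_api("POST", ASSEMBLYAI_ENDPOINT, api_key, {"audio_url": audio_url})
    status_url = f"{ASSEMBLYAI_ENDPOINT}/{job['id']}"

    # GET: poll until the job either completes or fails.
    while True:
        result = call_api("GET", status_url, api_key)
        if result["status"] == "completed":
            return result["text"]
        if result["status"] == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        sleep(poll_seconds)
```

With a hypothetical bucket name, object key, and API key, usage would look like `print(transcribe(presign_s3_url("my-bucket", "audio.mp3"), "YOUR_API_KEY"))`. The presigned URL only needs to stay valid long enough for AssemblyAI to download the file, so a short expiry is usually sufficient.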