Company
Date Published
Author
Ken Ruf
Word count
1640
Language
English
Hacker News points
None

Summary

A speech-to-image app was built using Replicate's API and Svix webhooks, letting users speak an image into existence. The app uses Next.js for both the frontend and the backend, and Bytescale for file storage. It captures audio from the user's microphone, uploads the recording to Bytescale, and sends the uploaded file's URL to a speech-to-text model on Replicate's API. Replicate delivers the transcription to a webhook endpoint in the app. After verifying the webhook signature, the app extracts the transcribed text and uses it as the prompt for a text-to-image model, which generates the image. The two steps are chained through webhooks, so once the user has spoken, no further manual step is needed.
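
The signature check mentioned above can be sketched as follows. This is a minimal illustration, not the article's code: it assumes the manual verification scheme that Svix documents (HMAC-SHA256 over `id.timestamp.body`, keyed with the base64-decoded part of a `whsec_` secret, compared against the `v1,` entries in the signature header). The function names `signPayload` and `verifyWebhook`, the five-minute tolerance, and the way header values are passed in are all assumptions made for the example; in a real app the official `svix` package would normally do this work.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: computes the base64 signature for one webhook message.
// Assumes the secret has the form "whsec_<base64 key>".
export function signPayload(
  secret: string,
  id: string,
  timestamp: string,
  body: string
): string {
  const key = Buffer.from(secret.replace(/^whsec_/, ""), "base64");
  return createHmac("sha256", key)
    .update(`${id}.${timestamp}.${body}`)
    .digest("base64");
}

// Hypothetical helper: returns true only if the timestamp is recent and
// at least one "v1,<signature>" entry in the header matches the raw body.
export function verifyWebhook(
  secret: string,
  id: string,
  timestamp: string,
  signatureHeader: string,
  body: string,
  toleranceSeconds: number = 300, // assumed replay window
  nowSeconds: number = Math.floor(Date.now() / 1000)
): boolean {
  const ts = Number(timestamp);
  if (!Number.isFinite(ts) || Math.abs(nowSeconds - ts) > toleranceSeconds) {
    return false; // reject stale or malformed timestamps
  }

  const expected = Buffer.from(signPayload(secret, id, timestamp, body), "base64");

  // The header may carry several space-separated signatures.
  return signatureHeader.split(" ").some((entry) => {
    const [version, signature] = entry.split(",");
    if (version !== "v1" || !signature) return false;
    const candidate = Buffer.from(signature, "base64");
    // timingSafeEqual throws on unequal lengths, so check length first.
    return candidate.length === expected.length && timingSafeEqual(candidate, expected);
  });
}
```

In a Next.js route handler, the raw request body (for example from `await request.text()`) has to be passed in unchanged, because re-serializing parsed JSON can alter the bytes and break the signature. Only after `verifyWebhook` returns `true` would the handler parse the body and read the transcription.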