350 | PyTorch vs. TensorFlow in 2022 | 2021-12-14 |
288 | Building an end-to-end Speech Recognition model in PyTorch | 2020-04-17 |
273 | How to train large deep learning models as a startup | 2021-10-07 |
252 | How DALL-E 2 Works | 2022-04-19 |
159 | Differentiable Programming – A Simple Introduction | 2022-04-12 |
142 | How Imagen Works | 2022-06-23 |
129 | LeMUR: LLMs for Audio and Speech | 2023-07-27 |
111 | Build Your Own Imagen Text-to-Image Model | 2022-08-17 |
108 | The Full Story of Large Language Models and RLHF | 2023-05-03 |
98 | Introduction to Diffusion Models for Machine Learning | 2022-05-12 |
95 | How RLHF Preference Model Tuning Works (and How Things May Go Wrong) | 2023-08-09 |
67 | Fine-Tuning Transformers for NLP | 2021-06-21 |
66 | Using JAX in 2022 | 2022-02-15 |
53 | Image Generation with Electrostatics | 2022-11-30 |
49 | How physics advanced generative AI | 2023-04-19 |
45 | What is residual vector quantization? | 2023-09-04 |
34 | PyTorch Lightning for Dummies – A Tutorial and Overview | 2021-12-06 |
29 | Is Word Error Rate a Good Metric for Speech Recognition Models? | 2021-09-10 |
16 | A Beginner's Guide to TorchStudio, the PyTorch IDE | 2022-03-28 |
11 | Universal-1: Robust and accurate multilingual speech-to-text | 2024-04-03 |
11 | MediaPipe for Dummies | 2022-05-02 |
8 | How to run Stable Diffusion locally | 2022-08-24 |
8 | Decoding Strategies – How LLMs Choose the Next Word | 2024-08-21 |
8 | An Overview of Transducer Models for ASR | 2021-11-08 |
7 | Recent Developments in Generative AI for Audio | 2023-06-27 |
7 | Emergent Abilities of Large Language Models | 2023-03-07 |
7 | Variational Autoencoders for Dummies | 2022-01-03 |
6 | What AI Music Generators Can Do and How They Do It | 2023-09-24 |
6 | Kaldi Speech Recognition – A Simple Tutorial | 2022-01-20 |
6 | A Review of End-to-End Architectures for Speech Recognition | 2021-01-27 |
5 | AlphaTensor explained – Motivation, method, and assessment | 2022-11-22 |
5 | Text Segmentation – Approaches, Datasets, and Evaluation Metrics | 2021-11-16 |
5 | Transcribe audio to text with Cloudflare Workers and AssemblyAI | 2023-08-03 |
5 | Conformer-2 AI model for speech recognition | 2023-07-20 |
5 | How to evaluate Speech Recognition AI models | 2023-06-16 |
5 | Can Podcasts Predict the Stock Market? | 2021-09-02 |
4 | How ChatGPT actually works | 2023-04-27 |
4 | Automatically determine video sections with AI using Python | 2023-11-07 |
4 | Stable Diffusion faster in Keras thanks to XLA | 2022-11-30 |
4 | DeepSpeech for Beginners – A Tutorial and Overview | 2021-10-14 |
4 | How to Build a Burner Phone in Python | 2021-08-03 |
3 | Complete guide to modern generative AI image models | 2023-05-10 |
3 | Introduction to Generative AI | 2023-05-02 |
3 | AI trends in 2023: Graph Neural Networks | 2023-03-29 |
3 | Conformer-1: a robust speech recognition model | 2023-03-15 |
3 | Why You Should (or Shouldn't) be Using Google's JAX in 2023 | 2023-02-11 |
3 | Stable Diffusion 1 vs. 2 – What you need to know | 2022-12-06 |
3 | OpenAI Whisper Benchmarks | 2022-09-27 |
3 | MinImagen – Build Your Own Imagen Text-to-Image Model | 2022-09-08 |
3 | Variational Autoencoders Simply Explained | 2022-04-06 |
2 | AssemblyAI (YC S17) raises $50M Series C to build superhuman Speech AI models | 2023-12-05 |
2 | RLHF vs. RLAIF for language model alignment | 2023-08-22 |
2 | Reinforcement Learning from AI Feedback | 2023-08-01 |
2 | Five Things to know about Large Language Models | 2023-05-23 |
2 | Introduction to LLMs for Generative AI | 2023-05-17 |
2 | Unsupervised Machine Learning for Beginners | 2023-02-19 |
2 | How to Build an Audio Intelligence Dashboard | 2022-09-21 |
2 | Sentiment Analysis in Action – Earnings Calls [video] | 2021-12-14 |
2 | The Definitive Guide to Python Click | 2021-07-29 |
2 | Download and transcribe YouTube videos in Python | 2021-07-13 |
1 | How Microsoft's New Large Vision Model "Florence-2" Works | 2024-07-15 |
1 | AssemblyAI announces lower latency, lower cost, more possibilities | 2024-01-11 |
1 | AI for Universal Audio Understanding: Qwen-Audio Explained | 2023-12-14 |
1 | Retrieval Augmented Generation (Rag) on Audio Data with LangChain | 2023-09-26 |
1 | Free Speech-to-Text APIs, AI Models, and Open Source Engines | 2022-12-11 |
1 | Getting Started with Hugging Face's Gradio | 2022-10-05 |
1 | JavaScript Text-to-Speech – The Easy Way | 2022-04-05 |
107 | AssemblyAI: speech-to-text API | 2018-08-13 |
3 | We Built a Scalable AI Lakehouse at AssemblyAI | 2024-11-22 |
3 | Golden Gemini – A new approach in Speech AI | 2025-02-04 |