
Real-time Call Transcription Using IBM Watson and Python

What's this blog post about?

Nexmo's WebSocket feature streams audio from phone calls in real time, which enables applications such as two-way conversations with AI bots, sentiment analysis, and keyword tracking. These applications first need speech recognition to convert the audio into text, a step that AI platforms like IBM Watson can perform in real time. Because Nexmo and Watson expose different interfaces, a relay server is needed to connect them, and it communicates with Watson through Watson's WebSocket interface. Over that connection, Watson returns transcription messages with confidence scores, which can be used to refine or correct the output. The relay also handles incoming messages from Nexmo, parses the audio parameters, and sends Watson the request that starts transcription. When the call ends, Nexmo closes its WebSocket connection, which triggers the relay to stop the transcription stream in Watson.
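The message handling described above can be sketched in Python without any network code. This is a minimal illustration, not the blog post's implementation: the function names are hypothetical, and the message shapes are assumptions (a first text frame from the call carrying a "content-type" field, Watson "start" and "stop" action messages, and Watson result messages holding a list of alternatives with a transcript and a confidence score). Check both services' documentation before relying on these fields.

```python
import json

# Assumed default when the call's first frame does not state an audio format.
DEFAULT_CONTENT_TYPE = "audio/l16;rate=16000"


def build_start_message(first_call_frame: str) -> str:
    """Build the request that asks Watson to begin transcribing.

    `first_call_frame` is the first text frame received from the call,
    assumed to be JSON that includes the audio "content-type".
    """
    params = json.loads(first_call_frame)
    return json.dumps({
        "action": "start",
        "content-type": params.get("content-type", DEFAULT_CONTENT_TYPE),
        "interim_results": True,  # assumed option: also receive partial results
    })


def build_stop_message() -> str:
    """Build the request sent to Watson once the call's WebSocket closes."""
    return json.dumps({"action": "stop"})


def extract_transcripts(watson_frame: str) -> list:
    """Return (transcript, confidence, is_final) tuples from a Watson frame.

    Frames without a "results" key (for example, state notifications)
    yield an empty list.
    """
    data = json.loads(watson_frame)
    transcripts = []
    for result in data.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        best = alternatives[0]  # take the first listed alternative
        transcripts.append((
            best.get("transcript", "").strip(),
            best.get("confidence"),  # may be None on interim results
            bool(result.get("final", False)),
        ))
    return transcripts
```

In a relay server, the start message would be sent to Watson after the call's first text frame arrives, binary audio frames would be forwarded unchanged, and the stop message would be sent when the call's connection closes. Keeping these steps as pure functions makes them easy to test apart from the WebSocket plumbing.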

Company
Vonage

Date published
Nov. 5, 2020

Author(s)
Sam Machin

Word count
1434

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.