Speechmatics Intros Real-Time Speech Translation Capabilities
Speechmatics plans to extend its real-time transcription capabilities by providing real-time speech translation in an all-in-one API. The company says breaking down language barriers enables more people to consume content regardless of industry and unlocks the ability to automatically translate live content from multiple regions. This combined offering lets customers use what Speechmatics describes as the world’s most accurate speech-to-text engine and translate speech across 69 language pairs.
Real-time translation arrives a month after Speechmatics’ launch of Ursa, its speech-to-text engine, which the company claims is 25% more accurate than OpenAI’s Whisper and 38% more accurate than Google.
Speechmatics has doubled down on these capabilities to develop real-time translation, offering language pairs to and from English, including German, Spanish, and Vietnamese. The all-in-one API can also translate multiple languages in one request; for example, a single audio stream can provide real-time English transcription and translation to Japanese, French, Hindi, Mandarin, and Korean simultaneously.
Speechmatics’ real-time transcription and translation deliver the same level of accuracy as its pre-recorded (batch) service and provide a sliding scale that lets customers tailor speed (latency) and accuracy to meet their needs. By combining real-time transcription and translation, the all-in-one API streamlines processes and speeds up workflows for businesses.
Businesses can reach a wider geographical audience across multiple industries in which real-time translation has been a challenging and costly task when done manually by humans. For the broadcast industry in particular, which was valued at over $300B in the U.S. alone in 2022, generating quick and highly accurate translated speech in one API unlocks the ability to caption live-stream content and news for viewers around the world. Similarly, contact centers, where scale is essential, can expand operations to handle multiple languages using cost-effective automation technology and offer improved customer experiences in native languages.
“This is a landmark development for speech recognition technology, and we are proud to remain at the forefront of innovation, demonstrating the commitment to our mission to understand every voice,” said Damir Derd, head of sales engineering at Speechmatics. “This new offering opens up a truly global market for our customers with almost instant translation from the spoken word. As demand from viewers in different regions increases for TV shows and broadcast, sports, events, podcasts, game streaming, YouTube and social media videos, the need for captioned videos in multiple languages has too. We are excited to launch this capability to our customers in the next few weeks and will be continuing to work towards adding even more languages and enabling the engine to translate between languages, so the default isn’t always English.”
“Speechmatics provides the most accurate speech-to-text on the market for pre-recorded files and live streams,” added Ken Frommert, president of ENCO. “Adding real-time translation to its all-in-one API is game-changing for live broadcast captions. The ability to not only transcribe but now leverage Speechmatics to translate in real-time to provide highly accurate captions globally.”
The product was demonstrated at the 2023 NAB Show in Las Vegas in April.