Google Cloud AI Services Quick Start Guide

Cloud Speech API

Cloud Speech API uses powerful neural network models to convert audio to text in real time. This service is exposed as a REST API, as we have seen with the Google Cloud Natural Language API.

This API can recognize over 110 languages. You can use it to convert speech to text in real time, to recognize audio uploaded in the request, and to integrate with your audio storage on Google Cloud Storage. It is built on the same technology Google uses to power its own products.
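To make the two non-streaming input modes a little more concrete, here is a minimal sketch of the JSON body that a synchronous recognize call expects. It only builds the request and does not send it. The helper name build_recognize_request, the LINEAR16 encoding, the 16000 Hz sample rate, and the gs://my-bucket/sample.raw path are my own illustrative assumptions; the endpoint and the config/audio field names are taken from the v1 REST interface.

```python
import base64
import json

# v1 endpoint for synchronous recognition. An API key or OAuth token
# has to be supplied when the request is actually sent.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes=None, gcs_uri=None,
                            language_code="en-US", sample_rate_hertz=16000):
    """Return a JSON-serializable body for a speech:recognize call.

    Hypothetical helper: pass either raw audio bytes (sent inline,
    base64-encoded) or a gs:// URI pointing at Google Cloud Storage.
    """
    if (audio_bytes is None) == (gcs_uri is None):
        raise ValueError("Provide exactly one of audio_bytes or gcs_uri")

    if audio_bytes is not None:
        # Audio uploaded in the request itself.
        audio = {"content": base64.b64encode(audio_bytes).decode("ascii")}
    else:
        # Audio already stored in a Cloud Storage bucket.
        audio = {"uri": gcs_uri}

    return {
        "config": {
            "encoding": "LINEAR16",            # assumed: raw 16-bit PCM
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        "audio": audio,
    }


if __name__ == "__main__":
    # Hypothetical bucket and object name, for illustration only.
    body = build_recognize_request(gcs_uri="gs://my-bucket/sample.raw")
    print("POST", RECOGNIZE_URL)
    print(json.dumps(body, indent=2))
```

The resulting dictionary can be posted as JSON to the endpoint with any HTTP client. Real-time streaming recognition works differently and is not covered by this sketch.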

Before we continue with Cloud Speech API, I would recommend heading over to https://cloud.google.com/speech/ and trying out the API. Here is a quick glimpse of it:

I was actually playing a song in the background when I tried the speech-to-text demo. I was very impressed with the results. The only slip was one part where I said "with a song playing" and the API transcribed it as "with the song playing"; still, pretty good!

I think it is only a matter of time before continued use of these services increases their accuracy even further.

Some of the key features of Cloud Speech API are:

  • Automatic Speech Recognition (ASR)
  • Global vocabulary
  • Streaming recognition
  • Word hints
  • Real-time or prerecorded audio support
  • Noise robustness
  • Inappropriate content filtering
  • Integrated API
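Two of the features above, word hints and inappropriate content filtering, map onto options in the recognition config. The following is a rough sketch, assuming the v1 REST field names speechContexts and profanityFilter; the helper build_recognition_config and the sample phrases are hypothetical and only there for illustration.

```python
import json


def build_recognition_config(language_code="en-US", phrases=None,
                             filter_profanity=False):
    """Return a recognition config dict with optional feature flags.

    Hypothetical helper. Field names assume the v1 REST interface.
    """
    config = {"languageCode": language_code}

    if phrases:
        # Word hints: phrases the recognizer should favor, such as
        # product names or domain-specific terms.
        config["speechContexts"] = [{"phrases": list(phrases)}]

    if filter_profanity:
        # Inappropriate content filtering.
        config["profanityFilter"] = True

    return config


if __name__ == "__main__":
    config = build_recognition_config(
        phrases=["Cloud Speech API", "Google Cloud Storage"],
        filter_profanity=True,
    )
    print(json.dumps(config, indent=2))
```

This dictionary would go under the config key of a recognize request, next to the audio itself.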

You can read more about Cloud Speech API here: https://cloud.google.com/speech/.