Speech To Text API

All the APIs needed to generate a transcript from audio files.

Speech to Text

This endpoint is called for transcribing an audio file.

POST https://api.convai.com/stt/

The user can send the audio file they want to transcribe to this endpoint and get the transcript in the response.

This endpoint also has an option to enable timestamps, which returns the start and end time of each sub-transcript along with the transcript.

Headers

Name             Type    Description
CONVAI-API-KEY*  String  User's Convai API Key

Request Body

Name              Type        Description
file*             Audio File  The audio file that the user wants to transcribe. Accepted formats: wav / mp3.
enableTimestamps  Boolean     Set to True to receive timestamps along with the transcript, otherwise False. Default: False.

{
	"result" : "<the complete transcription of the audio file>",
	"details": [
		{
			"id": "<sub-transcript order number>",
			"start-time" : "<starting time, accurate up to milliseconds>",
			"end-time" : "<ending time, accurate up to milliseconds>",
			"text": "<sub-transcript for the time interval>"
		},
		.
		.
	]
}

Here is some sample code to demonstrate the request format for the endpoint:

import requests

url = "https://api.convai.com/stt/"

# Optional form field; leave it out to use the default (False).
payload = {
    "enableTimestamps": "<True or False>"
}
headers = {
    "CONVAI-API-KEY": "<your api key>"
}

# Open the audio file in binary mode; it is closed once the request completes.
with open("<path to audio file>", "rb") as audio_file:
    files = [
        ("file", ("audio.wav", audio_file, "audio/wav"))
    ]
    response = requests.post(url, headers=headers, data=payload, files=files)

print(response.text)
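
When timestamps are enabled, the `details` list in the response can be turned into one readable line per sub-transcript. The following is a minimal sketch, assuming the response body parses as JSON with the keys shown in the response format above. The helper name `format_segments` is hypothetical, and the time values are treated as opaque and printed as received.

```python
def format_segments(body):
    """Return one line per sub-transcript found in a parsed response body."""
    lines = []
    # Assumption: "details" may be missing when timestamps are not enabled,
    # so fall back to an empty list instead of raising KeyError.
    for segment in body.get("details", []):
        start = segment["start-time"]
        end = segment["end-time"]
        lines.append(f"[{start} - {end}] {segment['text']}")
    return lines
```

With the request above, this could be called as `format_segments(response.json())`.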

Please note that the API currently supports only the wav and mp3 formats for audio files. Sending audio files in other formats, such as aac or flac, will result in an error.

Note: The audio should have a bit depth of at least 16 bits.
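
The format and bit-depth requirements can be checked locally before uploading. The following is a rough sketch using only the Python standard library. The helper name `check_audio_file` is hypothetical, the format check looks only at the file extension, and the bit depth is read only for wav files (the `wave` module cannot read mp3).

```python
import wave
from pathlib import Path

ACCEPTED_EXTENSIONS = {".wav", ".mp3"}
MIN_BIT_DEPTH = 16


def check_audio_file(path):
    """Return a list of detected problems; an empty list means none were found."""
    problems = []
    suffix = Path(path).suffix.lower()
    if suffix not in ACCEPTED_EXTENSIONS:
        problems.append(f"unsupported format: {suffix or 'no extension'}")
    elif suffix == ".wav":
        # getsampwidth() returns bytes per sample, so multiply by 8 for bits.
        with wave.open(str(path), "rb") as wav_file:
            bit_depth = wav_file.getsampwidth() * 8
        if bit_depth < MIN_BIT_DEPTH:
            problems.append(
                f"bit depth is {bit_depth}, expected at least {MIN_BIT_DEPTH}"
            )
    return problems
```

An empty result does not guarantee that the API will accept the file; it only rules out the two problems described in the notes above.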

Add Words

Adds specific words for the Speech-to-Text processing to focus on.

POST https://api.convai.com/stt/add-words

This API is called to add new words that the user wants the Speech-to-Text processing to focus on. Users can use this endpoint to add uncommon words that they expect in their audio files and want the Speech-to-Text system to recognize correctly.

Headers

Name             Type    Description
CONVAI-API-KEY*  String  User's Convai API Key

Request Body

Name   Type    Description
word*  String  The word that the user wants to add.

{
    "ERROR": "<Error related to API key>"
}
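
An error body of the shape shown above can be detected by looking for the `ERROR` key. The following is a small sketch, assuming the error response is a JSON object as shown. The helper name `extract_error` is hypothetical, and a body that is not valid JSON is treated as carrying no error message.

```python
import json


def extract_error(response_text):
    """Return the message stored under "ERROR", or None if there is none."""
    try:
        body = json.loads(response_text)
    except json.JSONDecodeError:
        # Not JSON at all, so there is no "ERROR" field to report.
        return None
    if isinstance(body, dict):
        return body.get("ERROR")
    return None
```

With the `requests` library, this could be called as `extract_error(response.text)`.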

Here is some sample code to demonstrate the request format for the endpoint:

import json

import requests

url = "https://api.convai.com/stt/add-words"

# The request body carries the single word to add, serialized as JSON.
payload = json.dumps({
    "word": "<new word>"
})
headers = {
    "CONVAI-API-KEY": "<your api key>"
}

response = requests.post(url, headers=headers, data=payload)

print(response.text)
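
The request body documents a single `word` field, so adding several words presumably takes one request per word. The following sketch loops over a list under that assumption. The helper name `add_words` and its `post` parameter are hypothetical; `post` is expected to be a callable with the same calling convention as `requests.post`.

```python
import json

ADD_WORDS_URL = "https://api.convai.com/stt/add-words"


def add_words(words, api_key, post):
    """Send one add-words request per word and collect the response texts.

    `post` is any callable that accepts (url, headers=..., data=...) and
    returns an object with a `.text` attribute, such as `requests.post`.
    """
    headers = {"CONVAI-API-KEY": api_key}
    results = {}
    for word in words:
        # Same body format as the single-word sample: a JSON-encoded object.
        payload = json.dumps({"word": word})
        response = post(ADD_WORDS_URL, headers=headers, data=payload)
        results[word] = response.text
    return results
```

For example, `add_words(["<first word>", "<second word>"], "<your api key>", requests.post)` would return a dictionary mapping each word to the raw response text.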
