Speech

Upload a file to the MediaCatch Speech API and get the results

Uploads a file to the MediaCatch Speech API.

Parameters:

fpath (str | Path, required): Path to the file to upload.
api_key (str, default None): API key for the MediaCatch Speech API.
quota (str, default None): The quota to bill transcription hours from. Can be None if the user has only one quota.
fallback_language (str, default None): Language to transcribe in if language identification fails. If None, the default language of the quota is used.
max_threads (int, default 5): Maximum number of threads.
max_request_retries (int, default 3): Maximum number of retries per request.
request_delay (float, default 0.5): Delay between request retries.
chunk_size (int, default 100 * 1024 * 1024): Size of each chunk to upload.
url (str, default 'https://s2t.mediacatch.io/api/v2'): URL of the MediaCatch Speech API.
compress_input (bool, default False): Compress the input file to OGG format (requires FFmpeg >= 6.1).
sample_rate (int, default 16000): Sample rate of the audio file.
verbose (bool, default True): Show verbose output.

Returns:

str: File ID of the uploaded file.
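
The chunk_size, max_request_retries and request_delay parameters suggest a chunked upload in which each request is retried on failure. The following is a minimal sketch of that general pattern, not the library's implementation. It assumes chunk_size is measured in bytes, request_delay in seconds, and that max_request_retries counts total attempts. The names iter_chunks and call_with_retries are hypothetical helpers introduced only for illustration.

```python
import io
import time
from typing import BinaryIO, Callable, Iterator, Optional

# Documented default for chunk_size (assumed here to be a byte count).
CHUNK_SIZE = 100 * 1024 * 1024


def iter_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive blocks of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def call_with_retries(
    send: Callable[[], object],
    max_request_retries: int = 3,
    request_delay: float = 0.5,
) -> object:
    """Call send(); if it raises, wait request_delay seconds and try again.

    Hypothetical helper: gives up after max_request_retries attempts in total.
    """
    last_error: Optional[Exception] = None
    for attempt in range(max_request_retries):
        try:
            return send()
        except Exception as exc:
            last_error = exc
            # Sleep only if another attempt will follow.
            if attempt < max_request_retries - 1:
                time.sleep(request_delay)
    raise RuntimeError(
        f"request failed after {max_request_retries} attempts"
    ) from last_error


# Example: split an in-memory payload into 4-byte chunks.
for chunk in iter_chunks(io.BytesIO(b"abcdefghij"), chunk_size=4):
    print(len(chunk), chunk)
```

In a real upload, send would wrap the HTTP request for one chunk, and several such calls could run concurrently up to max_threads.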

Wait for result from a URL.

Parameters:

file_id (str, required): The file ID to get the result from.
url (str, default 'https://s2t.mediacatch.io/api/v2'): The URL to get the result from.
timeout (int, default 3600): Timeout for waiting, in seconds.
delay (int, default 10): Delay between each request.
verbose (bool, default True): Show verbose output.

Returns:

dict[str, Any] | None: Dictionary with the result from the URL, or None if the request failed.
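
The timeout and delay parameters describe a polling loop: the result is requested repeatedly, with a pause between requests, until it is available or the timeout expires. Below is a rough sketch of that pattern under stated assumptions. It is not the library's code. The function poll_for_result and its fetch argument are hypothetical, fetch stands in for whatever request returns the result dictionary (or None while it is not ready), and delay is assumed to be in seconds.

```python
import time
from typing import Any, Callable, Dict, Optional


def poll_for_result(
    fetch: Callable[[], Optional[Dict[str, Any]]],
    timeout: float = 3600,
    delay: float = 10,
) -> Optional[Dict[str, Any]]:
    """Call fetch() every delay seconds until it returns a dict or timeout passes.

    Hypothetical helper: returns None if no result arrived before the deadline.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = fetch()
        if result is not None:
            return result
        time.sleep(delay)
    return None


# Example with a stand-in fetch that becomes ready on the third call.
responses = iter([None, None, {"status": "done"}])
print(poll_for_result(lambda: next(responses), timeout=5, delay=0))
```

Using time.monotonic for the deadline keeps the loop unaffected by system clock adjustments during a long wait.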