Vision

Upload a file to the MediaCatch Vision API and retrieve the results.

Upload a file to the MediaCatch Vision API.

Parameters:

Name	Type	Description	Default
`fpath`	`str`	File path.	required
`type`	`Literal['ocr', 'face']`	Type of inference to run on the file.	required
`url`	`str`	URL to the vision API. Defaults to 'https://api.mediacatch.io/vision'.	`'https://api.mediacatch.io/vision'`
`api_key`	`str`	API key for the vision API. Defaults to None.	`None`
`fps`	`int`	Frames per second for video processing. Defaults to 1.	`None`
`tolerance`	`int`	Tolerance for text detection. Defaults to 10.	`None`
`min_bbox_iou`	`float`	Minimum bounding-box intersection over union (IoU) for merging text detections. Defaults to 0.5.	`None`
`min_levenshtein_ratio`	`float`	Minimum Levenshtein ratio for merging text detections (see https://rapidfuzz.github.io/Levenshtein/levenshtein.html#ratio). Defaults to 0.75.	`None`
`moving_threshold`	`int`	If the center of a merged text detection moves by more than this many pixels, it is considered moving text. Defaults to 50.	`None`
`max_text_length`	`int`	If the text length is less than this value, `max_text_confidence` is used as the confidence threshold. Defaults to 3.	`None`
`min_text_confidence`	`float`	Confidence threshold for text detection when the text length is greater than `max_text_length`. Defaults to 0.5.	`None`
`max_text_confidence`	`float`	Confidence threshold for text detection when the text length is less than `max_text_length`. Defaults to 0.8.	`None`
`max_height_width_ratio`	`float`	Discard detection if height/width ratio is greater than this value. Defaults to 2.0.	`None`
`get_detection_histogram`	`bool`	If true, get histogram of detection. Defaults to False.	`None`
`detection_histogram_bins`	`int`	Number of bins for histogram calculation. Defaults to 8.	`None`
`max_height_difference_ratio`	`float`	Maximum allowed difference in height between two text boxes for them to be merged. Defaults to 0.5.	`None`
`max_horizontal_distance_ratio`	`float`	Threshold that determines whether two boxes are close enough horizontally to be considered part of the same text line. Defaults to 0.9.	`None`
`get_frame_index`	`bool`	If true, get frame index. Defaults to None.	`None`
`get_bbox`	`bool`	If true, get bounding box. Defaults to None.	`None`
`face_recognition`	`bool`	If true, run face recognition. Defaults to None.	`None`
`face_age`	`bool`	If true, get face age. Defaults to None.	`None`
`face_gender`	`bool`	If true, get face gender. Defaults to None.	`None`
`face_expression`	`bool`	If true, get face expression. Defaults to None.	`None`
`face_ethnicity`	`bool`	If true, get face ethnicity. Defaults to None.	`None`
`max_retries`	`int`	Maximum number of retries. Defaults to 5.	`5`
`delay`	`float`	Delay between retries. Defaults to 10.0.	`10.0`
`verbose`	`bool`	If True, print log messages. Defaults to True.	`True`

Returns:

Type	Description
`str`	File ID.
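
The text-filtering parameters above can be hard to picture from the table alone. The following is a minimal sketch of how `min_bbox_iou`, `max_text_length`, `min_text_confidence`, `max_text_confidence`, and `max_height_width_ratio` might interact, using the default values quoted in the descriptions. It is not the service's implementation: the helper names `bbox_iou` and `keep_detection`, the `(x_min, y_min, x_max, y_max)` box layout, the use of `>=` for the confidence comparison, and the handling of text whose length equals `max_text_length` are all assumptions made for illustration.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def bbox_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def keep_detection(
    text: str,
    confidence: float,
    box: Box,
    max_text_length: int = 3,
    min_text_confidence: float = 0.5,
    max_text_confidence: float = 0.8,
    max_height_width_ratio: float = 2.0,
) -> bool:
    """Hypothetical filter: apply the shape check, then the length-dependent threshold."""
    width = box[2] - box[0]
    height = box[3] - box[1]
    # Discard degenerate boxes and boxes that are too tall for their width.
    if width <= 0 or height / width > max_height_width_ratio:
        return False
    # Short text must clear the stricter threshold (assumption: ">=" comparison,
    # and text of exactly max_text_length characters uses the lower threshold).
    if len(text) < max_text_length:
        threshold = max_text_confidence
    else:
        threshold = min_text_confidence
    return confidence >= threshold
```

Under these assumptions, a two-character detection with confidence 0.7 is dropped, while a five-character detection with the same confidence is kept. Two detections would be candidates for merging when `bbox_iou` of their boxes is at least `min_bbox_iou`.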

Wait for the result for a given file ID, polling the URL until it is available or the timeout is reached.

Parameters:

Name	Type	Description	Default
`file_id`	`str`	The file ID to get the result from.	required
`url`	`str`	The URL to get the result from.	`'https://api.mediacatch.io/vision'`
`timeout`	`int`	Timeout for waiting in seconds. Defaults to 3600.	`3600`
`delay`	`int`	Delay between each request. Defaults to 10.	`10`
`verbose`	`bool`	If True, print log messages. Defaults to True.	`True`

Returns:

Type	Description
`dict[str, Any] | None`	Dictionary with the result from the URL, or None if the request failed.
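
The `timeout` and `delay` parameters describe a polling loop. As a rough sketch of that pattern (not the library's code), the hypothetical helper `poll_for_result` below repeatedly calls a caller-supplied `fetch` function until it returns a dictionary or the timeout elapses. The helper name, the `fetch` callback, and the choice to return None on timeout are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional


def poll_for_result(
    fetch: Callable[[], Optional[Dict[str, Any]]],
    timeout: float = 3600,
    delay: float = 10,
) -> Optional[Dict[str, Any]]:
    """Call fetch() until it returns a dict or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if result is not None:
            return result
        # Give up if sleeping for another `delay` would overshoot the deadline.
        if time.monotonic() + delay > deadline:
            return None
        time.sleep(delay)


# Usage with a stand-in fetch function that succeeds on the third call.
attempts = {"count": 0}


def fake_fetch() -> Optional[Dict[str, Any]]:
    attempts["count"] += 1
    return {"status": "done"} if attempts["count"] >= 3 else None


result = poll_for_result(fake_fetch, timeout=5, delay=0)
```

Using `time.monotonic()` rather than wall-clock time keeps the deadline stable if the system clock is adjusted while waiting.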