Vision
Upload a file to the MediaCatch Vision API and get the results.
Upload a file to the MediaCatch Vision API.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`fpath` | `str` | File path. | required |
`type` | `Literal['ocr', 'face']` | Type of inference to run on the file. | required |
`url` | `str` | URL to the vision API. Defaults to 'https://api.mediacatch.io/vision'. | `'https://api.mediacatch.io/vision'` |
`fps` | `int` | Frames per second for video processing. Defaults to 1. | `None` |
`tolerance` | `int` | Tolerance for text detection. Defaults to 10. | `None` |
`min_bbox_iou` | `float` | Minimum bounding box intersection over union for merging text detections. Defaults to 0.5. | `None` |
`min_levenshtein_ratio` | `float` | Minimum Levenshtein ratio for merging text detections (more info here: https://rapidfuzz.github.io/Levenshtein/levenshtein.html#ratio). Defaults to 0.75. | `None` |
`moving_threshold` | `int` | If the center of a merged text detection moves more than this many pixels, it is considered moving text. Defaults to 50. | `None` |
`max_text_length` | `int` | If text length is less than this value, use `max_text_confidence` as the confidence threshold. Defaults to 3. | `None` |
`min_text_confidence` | `float` | Confidence threshold for text detection (if text length is greater than `max_text_length`). Defaults to 0.5. | `None` |
`max_text_confidence` | `float` | Confidence threshold for text detection (if text length is less than `max_text_length`). Defaults to 0.8. | `None` |
`max_height_width_ratio` | `float` | Discard a detection if its height/width ratio is greater than this value. Defaults to 2.0. | `None` |
`get_detection_histogram` | `bool` | If true, get a histogram of the detection. Defaults to False. | `None` |
`detection_histogram_bins` | `int` | Number of bins for histogram calculation. Defaults to 8. | `None` |
`max_height_difference_ratio` | `float` | Maximum allowed difference in height between two text boxes for them to be merged. Defaults to 0.5. | `None` |
`max_horizontal_distance_ratio` | `float` | Determines whether two boxes are close enough horizontally to be considered part of the same text line. Defaults to 0.9. | `None` |
`get_frame_index` | `bool` | If true, get frame index. Defaults to None. | `None` |
`get_bbox` | `bool` | If true, get bounding box. Defaults to None. | `None` |
`face_recognition` | `bool` | If true, run face recognition. Defaults to None. | `None` |
`face_age` | `bool` | If true, get face age. Defaults to None. | `None` |
`face_gender` | `bool` | If true, get face gender. Defaults to None. | `None` |
`face_expression` | `bool` | If true, get face expression. Defaults to None. | `None` |
`face_ethnicity` | `bool` | If true, get face ethnicity. Defaults to None. | `None` |
`max_retries` | `int` | Maximum number of retries. Defaults to 5. | `5` |
`delay` | `float` | Delay between retries. Defaults to 10.0. | `10.0` |
`verbose` | `bool` | If True, print log messages. Defaults to True. | `True` |
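The text-filtering parameters above interact with each other, so a small local sketch may help when choosing values. The code below is not the service's implementation: the function names, the `(x_min, y_min, x_max, y_max)` box layout, the handling of text whose length equals `max_text_length`, and the use of `>=` for the confidence comparison are all assumptions made for illustration. Only the parameter names and default values come from the table.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def confidence_threshold(
    text: str,
    max_text_length: int = 3,
    min_text_confidence: float = 0.5,
    max_text_confidence: float = 0.8,
) -> float:
    """Return the stricter threshold for short strings, the looser one otherwise.

    Text whose length equals max_text_length falls under min_text_confidence
    here; the table does not specify that case, so this is an assumption.
    """
    if len(text) < max_text_length:
        return max_text_confidence
    return min_text_confidence


def keep_detection(
    text: str,
    confidence: float,
    box: Box,
    max_height_width_ratio: float = 2.0,
) -> bool:
    """Apply the height/width check, then the length-dependent confidence check."""
    x_min, y_min, x_max, y_max = box
    width, height = x_max - x_min, y_max - y_min
    if width <= 0 or height <= 0:
        return False  # degenerate box; dropping it is an assumption of this sketch
    if height / width > max_height_width_ratio:
        return False
    # Uses the default thresholds from the table.
    return confidence >= confidence_threshold(text)


def bbox_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0
```

Under these assumptions, two boxes would be merge candidates when `bbox_iou(a, b) >= min_bbox_iou`, and a two-character string needs a confidence of at least 0.8 to survive while a longer string needs only 0.5.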
Returns:
Name | Type | Description |
---|---|---|
`str` | `str` | File ID. |
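The `max_retries` and `delay` parameters of the upload call describe a retry-with-fixed-delay pattern. A minimal sketch of that pattern follows. The helper name `call_with_retries` and the injectable `sleep` argument are hypothetical, and the sketch assumes `max_retries` counts attempts made after the first failure; the library may count differently.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(
    action: Callable[[], T],
    max_retries: int = 5,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `action`; on failure wait `delay` seconds and try again.

    At most `max_retries` retries follow the first attempt. The last
    exception is re-raised if every attempt fails.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    last_error: Optional[Exception] = None
    for attempt in range(max_retries + 1):
        try:
            return action()
        except Exception as exc:
            last_error = exc
            if attempt < max_retries:
                sleep(delay)  # fixed delay, no backoff
    assert last_error is not None
    raise last_error
```

Passing `sleep` as an argument keeps the helper easy to test without real waiting; production code would normally leave it at `time.sleep`.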
Wait for result from a URL.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`file_id` | `str` | The file ID to get the result from. | required |
`url` | `str` | The URL to get the result from. | `'https://api.mediacatch.io/vision'` |
`timeout` | `int` | Timeout for waiting in seconds. Defaults to 3600. | `3600` |
`delay` | `int` | Delay between each request. Defaults to 10. | `10` |
`verbose` | `bool` | If True, print log messages. Defaults to True. | `True` |
Returns:
Type | Description |
---|---|
`dict[str, Any] \| None` | Dictionary with the result from the URL, or None if failed. |
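The `timeout` and `delay` parameters together with the `dict[str, Any] | None` return type describe a polling loop. The sketch below shows one way such a loop could be structured. It is not the library's code: `poll_for_result`, the `fetch` callable, and the injectable `sleep` and `clock` arguments are hypothetical names, and treating a `None` from `fetch` as "not ready yet" is an assumption.

```python
import time
from typing import Any, Callable, Optional


def poll_for_result(
    fetch: Callable[[], Optional[dict[str, Any]]],
    timeout: int = 3600,
    delay: int = 10,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[dict[str, Any]]:
    """Call `fetch` every `delay` seconds until it returns a dict.

    Returns None once another wait would pass the `timeout` deadline.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if result is not None:
            return result
        if clock() + delay > deadline:
            return None  # waiting any longer would exceed the timeout
        sleep(delay)
```

Because the function can return `None`, callers should check the result before indexing into it.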