Video moderation extracts representative frames from a video and runs vision-language classification on each one. Categories are merged across all frames using OR logic: if a single frame trips a category, the entire video is flagged.
Input
Submit a multipart/form-data request with:
| Field | Type | Required | Description |
|---|---|---|---|
| video_url | string | Yes | Publicly reachable HTTP or HTTPS URL of the video to scan, up to 300 MB. The worker downloads it for frame extraction. |
| webhook_url | string | No | URL to receive the result on completion. |
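
As a rough illustration of the request shape, the sketch below builds a multipart/form-data body with the two fields from the table, using only the Python standard library. The endpoint URL, the POST method, and the example.com addresses are placeholders rather than documented values, and authentication is left out because this page does not describe it. The request is only constructed, not sent.

```python
import uuid
import urllib.request
from typing import Dict, Optional, Tuple
from urllib.parse import urlparse

# Hypothetical endpoint: this page does not state the real path.
ENDPOINT = "https://api.example.com/moderate/video"


def encode_multipart(fields: Dict[str, str]) -> Tuple[bytes, str]:
    """Encode plain text fields as multipart/form-data.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            "\r\n"
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")  # closing boundary
    body = "".join(parts).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def build_request(video_url: str, webhook_url: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a request carrying the form fields."""
    # Cheap client-side scheme check; the service does its own validation.
    if urlparse(video_url).scheme not in ("http", "https"):
        raise ValueError("video_url must use http or https")
    fields = {"video_url": video_url}
    if webhook_url is not None:  # webhook_url is optional
        fields["webhook_url"] = webhook_url
    body, content_type = encode_multipart(fields)
    # POST is an assumption; the page does not name the HTTP method.
    return urllib.request.Request(
        ENDPOINT, data=body, headers={"Content-Type": content_type}, method="POST"
    )


request = build_request(
    "https://cdn.example.com/clip.mp4",
    "https://example.com/hooks/moderation",
)
```

Once the real endpoint and credentials are known, the prepared request could be passed to urllib.request.urlopen.
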
The video_url must be a public HTTP or HTTPS URL. URLs using another scheme, or resolving to a private or internal network address, are rejected with 400 INVALID_REQUEST. The linked video must be no larger than 300 MB; a larger file is rejected during frame extraction and the job ends as failed.
Pipeline
Video moderation runs in two stages:
Frame extraction
A worker samples frames from the video at regular intervals using FFmpeg. The sampling interval is configured per deployment (default: every 5 seconds). The extracted frames are queued for individual analysis.
Frame moderation
Each extracted frame is analysed against the configured visual moderation categories and any enabled custom categories. Frame moderation jobs run in parallel for faster processing. You can track frame-by-frame progress via Server-Sent Events (SSE). Progress events include frames_completed and frame_count fields.
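
The two progress fields can be turned into a completion ratio on the client. The sketch below assumes each progress event carries a JSON object on a data: line containing frames_completed and frame_count. The exact event format is not specified on this page, so the sample stream is illustrative only and the function name is hypothetical.

```python
import json
from typing import Iterable, Iterator


def progress_from_sse(lines: Iterable[str]) -> Iterator[float]:
    """Yield completion ratios (0.0 to 1.0) from raw SSE lines.

    Assumption: each progress event has a `data:` line holding a JSON
    object with `frames_completed` and `frame_count`.
    """
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip event names, comments, and blank separators
        try:
            payload = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # ignore data lines that are not JSON
        if not isinstance(payload, dict):
            continue
        total = payload.get("frame_count")
        done = payload.get("frames_completed")
        if not total or done is None:
            continue  # not a progress event, or no frames counted yet
        yield done / total


# Illustrative stream, not captured from the real service.
sample_stream = [
    "event: progress",
    'data: {"frames_completed": 9, "frame_count": 45}',
    "",
    'data: {"frames_completed": 45, "frame_count": 45}',
]
ratios = list(progress_from_sse(sample_stream))
```

Skipping events where frame_count is missing or zero avoids a division error before extraction has reported a total.
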
Category merging
Categories are merged across all frames using OR logic: if any single frame trips a category, that category is true for the entire video.
For example, if frame 12 of 45 trips nsfw, the final result shows nsfw: true even though
the other 44 frames are clean.
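
The merge rule can be sketched in a few lines. This assumes each frame result is a flat mapping of category name to boolean, which this page does not confirm; the violence category and the merge_categories helper are hypothetical names used only for illustration.

```python
from typing import Dict, List


def merge_categories(frame_results: List[Dict[str, bool]]) -> Dict[str, bool]:
    """OR-merge per-frame category flags into one video-level result."""
    merged: Dict[str, bool] = {}
    for frame in frame_results:
        for category, tripped in frame.items():
            # A category stays true once any frame has tripped it.
            merged[category] = merged.get(category, False) or tripped
    return merged


# 45 clean frames, then mark frame 12 (1-based) as tripping nsfw.
frames = [{"nsfw": False, "violence": False} for _ in range(45)]
frames[11]["nsfw"] = True

video_result = merge_categories(frames)
# The video counts as flagged when any merged category is true.
video_flagged = any(video_result.values())
```

Here video_result has nsfw set to true and violence set to false, so the video is flagged on the strength of a single frame.
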
