FORGE

One API. Every Media Type. Any Scale.

Forge is the processing orchestrator that converts, protects, and generates media across 125+ formats with 30+ AI models. It dispatches work to specialized GPU and CPU workers, autoscales your fleet, and delivers results through webhooks, WebSockets, and message queues. Now with MCP support -- drive Forge directly from Claude or any MCP-compatible AI assistant using natural language.

The Orchestration Engine

Forge doesn't process media itself -- it orchestrates a fleet of specialized workers. Upload any file and Forge routes it to the right worker via a message queue, manages the queue, scales the fleet, and notifies you the moment it's done.

Worker Orchestration

Routes jobs to specialized Video, Image, 3D, GenAI, and Document workers via dedicated progress queues. Each worker type is purpose-built for its domain.

Batch Processing

Submit up to 100 mixed-type files in a single API request. Forge decomposes the batch, routes each file to the correct worker, and reassembles results.

Cloud Autoscaling

Stateless scaling engine with a 5-second polling loop. Per-group scaling policies, cloud-native API calls, and built-in cost tracking.

Triple Notifications

Cryptographically signed webhooks, WebSocket push, and message queue delivery. Choose one or all three -- never miss a completion event.

API Security

API key authentication with cryptographic hashing, per-key rate limiting, and idempotency keys. Every request is authenticated, throttled, and deduplicated.

PWA Admin Dashboard

Installable progressive web app for monitoring jobs, managing webhooks, browsing S3, tracking costs, and configuring workers.
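
As a rough illustration of the batch decomposition described under Batch Processing, the sketch below groups a mixed list of files by worker type. The extension table and the function name `decompose_batch` are assumptions made for this example, not Forge's actual routing rules; only the 100-file limit comes from the text above.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical extension-to-worker table, for illustration only.
WORKER_BY_EXTENSION = {
    ".mov": "video", ".mp4": "video",
    ".jpg": "image", ".png": "image", ".cr3": "image",
    ".usd": "3d", ".fbx": "3d", ".glb": "3d",
    ".pdf": "document", ".docx": "document",
}
MAX_BATCH_SIZE = 100  # limit stated in the Batch Processing description


def decompose_batch(paths):
    """Group file paths by the worker type that would handle them."""
    if len(paths) > MAX_BATCH_SIZE:
        raise ValueError(f"batch has {len(paths)} files; limit is {MAX_BATCH_SIZE}")
    groups = defaultdict(list)
    for path in paths:
        suffix = PurePosixPath(path).suffix.lower()
        worker = WORKER_BY_EXTENSION.get(suffix)
        if worker is None:
            raise ValueError(f"no worker registered for {path!r}")
        groups[worker].append(path)
    return dict(groups)


groups = decompose_batch(["shot_010.mov", "hero.PNG", "robot.glb", "brief.pdf", "plate.cr3"])
print(groups)
```

Each group could then be published to the queue for its worker type, with results collected under a shared batch identifier.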

Studio-Grade Processing

From EXR film sequences to AI-generated video, Forge's internal workers handle every media type at production quality. ACES color science. GPU encoding. Turntable rendering. This is not consumer transcoding.

Frame Sequences to Playable Video

EXR and DPX frame sequence assembly into H.264, H.265, and ProRes
GPU hardware acceleration for encoding
Watermark overlay injection during encode (visual + forensic)
Configurable output resolution, bitrate, codec, and container
Progress reporting with frame-level granularity

EXR Frames 0001-2400 -> Forge Video Worker (GPU Accelerated, ACES Color Pipeline) -> H.265 MP4

100+ Formats. Color Science Built In.

100+ input formats with multi-engine processing and intelligent fallback chains
ACES and OCIO color management for accurate color transforms
HDR processing (EXR, DPX, Radiance HDR)
30+ RAW camera formats: Canon CR2/CR3, Nikon NEF/NRW, Sony ARW, Fuji RAF, Phase One IIQ, Hasselblad 3FR, Leica RWL, Pentax PEF, DNG, and more
Forensic and visual watermarking
Batch resize, format conversion, and metadata preservation

Supported Image Formats

JPG, PNG, TIFF, WebP, HEIC, AVIF, EXR, DPX, PSD, CR2, CR3, NEF, ARW, RAF, DNG, IIQ, 3FR, RWL, PEF, GIF, BMP, SVG, ICO, TGA, plus 76 more formats.

31+ Formats. Turntable Previews Included.

Support for 31+ 3D formats: USD/USDA/USDC/USDZ, Alembic (ABC), FBX, glTF/GLB, OBJ, STL, Blender, DAE, PLY, 3DS, and more
Automated turntable rendering: generates a rotating video preview of any 3D asset
Camera intelligence: automatic framing and lighting based on model bounding box
Conversion pipeline between supported 3D formats
Scene complexity analysis and polygon count reporting

30+ Models. One API Call.

Generate images, video, audio, and 3D assets from text prompts. Analyze visual content. Maintain character consistency with reference images. All through the same Forge API you use for everything else.

Office and PDF Processing

PDF rendering with page-level extraction
DOCX, PPTX, XLSX conversion to PDF or image
OCR for scanned documents
Metadata extraction and indexing

Format Support

125+ formats across every media type.

30+ AI Models. One API Call.

Image Generation

Imagen 3.0 (Google)
Imagen 3.0 Fast
DALL-E 3 (OpenAI)
DALL-E 3 HD
GPT-Image 1.0 (OpenAI), supports reference images
GPT-Image 1.5, supports reference images
Stable Diffusion 3.5 (Stability)
SDXL (Stability)
Flux 1.1 Pro (BFL)
Flux 1.1 Ultra (HD)
Recraft V3
Ideogram V2
Ideogram V2 Turbo (fast)

Video Generation

Veo 3.0 (Google)
Veo 3.5 (Google)

Audio & Speech

ElevenLabs Multilingual v2
ElevenLabs Flash v2.5 (fast)
OpenAI TTS
OpenAI TTS HD

Vision & Analysis

GPT-4o Vision (OpenAI)
Claude Vision (Anthropic)
Gemini Vision (Google)

3D Generation

Meshy
Tripo

Character Consistency at Scale

Forge's reference image system lets you maintain visual consistency across generated outputs. Upload reference images and specify how they should influence generation.

Subject Reference

Maintain a character's appearance across multiple generated images. Upload a face or character reference and generate them in different poses, settings, and styles.

Style Reference

Apply the visual style of a reference image to new generations. Match lighting, color grading, artistic style, or photographic look.

Control Reference

Use structural references (depth maps, edge maps, pose skeletons) to control the composition and layout of generated images.

{
  "model": "gpt-image-1.5",
  "prompt": "Character standing in a medieval castle courtyard, dramatic lighting",
  "reference_images": [
    {
      "url": "s3://refs/character-face.jpg",
      "mode": "subject",
      "weight": 0.8
    },
    {
      "url": "s3://refs/cinematic-style.jpg",
      "mode": "style",
      "weight": 0.6
    }
  ],
  "output": {"format": "png", "width": 2048, "height": 2048}
}
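
A request like the one above could be assembled and checked in client code before it is sent. The following is a minimal sketch: the helper names `reference` and `build_generation_request` are hypothetical, the `"control"` mode string is inferred from the Control Reference description rather than shown in the example, and the 0 to 1 weight range is an assumption based on the sample values.

```python
import json

# "subject" and "style" appear in the example request; "control" is an assumed
# mode string for the Control Reference type.
VALID_MODES = {"subject", "style", "control"}


def reference(url, mode, weight):
    """Build one reference_images entry, checking mode and weight."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown reference mode: {mode!r}")
    # Assumption: weights lie in [0, 1]; the example only shows 0.8 and 0.6.
    if not 0.0 <= weight <= 1.0:
        raise ValueError(f"weight must be between 0 and 1, got {weight}")
    return {"url": url, "mode": mode, "weight": weight}


def build_generation_request(model, prompt, references, width=2048, height=2048):
    """Assemble a request body with the same shape as the example above."""
    return {
        "model": model,
        "prompt": prompt,
        "reference_images": list(references),
        "output": {"format": "png", "width": width, "height": height},
    }


request = build_generation_request(
    "gpt-image-1.5",
    "Character standing in a medieval castle courtyard, dramatic lighting",
    [
        reference("s3://refs/character-face.jpg", "subject", 0.8),
        reference("s3://refs/cinematic-style.jpg", "style", 0.6),
    ],
)
print(json.dumps(request, indent=2))
```

Validating modes and weights on the client side catches typos before a generation job is queued.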

The Processing Pipeline

Forge orchestrates a distributed architecture of specialized workers. From API request to delivered result, every component is designed for scale, reliability, and observability.

GPU Fleet That Scales Itself

Forge's autoscaling engine monitors queue depth every 5 seconds, launches GPU instances when work arrives, and terminates them when queues drain. Per-group policies. Built-in cost tracking.

Stateless engine -- no single point of failure, any node can run the scaler
Per-group scaling policies (e.g., GPU instances for video, CPU instances for documents)
Queue-depth-based triggers with configurable thresholds
Built-in cost tracking with per-job cost attribution
Warm pool support for instant capacity during peak loads
Graceful drain: in-flight jobs complete before instance termination
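
To make the queue-depth trigger concrete, here is one way a single polling tick could compute a target instance count for a worker group. This is a sketch under stated assumptions, not Forge's scaler: the `ScalingPolicy` fields and the `desired_instances` function are hypothetical names, and the rule of one instance per fixed number of queued jobs is an illustrative choice.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical per-group policy; field names are illustrative."""
    jobs_per_instance: int  # queue depth one instance is expected to absorb
    min_instances: int      # warm pool floor
    max_instances: int      # upper bound on fleet size for this group


def desired_instances(queue_depth, busy_instances, policy):
    """Return the target instance count for one polling tick.

    The function keeps no state between calls, so any node can run it.
    """
    by_queue = math.ceil(queue_depth / policy.jobs_per_instance)
    bounded = min(max(by_queue, policy.min_instances), policy.max_instances)
    # Graceful drain: never target fewer instances than are still running jobs.
    return max(bounded, busy_instances)


gpu_video = ScalingPolicy(jobs_per_instance=4, min_instances=1, max_instances=10)
print(desired_instances(queue_depth=9, busy_instances=0, policy=gpu_video))
print(desired_instances(queue_depth=0, busy_instances=3, policy=gpu_video))
```

Because the decision depends only on the current queue depth, the busy count, and the policy, the same tick can be evaluated every 5 seconds without shared scaler state.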

Know the Moment It's Done

Three independent notification channels ensure your pipeline never misses a processing event. Every webhook is cryptographically signed. Every API key is cryptographically hashed.

Webhooks

HMAC-SHA256 signed payloads. Configurable retry with exponential backoff. Delivery history and replay. Signature verification SDK.

WebSocket

WebSocket real-time push. Job progress, completion, and error events. Connection auto-recovery. The admin dashboard uses this channel.

Message Queue

Message queue delivery. Dedicated progress queues per worker type. Durable, persistent, acknowledgment-based. For backend-to-backend integration.
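
On the receiving side, an HMAC-SHA256 signed webhook is typically verified by recomputing the signature over the raw request body and comparing in constant time. The sketch below uses only Python's standard library. It assumes the signature is the hex digest of the raw body; the header that carries it and the exact encoding are not specified here, so treat those details as placeholders.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Compute a hex-encoded HMAC-SHA256 signature over the raw body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())


secret = b"whsec_example"  # hypothetical shared secret
body = b'{"job_id": "job_123", "status": "completed"}'  # hypothetical payload
signature = sign_payload(secret, body)
print(verify_signature(secret, body, signature))
print(verify_signature(secret, body + b" ", signature))
```

Verification should run against the exact bytes received, before any JSON parsing or re-serialization, since a single changed byte invalidates the signature.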

Security

API Key Auth

Cryptographic hashing at rest, per-key rate limiting

Idempotency Keys

Prevent duplicate processing with request deduplication

TLS 1.3+

Encryption in transit for all API traffic

AES-256 Storage

Encryption at rest for all stored media assets

Rate Limiting

Configurable burst and sustained rates per key
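
Burst and sustained rates per key are commonly modeled with a token bucket: the bucket capacity sets the burst, and the refill rate sets the sustained throughput. The class below is a minimal sketch of that general technique, not a description of Forge's limiter; the class name and parameters are illustrative.

```python
class TokenBucket:
    """Token bucket: `burst` caps the bucket, `sustained_per_second` refills it."""

    def __init__(self, sustained_per_second: float, burst: int):
        self.rate = sustained_per_second
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.updated_at = 0.0

    def allow(self, now: float) -> bool:
        """Return True if a request at time `now` (in seconds) may proceed."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(sustained_per_second=2.0, burst=3)
print([bucket.allow(now=0.0) for _ in range(4)])
print(bucket.allow(now=0.5))
```

In a service, one bucket would be kept per API key and `now` would come from a monotonic clock such as `time.monotonic()`.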

Protect Every Asset

Forensic watermarking for leak tracing. Visual watermarking for WIP deterrence. Apply both simultaneously. Forensic marks survive compression, cropping, and format conversion.

Original: no visible marks

Protected: forensic + visual

Encode -> Compress -> Crop: forensic mark survives

Forensic watermarking: imperceptible payload embedded in pixel data, survives common transforms
Visual watermarking: configurable text/image overlay with position, opacity, scale, rotation
Template system: save and reuse watermark configurations across projects
Dual mode: apply forensic + visual simultaneously in a single processing pass
Frame-level watermarking for video sequences
Extraction and verification API for forensic mark recovery

MCP Integration: AI-Driven Media Workflows

Natural Language Processing

Tell Claude "generate a sunset image" or "convert these EXR frames to MP4" and Forge handles the rest. Create media jobs, batch operations, and AI generations through conversation.

Watermark & Secure

Apply forensic and visual watermarks through simple prompts. "Watermark this video with our studio logo" -- Forge applies it with full template and positioning control.

Browse & Retrieve

Browse S3 storage, list recent jobs, check batch status, and download results -- all from within your AI assistant. No dashboard switching required.

10 MCP Tools Available

create_media_job -- Generate or process media

create_batch_jobs -- Submit multiple jobs at once

watermark_media -- Apply text or image watermarks

browse_s3 -- Browse files in S3 storage

get_job_status -- Check job progress

get_job_result -- Get outputs with download URLs

get_batch_status -- Check all jobs in a batch

get_signed_url -- Get temporary download links

list_jobs -- List and filter recent jobs

wait_for_job -- Wait for job completion
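
Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a message for the tools listed above. The tool names come from that list; the argument name `job_id` and the helper `tool_call` are assumptions for illustration, since the tools' input schemas are not given here.

```python
import itertools
import json

# Tool names taken from the list above.
FORGE_TOOLS = {
    "create_media_job", "create_batch_jobs", "watermark_media", "browse_s3",
    "get_job_status", "get_job_result", "get_batch_status", "get_signed_url",
    "list_jobs", "wait_for_job",
}

_request_ids = itertools.count(1)


def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request for one of the listed tools."""
    if name not in FORGE_TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "job_id" is a hypothetical argument name, not a documented schema field.
message = tool_call("get_job_status", {"job_id": "job_123"})
print(json.dumps(message, indent=2))
```

In practice the AI assistant's MCP client generates these messages itself; the sketch only shows what a natural-language request is translated into.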

Ready to See Forge in Action?