From Script to Pipeline-Ready Data
Codex breaks down screenplays, production documents, and editorial timelines into structured data your pipeline can act on. Built on intelligent parsers — not just an LLM with a prompt — so the breakdown is something you can actually trust to drive your budget, schedule, and casting.
Where Codex Fits
Codex is the entry point for narrative and document data — feeding structured production knowledge into the rest of the pipeline.
Drop a script in. Codex parses scenes, identifies entities, classifies elements, and writes the result back to Conductor's productions database and Fabric's metadata graph — ready to drive every downstream action.
Breakdown You Can Trust
The breakdown drives the budget, the schedule, and the cast. It can't be approximate.
Pure-LLM Breakdown Fails
Ask an LLM to break down a 120-page screenplay and you'll get hallucinated characters, missed slug lines, dialogue attributed to the wrong speaker, and silent failures on revision marks. Fine for a demo. A non-starter for production work where the breakdown is contractual input to budget and schedule.
Codex Uses Intelligent Parsers First
Codex starts with format-specific parsers built around each spec: Final Draft's FDX schema, ALE/AAF grammars, OTIO's data model, and screenplay structural rules. Deterministic logic does the work where structure exists. AI steps in only for the ambiguous parts: classifying borderline props, inferring locations from context, or learning your custom element categories from a few tagged examples.
The result: measurably higher accuracy than pure-LLM tools, with confidence scores and source-page references on every extracted element — so when a producer questions the count, you can show your work.
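To make the "parsers first, AI second" split concrete, here is a minimal Python sketch of a deterministic slug-line parser. It is an illustration under stated assumptions, not Codex's actual implementation: the pattern, the `Slug` type, and the `needs_review` flag are all hypothetical names. Lines that match fully are handled by structure alone; lines that match only partially are flagged for a later pass instead of being guessed.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern for standard slug lines such as "INT. WAREHOUSE - NIGHT".
SLUG_RE = re.compile(
    r"^(?P<setting>INT\./EXT\.|INT/EXT\.|I/E\.|INT\.|EXT\.)\s+"
    r"(?P<location>.+?)"
    r"(?:\s+-\s+(?P<time>[A-Z][A-Z ]*))?$"
)


@dataclass
class Slug:
    setting: str
    location: str
    time_of_day: Optional[str]
    needs_review: bool  # True when structure alone could not settle every field


def parse_slug(line: str) -> Optional[Slug]:
    """Return a Slug for a scene heading, or None if the line is not one."""
    m = SLUG_RE.match(line.strip())
    if m is None:
        return None
    time = m.group("time")
    # A missing time of day is the kind of ambiguity left to a later pass.
    return Slug(m.group("setting"), m.group("location"), time, needs_review=time is None)
```

For example, `parse_slug("INT. WAREHOUSE - NIGHT")` yields a location of `WAREHOUSE` and a time of day of `NIGHT`, while `parse_slug("EXT. ROOFTOP")` is returned with `needs_review=True`.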
Four Things Codex Does Well
Built for productions that need their script knowledge to be machine-readable.
Intelligent Script Breakdown
Codex's flagship capability. Upload a screenplay as a PDF or a Final Draft (FDX) file and Codex returns a structured breakdown (every scene, every speaking character, every prop, every location) with confidence scores and source-page references. Structural parsers handle the deterministic work (slug lines, scene structure, character extraction); AI is layered on top only for the genuinely ambiguous calls.
- Scene-by-scene parsing with INT/EXT, time-of-day, and slug detection — structural, not guessed
- Speaking character extraction with line counts per scene
- Prop, vehicle, and wardrobe identification with AI-assisted classification
- Page references and confidence scores on every extracted element
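As a rough sketch of structural character extraction, the Python below walks the `Paragraph` elements of an FDX document and tallies dialogue per speaker per scene. It assumes the commonly seen FDX layout (`Paragraph` elements with a `Type` attribute such as `Scene Heading`, `Character`, or `Dialogue`); the sample document and the `dialogue_counts` helper are hypothetical, and "line count" here means dialogue paragraphs, which may differ from how Codex counts lines.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily trimmed FDX sample for illustration only.
SAMPLE_FDX = """<FinalDraft DocumentType="Script" Version="1">
  <Content>
    <Paragraph Type="Scene Heading"><Text>INT. WAREHOUSE - NIGHT</Text></Paragraph>
    <Paragraph Type="Character"><Text>SARAH</Text></Paragraph>
    <Paragraph Type="Dialogue"><Text>Where is it?</Text></Paragraph>
    <Paragraph Type="Character"><Text>DET. KIM</Text></Paragraph>
    <Paragraph Type="Dialogue"><Text>Not here.</Text></Paragraph>
    <Paragraph Type="Character"><Text>SARAH (CONT'D)</Text></Paragraph>
    <Paragraph Type="Dialogue"><Text>Then we keep looking.</Text></Paragraph>
  </Content>
</FinalDraft>"""


def dialogue_counts(fdx_xml):
    """Return a list of (scene heading, Counter of dialogue paragraphs per speaker)."""
    root = ET.fromstring(fdx_xml)
    scenes = []
    speaker = None
    for para in root.iter("Paragraph"):
        text = "".join(t.text or "" for t in para.iter("Text")).strip()
        kind = para.get("Type")
        if kind == "Scene Heading":
            scenes.append((text, Counter()))
            speaker = None
        elif kind == "Character":
            # Drop extensions such as (CONT'D) or (V.O.) from the cue.
            speaker = text.split("(")[0].strip()
        elif kind == "Dialogue" and speaker and scenes:
            scenes[-1][1][speaker] += 1
    return scenes
```

Because the speaker comes from the document's own paragraph types, dialogue cannot be attributed to a character who never has a cue in the scene.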
Document Ingest
Beyond scripts: ingest treatments, lookbooks, call sheets, one-liners, and shot lists. Codex normalizes them into the same structured schema so production documents stop being orphaned PDFs.
- PDF, DOCX, FDX, plain text, and structured CSV input
- Editorial & post: ALE, EDL, AAF, OTIO, FCP/Premiere/DaVinci XML, CDL, OMF
- Auto-detect document type (script, call sheet, shot list, treatment, timeline)
- Per-document confidence and validation report
- Re-ingest on revision with diff against prior version
Configurable Element Categories
The standard breakdown categories (cast, extras, vehicles, props, wardrobe, makeup, set dressing, animals, stunts, SFX) ship out of the box. Add your own — Codex learns from a handful of tagged examples.
- Pre-built categories aligned to industry-standard breakdown conventions
- Custom categories per production or per studio
- Few-shot tuning — examples in, model out
- Per-category confidence thresholds
Search, Diff & Export
Once parsed, every entity is searchable. Find every scene a character appears in. See what changed between draft 3 and draft 4. Export to FDX, CSV, or directly into ShotGrid via Fabric.
- Full-text + structured search across all ingested documents
- Draft-to-draft diff showing scene-level changes
- Export targets: FDX, CSV, JSON, OTIO, EDL, ShotGrid sync
- Webhook on parse completion for pipeline automation
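One plausible way to compute a scene-level draft diff is to hash each scene's normalized text and compare the hashes between drafts. The Python sketch below assumes scenes are already keyed by scene number; `fingerprint` and `diff_drafts` are hypothetical helpers, not Codex's API.

```python
import hashlib


def fingerprint(text):
    """Hash scene text after collapsing whitespace and case, so reflows don't count as edits."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def diff_drafts(old, new):
    """Compare two drafts given as {scene number: scene text} mappings."""
    old_fp = {k: fingerprint(v) for k, v in old.items()}
    new_fp = {k: fingerprint(v) for k, v in new.items()}
    shared = old_fp.keys() & new_fp.keys()
    return {
        "added": sorted(new_fp.keys() - old_fp.keys()),
        "removed": sorted(old_fp.keys() - new_fp.keys()),
        "changed": sorted(k for k in shared if old_fp[k] != new_fp[k]),
    }
```

Normalizing before hashing is a design choice: a scene that was only re-wrapped or re-spaced does not show up as changed, while any change in wording does.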
Inside a Codex Breakdown
What you actually get back when Codex finishes parsing a screenplay.
For every scene, Codex returns the slug line, location, time of day, page range, speaking characters with line counts, and every detected element grouped by category — props, vehicles, set dressing, wardrobe, SFX, stunts.
Each element carries a confidence score and the source page where it was found. Below your configured threshold, items go to a review queue rather than the production database — never silently wrong, always traceable.
A "document fingerprint" tracks revisions: when draft 4 lands, you see exactly which scenes changed, which characters were added or removed, and which props moved between scenes.
SCENE 14  INT. WAREHOUSE - NIGHT
page: 22-24  confidence: 0.96
characters:
  - SARAH (12 lines)
  - DET. KIM (4 lines)
  - GUARD (1 line, non-speaking flag)
props:
  - briefcase            page 22  conf 0.94
  - flashlight           page 23  conf 0.88
  - service weapon       page 24  conf 0.91
vehicles:
  - 1998 Ford Crown Vic  page 22  conf 0.79
sfx:
  - flickering overhead  page 23  conf 0.82
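The review-queue behavior described above can be sketched in a few lines of Python. This uses the element values from the Scene 14 sample; the `Element` type, the `route` function, the 0.80 default, and the 0.90 props threshold are assumptions chosen for illustration, not Codex defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    category: str
    name: str
    page: int
    confidence: float


DEFAULT_THRESHOLD = 0.80  # assumed fallback when a category has no threshold of its own


def route(elements, thresholds):
    """Split elements into (accepted, review) using per-category thresholds."""
    accepted, review = [], []
    for el in elements:
        limit = thresholds.get(el.category, DEFAULT_THRESHOLD)
        (accepted if el.confidence >= limit else review).append(el)
    return accepted, review


scene_14 = [
    Element("props", "briefcase", 22, 0.94),
    Element("props", "flashlight", 23, 0.88),
    Element("props", "service weapon", 24, 0.91),
    Element("vehicles", "1998 Ford Crown Vic", 22, 0.79),
    Element("sfx", "flickering overhead", 23, 0.82),
]

# A stricter threshold for props; every other category uses the default.
accepted, review = route(scene_14, {"props": 0.90})
```

With these settings the flashlight and the Crown Vic land in the review queue, and each still carries its source page so a reviewer can check it against the script.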
What Codex Talks To
Codex doesn't live alone — it writes into the systems your production already uses.
PDF / DOCX
OCR fallback for image-only PDFs. Handles standard screenplay formatting, treatments, and call sheets.
Final Draft (FDX)
Native FDX parser preserves Final Draft's scene structure, dialogue, action lines, and revision marks.
Conductor
Breakdowns are written directly into the production database. Characters become production entities. Scenes drive ingest stages.
Fabric
Extracted entities feed Fabric's metadata graph, where they're normalized to the MovieLabs Ontology for Media Creation (OMC) and joined against ShotGrid records.
ShotGrid
Two-way sync via Fabric: Codex creates ShotGrid scenes and elements, and ShotGrid status updates flow back to the breakdown.
Webhooks
Fire on parse-complete, on revision-detected, or when a per-category confidence threshold is crossed. Connect anything that speaks HTTP.
Editorial & Post Formats
Codex doesn't stop at the script. It speaks the formats your editors, colorists, and sound team already use — so the breakdown stays linked to the cut, the color, and the mix.
ALE
Avid Log Exchange — dailies metadata with scene, take, roll, and source clip mappings. Reconciles to script breakdowns so every scene's coverage is linked to its production data.
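For readers unfamiliar with ALE, it is a tab-delimited text format with `Heading`, `Column`, and `Data` sections. The Python sketch below parses that layout and groups clips by scene, which is the kind of join needed to link coverage back to a breakdown. The sample rows and the `Scene`/`Take` column names are assumptions (real exports vary by system), and `parse_ale` and `clips_by_scene` are hypothetical helpers.

```python
from collections import defaultdict


def parse_ale(text):
    """Parse ALE text into (heading dict, list of row dicts keyed by column name)."""
    heading, columns, rows = {}, [], []
    section = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.strip() in ("Heading", "Column", "Data"):
            section = line.strip()
            continue
        fields = line.split("\t")
        if section == "Heading":
            heading[fields[0]] = fields[1] if len(fields) > 1 else ""
        elif section == "Column":
            columns = fields
        elif section == "Data":
            rows.append(dict(zip(columns, fields)))
    return heading, rows


def clips_by_scene(rows):
    """Group clip names by their Scene column (assumed to be present)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row.get("Scene", "")].append(row.get("Name", ""))
    return dict(grouped)


# Hypothetical sample, built with explicit tabs so the delimiters are visible.
SAMPLE_ALE = "\n".join([
    "Heading",
    "\t".join(["FIELD_DELIM", "TABS"]),
    "\t".join(["FPS", "24"]),
    "",
    "Column",
    "\t".join(["Name", "Scene", "Take", "Start", "End"]),
    "",
    "Data",
    "\t".join(["A001C003", "14", "1", "01:00:10:00", "01:00:14:12"]),
    "\t".join(["A001C004", "14", "2", "01:00:20:00", "01:00:26:00"]),
    "\t".join(["A002C001", "15", "1", "02:00:00:00", "02:00:08:00"]),
])
```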
EDL
CMX 3600 edit decision lists. Parses cut order, source timecodes, and transitions — stitching editorial timing back to scenes in the breakdown.
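A CMX 3600 event line carries an event number, reel, track, transition, and four timecodes (source in/out, record in/out). The Python sketch below parses one such line and converts timecode to frames. It is a simplified illustration: it assumes non-drop-frame timecode at an integer frame rate and ignores comment lines, and `parse_event` and `tc_to_frames` are hypothetical names.

```python
import re

TC = r"\d{2}:\d{2}:\d{2}[:;]\d{2}"
EVENT_RE = re.compile(
    rf"^(?P<num>\d{{3,6}})\s+(?P<reel>\S+)\s+(?P<track>\S+)\s+(?P<trans>\S+)"
    rf"(?:\s+(?P<dur>\d+))?"  # optional transition duration, e.g. on dissolves
    rf"\s+(?P<src_in>{TC})\s+(?P<src_out>{TC})\s+(?P<rec_in>{TC})\s+(?P<rec_out>{TC})\s*$"
)


def parse_event(line):
    """Return a dict of fields for an event line, or None for titles and comments."""
    m = EVENT_RE.match(line.strip())
    return m.groupdict() if m else None


def tc_to_frames(tc, fps=24):
    """Convert HH:MM:SS:FF to a frame count, assuming non-drop-frame timecode."""
    hours, minutes, seconds, frames = (int(p) for p in re.split(r"[:;]", tc))
    return (hours * 3600 + minutes * 60 + seconds) * fps + frames


# Hypothetical event line for illustration.
SAMPLE_EVENT = "001  A001C003 V     C        01:00:10:00 01:00:14:12 01:00:00:00 01:00:04:12"
```

The source and record durations of a cut should match, which makes a cheap validation check before editorial timing is stitched back to scenes.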
AAF
Advanced Authoring Format, the interchange format at the center of Avid workflows. Reads project bins, clip relationships, and edited sequences. Far richer than EDL: it captures the full editorial state Avid customers actually work in.
OTIO
OpenTimelineIO, created at Pixar and now hosted by the Academy Software Foundation. The emerging open standard for timeline interchange, aligned with the MovieLabs Ontology for Media Creation (OMC). Built to move timelines across Avid, Premiere, Resolve, Flame, and beyond.
FCP / Premiere / DaVinci XML
Three flavors of timeline XML covering the non-Avid editorial world. FCPXML from Apple, Premiere XML from Adobe, and DaVinci Resolve's FCP7-style XML — all parsed natively.
CDL
ASC Color Decision List — slope, offset, power, and saturation values from the colorist. Travels with EDLs and OTIO timelines so the color story stays connected end to end.
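The four CDL parameters map to a short formula: each channel is computed as (in × slope + offset) raised to power, then saturation is applied around a Rec. 709 luma value. The Python sketch below follows that commonly documented form, with clamping to the 0 to 1 range; `apply_cdl` is a hypothetical helper, and it shows what the parsed values mean rather than how Codex stores them.

```python
def _clamp(value):
    return min(max(value, 0.0), 1.0)


def apply_cdl(rgb, slope, offset, power, saturation=1.0):
    """Apply slope/offset/power per channel, then saturation, to one RGB triple in [0, 1]."""
    graded = []
    for value, s, o, p in zip(rgb, slope, offset, power):
        # Clamp before the power step so negative values are never exponentiated.
        graded.append(_clamp(value * s + o) ** p)
    # Rec. 709 luma weights, as used for the CDL saturation step.
    luma = 0.2126 * graded[0] + 0.7152 * graded[1] + 0.0722 * graded[2]
    return tuple(_clamp(luma + saturation * (c - luma)) for c in graded)
```

With a saturation of 0.0 all three channels collapse to the luma value, and with slope, power, and saturation at 1 and offset at 0 the pixel passes through unchanged.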
OMF
Open Media Framework — mixed audio sessions exported from Pro Tools and Logic. Embeds clip metadata, track layout, and basic effects state for sound post.
Built for Pre-Release Material
Scripts are some of the most sensitive documents on a production. Codex treats them that way.
On-Premise Option
For studios with a mandate against cloud LLMs, Codex runs against on-prem models with the same extraction pipeline. Scripts never leave your network.
RBAC + Audit
Every upload, every parse, every entity edit is audit-logged. Per-production access control inherits from Conductor.
Watermark & Provenance
Optional invisible watermarking for script PDFs that pass through the Codex pipeline. Watermarks are applied per recipient, so leaks are traceable.
Stop Re-Typing What's Already in the Script
Drop us a draft. We'll send back a structured breakdown so you can see exactly what Codex finds in your material.