Video duration support #534

@tombeckenham

Video duration support

TanStack AI currently accepts duration as a number, and each adapter includes a function that validates the value passed. The problem is that duration can be as complex as size is for images: many models accept strings such as "6s" or "auto", and some accept a range.

Problem

It's really hard for the consumer to know which durations are supported. Validating the duration at runtime doesn't help, because the consumer would still need all the logic to determine whether a duration is supported before making the call.

Possible Solutions

TanStack AI currently relies on TypeScript for model option validation, and I believe we should do the same for durations. We change the duration field to a generic type and let adapters define it on a per-model basis. This would prevent the user from passing an arbitrary number to the adapter. This is the first thing we should do, and it would be a breaking change.

We should then provide two functions on the adapter:

  • availableDurations - lists the available durations in some sort of structure
  • snapDuration - given seconds as input, return the closest matching valid duration

Design Decisions (2026-05-25)

Scope

This issue ships the typed-duration contract + availableDurations / snapDuration introspection helpers, applied to the two video adapters that exist today: FAL (lead, widest variety) and OpenAI Sora.

A Google Gemini Veo adapter is filed as a follow-up issue. There is no Veo adapter today — the Veo model entries in packages/typescript/ai-gemini/src/model-meta.ts are commented out (lines 827–940) and packages/typescript/ai-gemini/src/adapters/ has no video.ts. Building Veo after this PR lands means it ships directly on the new contract and never breaks.

availableDurations return shape

Tagged union, with a 'none' case for models that don't accept a duration (e.g. Minimax):

type DurationOptions<T extends string | number | undefined> =
  | { kind: 'discrete'; values: ReadonlyArray<NonNullable<T>> }
  | { kind: 'range'; min: number; max: number; step?: number; unit: 'seconds' }
  | { kind: 'mixed'; values: ReadonlyArray<NonNullable<T>>; range?: { min: number; max: number; step?: number } }
  | { kind: 'none' }
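
A consumer would typically narrow on kind before rendering a duration picker. The following is a minimal sketch under that assumption: describeDurations is a hypothetical helper, not part of the proposal, and the "step 1" default is borrowed from the snapDuration rules.

```typescript
// Copied from the proposal above so the sketch is self-contained.
type DurationOptions<T extends string | number | undefined> =
  | { kind: 'discrete'; values: ReadonlyArray<NonNullable<T>> }
  | { kind: 'range'; min: number; max: number; step?: number; unit: 'seconds' }
  | { kind: 'mixed'; values: ReadonlyArray<NonNullable<T>>; range?: { min: number; max: number; step?: number } }
  | { kind: 'none' }

// Hypothetical helper: turns the tagged union into a label for a UI.
function describeDurations<T extends string | number | undefined>(
  options: DurationOptions<T>,
): string {
  switch (options.kind) {
    case 'discrete':
      return `one of: ${options.values.join(', ')}`
    case 'range':
      return `${options.min}-${options.max} ${options.unit} (step ${options.step ?? 1})`
    case 'mixed': {
      const listed = `one of: ${options.values.join(', ')}`
      return options.range
        ? `${listed}, or ${options.range.min}-${options.range.max} seconds`
        : listed
    }
    case 'none':
      return 'duration is not configurable'
  }
}

const veoLabel = describeDurations<'4s' | '6s' | '8s'>({
  kind: 'discrete',
  values: ['4s', '6s', '8s'],
})
// veoLabel === 'one of: 4s, 6s, 8s'
```

Because every branch is keyed on kind, adding a new variant to the union later would surface as a compile error in switch statements like this one.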

duration field typing — strict, no number escape hatch

generateVideo({ duration }) accepts only the model-specific union, via VideoDurationForAdapter<TAdapter>. Examples:

  • openaiVideo('sora-2') → duration?: '4' | '8' | '12'
  • falVideo('fal-ai/kling-video/v1.6/standard/text-to-video') → duration?: '5' | '10'
  • falVideo('fal-ai/veo3') → duration?: '4s' | '6s' | '8s'
  • falVideo('fal-ai/minimax/video-01') → field rejected (typed as never)
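
One way the per-model typing in the list above could work at the type level is sketched below. Every Sketch-prefixed name is hypothetical and exists only for illustration; the real VideoDurationForAdapter, adapter factories, and generateVideo signature may be shaped quite differently.

```typescript
// Hypothetical per-model duration map, using three models from the list above.
type SketchDurationByModel = {
  'sora-2': '4' | '8' | '12'
  'fal-ai/veo3': '4s' | '6s' | '8s'
  'fal-ai/minimax/video-01': never
}

type SketchModel = keyof SketchDurationByModel

// Hypothetical adapter shape that carries its model name in the type.
interface SketchAdapter<M extends SketchModel> {
  model: M
}

// Stand-in for VideoDurationForAdapter<TAdapter>: look the union up by model.
type SketchDurationFor<A extends SketchAdapter<SketchModel>> =
  SketchDurationByModel[A['model']]

function sketchAdapter<M extends SketchModel>(model: M): SketchAdapter<M> {
  return { model }
}

// Stand-in for generateVideo: duration is constrained by the adapter's model.
function sketchGenerateVideo<M extends SketchModel>(opts: {
  adapter: SketchAdapter<M>
  prompt: string
  duration?: SketchDurationByModel[M]
}): string {
  return `${opts.adapter.model}:${opts.duration ?? 'default'}`
}

const veo = sketchAdapter('fal-ai/veo3')
const picked: SketchDurationFor<typeof veo> = '6s'
const request = sketchGenerateVideo({ adapter: veo, prompt: 'a red kite', duration: picked })

// Passing duration: '5' with the veo adapter, or any duration with the
// minimax adapter, is expected to fail type checking in this sketch.
```

Typing the minimax entry as never means the optional field can only be omitted, which is how "field rejected" falls out of the lookup without a special case.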

Consumers who have raw seconds (e.g. from a UI slider) coerce explicitly:

const adapter = falVideo('fal-ai/veo3')
await generateVideo({ adapter, duration: adapter.snapDuration(7), prompt })  // '6s'

snapDuration(seconds) semantics

  • kind: 'none' → undefined
  • kind: 'discrete' → parse numeric-shaped entries ('5' → 5, '8s' → 8, 5 → 5), return the original-form entry whose numeric value is closest to seconds. For keyword-only sets (no numeric values), return values[0].
  • kind: 'range' → clamp to [min, max], round to step (default 1).
  • kind: 'mixed' → closest of (discrete numerics ∪ range values).
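
The rules above can be sketched as a standalone function. This is an illustration, not the adapter implementation: it takes the options as an argument instead of living on the adapter, and it makes three assumptions the issue does not state: ties go to the earliest entry (which matches the '6s' result for 7 seconds shown earlier), range steps are counted from min, and in the mixed case a discrete entry wins a tie against the range.

```typescript
type DurationValue = string | number
type SketchRange = { min: number; max: number; step?: number }
type SketchOptions =
  | { kind: 'discrete'; values: ReadonlyArray<DurationValue> }
  | { kind: 'range'; min: number; max: number; step?: number; unit: 'seconds' }
  | { kind: 'mixed'; values: ReadonlyArray<DurationValue>; range?: SketchRange }
  | { kind: 'none' }

// Parse numeric-shaped entries: 5 -> 5, '5' -> 5, '8s' -> 8. Keywords -> undefined.
function toSeconds(value: DurationValue): number | undefined {
  if (typeof value === 'number') return value
  const match = /^(\d+(?:\.\d+)?)s?$/.exec(value)
  return match ? Number(match[1]) : undefined
}

// Closest numeric-shaped entry, returned in its original form. First entry wins ties.
function closestDiscrete(values: ReadonlyArray<DurationValue>, seconds: number) {
  let best: { value: DurationValue; distance: number } | undefined
  for (const value of values) {
    const parsed = toSeconds(value)
    if (parsed === undefined) continue
    const distance = Math.abs(parsed - seconds)
    if (best === undefined || distance < best.distance) best = { value, distance }
  }
  return best
}

// Clamp to [min, max], then round to the nearest step (assumed to start at min).
function snapToRange(seconds: number, range: SketchRange): number {
  const step = range.step ?? 1
  const clamped = Math.min(range.max, Math.max(range.min, seconds))
  const snapped = range.min + Math.round((clamped - range.min) / step) * step
  return Math.min(range.max, snapped)
}

function snapDurationSketch(options: SketchOptions, seconds: number): DurationValue | undefined {
  switch (options.kind) {
    case 'none':
      return undefined
    case 'discrete':
      // Keyword-only sets have no numeric entries, so fall back to values[0].
      return closestDiscrete(options.values, seconds)?.value ?? options.values[0]
    case 'range':
      return snapToRange(seconds, options)
    case 'mixed': {
      const discrete = closestDiscrete(options.values, seconds)
      const ranged = options.range ? snapToRange(seconds, options.range) : undefined
      if (discrete === undefined) return ranged ?? options.values[0]
      if (ranged === undefined) return discrete.value
      return Math.abs(ranged - seconds) < discrete.distance ? ranged : discrete.value
    }
  }
}
```

With the veo3 set, snapDurationSketch({ kind: 'discrete', values: ['4s', '6s', '8s'] }, 7) returns '6s', because 6 and 8 are equally close and the earlier entry is kept.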

FAL per-model duration table (curated)

Model                                                 Duration shape            Category
fal-ai/kling-video/v1.6/{standard,pro}/text-to-video  '5' | '10'                discrete
fal-ai/pika/v2.2/text-to-video                        '5' | '10'                discrete
fal-ai/luma-dream-machine/ray-2                       '5s' | '9s'               discrete (keyword)
fal-ai/veo3 and fal-ai/veo3/image-to-video            '4s' | '6s' | '8s'        discrete (keyword)
fal-ai/wan-25-preview/text-to-video                   '2' … '15'                discrete (range-as-enum)
fal-ai/minimax/video-01                               -                         none
fal-ai/hunyuan-video-v1.5/text-to-video               uses num_frames instead   none (documented in modelOptions)

For FAL models outside the curated map, the type-level shape is still derived from @fal-ai/client's EndpointTypeMap, so autocomplete works; the runtime availableDurations returns { kind: 'none' } as the honest "we don't know" answer.
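
The runtime fallback described above could look roughly like the sketch below. The map name, the helper name, and the reduced two-variant options type are all hypothetical; only the three entries shown are taken from the table, and the real curated map would cover the rest.

```typescript
// Reduced options type for the sketch; the full union has more variants.
type CuratedOptions =
  | { kind: 'discrete'; values: ReadonlyArray<string> }
  | { kind: 'none' }

// Hypothetical hand-curated map, keyed by FAL endpoint id.
const CURATED_FAL_DURATIONS = new Map<string, CuratedOptions>([
  ['fal-ai/veo3', { kind: 'discrete', values: ['4s', '6s', '8s'] }],
  ['fal-ai/pika/v2.2/text-to-video', { kind: 'discrete', values: ['5', '10'] }],
  ['fal-ai/minimax/video-01', { kind: 'none' }],
])

// Unknown models get { kind: 'none' }: the runtime has no curated data for them,
// even though the type-level shape may still come from EndpointTypeMap.
function availableDurationsSketch(model: string): CuratedOptions {
  return CURATED_FAL_DURATIONS.get(model) ?? { kind: 'none' }
}
```

A Map is used instead of a plain object so that a model id can never collide with an inherited property name.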

Dependency on @tanstack/ai-schemas (#622)

This PR is built on top of #622. The OPENAI_VIDEO_DURATIONS and FAL duration data are sourced from the generated Zod schemas where available, with hand-curated fallbacks for what the schemas pipeline doesn't yet cover (notably FAL, which requires FAL_KEY to sync). When #622 merges into main, this PR will rebase + retarget to main.

Breaking change

Any caller currently passing duration: <number> will need to update — either pass the typed string union directly, or call adapter.snapDuration(seconds). Bumped as a major for @tanstack/ai, @tanstack/ai-fal, @tanstack/ai-openai.

Metadata

Labels: enhancement (New feature or request)

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions