
# Model reference

> Explore model-specific parameters for Runpod's Public Endpoints.

This page lists all available models for Runpod Public Endpoints, as well as the model-specific parameters you can use in your API calls. You can browse and test Public Endpoints using the [Runpod console](https://console.runpod.io/hub?tabSelected=public_endpoints).

<Warning>
  Output URLs (`image_url`, `video_url`, and `audio_url`) expire after 7 days. Download and store your generated files immediately if you need to keep them longer.
</Warning>
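
One way to follow this advice is to copy each output file to local storage as soon as a job completes. The sketch below uses only the Python standard library. The `save_output` helper and the `outputs` directory are hypothetical names chosen for illustration and are not part of the Runpod API.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def save_output(url: str, dest_dir: str = "outputs") -> Path:
    """Download a generated file to local storage before its URL expires."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Use the last path segment of the URL as the local file name.
    name = Path(urlparse(url).path).name or "output.bin"
    target = dest / name
    # Stream the response body to disk instead of loading it into memory.
    with urllib.request.urlopen(url) as resp, open(target, "wb") as fh:
        shutil.copyfileobj(resp, fh)
    return target
```

Call `save_output(url)` with the `image_url`, `video_url`, or `audio_url` value from a completed job, then move the file to whatever long-term storage you use.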

## Available models

The following models are currently available:

| Model                       | Description                                                                                                                                 | Endpoint URL                                                     | Type  | Price                                           |
| --------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------- | ----- | ----------------------------------------------- |
| **IBM Granite-4.0-H-Small** | A 32B parameter long-context instruct model.                                                                                                | `https://api.runpod.ai/v2/granite-4-0-h-small/`                  | Text  | \$0.01 per 1000 tokens                          |
| **Qwen3 32B AWQ**           | The latest LLM in the Qwen series, offering advancements in reasoning, instruction-following, agent capabilities, and multilingual support. | `https://api.runpod.ai/v2/qwen3-32b-awq/`                        | Text  | \$0.01 per 1000 tokens                          |
| **Flux Dev**                | Offers exceptional prompt adherence, high visual fidelity, and rich image detail.                                                           | `https://api.runpod.ai/v2/black-forest-labs-flux-1-dev/`         | Image | \$0.02 per megapixel                            |
| **Flux Schnell**            | Fastest and most lightweight FLUX model, ideal for local development, prototyping, and personal use.                                        | `https://api.runpod.ai/v2/black-forest-labs-flux-1-schnell/`     | Image | \$0.0024 per megapixel                          |
| **Flux Kontext Dev**        | A 12 billion parameter rectified flow transformer capable of editing images based on text instructions.                                     | `https://api.runpod.ai/v2/black-forest-labs-flux-1-kontext-dev/` | Image | \$0.03 per megapixel                            |
| **Qwen Image**              | Image generation foundation model with advanced text rendering.                                                                             | `https://api.runpod.ai/v2/qwen-image-t2i/`                       | Image | \$0.02 per megapixel                            |
| **Qwen Image LoRA**         | Image generation with LoRA support and advanced text rendering.                                                                             | `https://api.runpod.ai/v2/qwen-image-t2i-lora/`                  | Image | \$0.02 per megapixel                            |
| **Qwen Image Edit**         | Image editing with unique text rendering capabilities.                                                                                      | `https://api.runpod.ai/v2/qwen-image-edit/`                      | Image | \$0.02 per megapixel                            |
| **Seedream 4.0 T2I**        | New-generation image creation with unified generation and editing architecture.                                                             | `https://api.runpod.ai/v2/seedream-v4-t2i/`                      | Image | \$0.027 per megapixel                           |
| **Seedream 4.0 Edit**       | New-generation image editing with unified generation and editing architecture.                                                              | `https://api.runpod.ai/v2/seedream-v4-edit/`                     | Image | \$0.027 per megapixel                           |
| **Seedream 3.0**            | Native high-resolution bilingual image generation (Chinese-English).                                                                        | `https://api.runpod.ai/v2/seedream-3-0-t2i/`                     | Image | \$0.03 per megapixel                            |
| **Nano Banana Edit**        | Google's state-of-the-art image editing model.                                                                                              | `https://api.runpod.ai/v2/nano-banana-edit/`                     | Image | \$0.027 per megapixel                           |
| **InfiniteTalk**            | Audio-driven video generation model that creates talking or singing videos from a single image and audio input.                             | `https://api.runpod.ai/v2/infinitetalk/`                         | Video | \$0.25 per video generation                     |
| **Kling v2.1 I2V Pro**      | Professional-grade image-to-video with enhanced visual fidelity.                                                                            | `https://api.runpod.ai/v2/kling-v2-1-i2v-pro/`                   | Video | \$0.36 per 5 seconds of video                   |
| **Seedance 1.0 Pro**        | High-performance video generation with multi-shot storytelling.                                                                             | `https://api.runpod.ai/v2/seedance-1-0-pro/`                     | Video | \$0.62 per 5 seconds of video                   |
| **SORA 2 I2V**              | OpenAI's Sora 2 is a video and audio generation model.                                                                                      | `https://api.runpod.ai/v2/sora-2-i2v/`                           | Video | \$0.40 per video generation                     |
| **SORA 2 Pro I2V**          | OpenAI's Sora 2 Pro is a professional-grade video and audio generation model.                                                               | `https://api.runpod.ai/v2/sora-2-pro-i2v/`                       | Video | \$1.20 per video generation                     |
| **WAN 2.5**                 | Image-to-video generation model.                                                                                                            | `https://api.runpod.ai/v2/wan-2-5/`                              | Video | \$0.50 per 5 seconds of video                   |
| **WAN 2.2 I2V 720p LoRA**   | Open-source video generation with LoRA support.                                                                                             | `https://api.runpod.ai/v2/wan-2-2-t2v-720-lora/`                 | Video | \$0.35 per 5 seconds of video                   |
| **WAN 2.2 I2V 720p**        | Open-source AI video generation model that uses a diffusion transformer architecture for image-to-video generation.                         | `https://api.runpod.ai/v2/wan-2-2-i2v-720/`                      | Video | \$0.30 per 5 seconds of video                   |
| **WAN 2.2 T2V 720p**        | Open-source AI video generation model that uses a diffusion transformer architecture for text-to-video generation.                          | `https://api.runpod.ai/v2/wan-2-2-t2v-720/`                      | Video | \$0.30 per 5 seconds of video                   |
| **WAN 2.1 I2V 720p**        | Open-source AI video generation model that uses a diffusion transformer architecture for image-to-video generation.                         | `https://api.runpod.ai/v2/wan-2-1-i2v-720/`                      | Video | \$0.30 per 5 seconds of video                   |
| **WAN 2.1 T2V 720p**        | Open-source AI video generation model that uses a diffusion transformer architecture for text-to-video generation.                          | `https://api.runpod.ai/v2/wan-2-1-t2v-720/`                      | Video | \$0.30 per 5 seconds of video                   |
| **Whisper V3 Large**        | State-of-the-art automatic speech recognition.                                                                                              | `https://api.runpod.ai/v2/whisper-v3-large/`                     | Audio | \$0.05 per 1000 characters of audio transcribed |
| **Minimax Speech 02 HD**    | High-definition text-to-speech model.                                                                                                       | `https://api.runpod.ai/v2/minimax-speech-02-hd/`                 | Audio | \$0.05 per 1000 characters of audio generated   |
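
To get a rough sense of what a request will cost, you can apply the per-megapixel and per-token prices from the table directly. This is only an estimate under stated assumptions: it treats one megapixel as 1,000,000 pixels and ignores any rounding or minimum charges, none of which are specified on this page. The function names are illustrative.

```python
def image_cost(width: int, height: int, price_per_megapixel: float) -> float:
    """Estimate the cost of one image billed per megapixel."""
    # Assumption: 1 megapixel = 1,000,000 pixels, with no rounding or minimum charge.
    megapixels = (width * height) / 1_000_000
    return megapixels * price_per_megapixel


def text_cost(total_tokens: int, price_per_1000_tokens: float) -> float:
    """Estimate the cost of a text request billed per 1000 tokens."""
    return (total_tokens / 1000) * price_per_1000_tokens


if __name__ == "__main__":
    # Flux Dev at 1024x1024 with the listed price of $0.02 per megapixel.
    print(f"Flux Dev image: ${image_cost(1024, 1024, 0.02):.4f}")
    # Qwen3 32B AWQ with 1500 tokens at $0.01 per 1000 tokens.
    print(f"Qwen3 text: ${text_cost(1500, 0.01):.4f}")
```

Check the actual `cost` field in your responses or your billing page for authoritative figures.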

## Model-specific parameters

Each Public Endpoint accepts a different set of parameters to control the generation process.
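
The JSON examples in this section are request bodies: the model parameters go inside a top-level `input` object. As a minimal sketch, the helper below prepares such a request with the Python standard library. It assumes the synchronous `/runsync` operation and `Authorization: Bearer` authentication used by Runpod Serverless endpoints generally; neither is specified on this page, and `build_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request


def build_request(endpoint_url: str, input_params: dict, api_key: str) -> urllib.request.Request:
    """Wrap model parameters in an "input" object and prepare a POST request."""
    # Assumption: the synchronous "runsync" operation and Bearer authentication,
    # neither of which is documented on this page.
    url = endpoint_url.rstrip("/") + "/runsync"
    body = json.dumps({"input": input_params}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 300.0) -> dict:
    """Send the prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `send(build_request("https://api.runpod.ai/v2/black-forest-labs-flux-1-schnell/", {"prompt": "A quick sketch of a mountain"}, api_key))` would submit a Flux Schnell job, with `api_key` read from your own configuration.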

### Flux Dev

Flux Dev is optimized for high-quality, detailed image generation. The model accepts several parameters to control the generation process:

```json theme={null}
{
  "input": {
    "prompt": "A serene mountain landscape at sunset",
    "negative_prompt": "Snow",
    "width": 1024,
    "height": 1024,
    "num_inference_steps": 20,
    "guidance": 7.5,
    "seed": 42,
    "image_format": "png"
  }
}
```

| Parameter             | Type    | Required | Default | Range           | Description                                                                                  |
| --------------------- | ------- | -------- | ------- | --------------- | -------------------------------------------------------------------------------------------- |
| `prompt`              | string  | Yes      | -       | -               | Text description of the desired image.                                                       |
| `negative_prompt`     | string  | No       | -       | -               | Elements to exclude from the image.                                                          |
| `width`               | integer | No       | 1024    | 256-1536        | Image width in pixels. Must be divisible by 64.                                              |
| `height`              | integer | No       | 1024    | 256-1536        | Image height in pixels. Must be divisible by 64.                                             |
| `num_inference_steps` | integer | No       | 28      | 1-50            | Number of denoising steps.                                                                   |
| `guidance`            | float   | No       | 7.5     | 0.0-10.0        | How closely to follow the prompt.                                                            |
| `seed`                | integer | No       | -1      | -               | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
| `image_format`        | string  | No       | "jpeg"  | "png" or "jpeg" | Output format.                                                                               |
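
Checking parameters on the client side can catch out-of-range values before a request is sent. The sketch below encodes the defaults and ranges from the table above. The `flux_dev_input` function is a hypothetical helper, and the endpoint may apply its own validation that differs from this.

```python
from typing import Optional


def flux_dev_input(
    prompt: str,
    width: int = 1024,
    height: int = 1024,
    num_inference_steps: int = 28,
    guidance: float = 7.5,
    seed: int = -1,
    image_format: str = "jpeg",
    negative_prompt: Optional[str] = None,
) -> dict:
    """Build a Flux Dev request body, enforcing the documented ranges."""
    for name, value in (("width", width), ("height", height)):
        # Dimensions must be within 256-1536 and divisible by 64.
        if not 256 <= value <= 1536 or value % 64 != 0:
            raise ValueError(f"{name} must be 256-1536 and divisible by 64, got {value}")
    if not 1 <= num_inference_steps <= 50:
        raise ValueError("num_inference_steps must be between 1 and 50")
    if not 0.0 <= guidance <= 10.0:
        raise ValueError("guidance must be between 0.0 and 10.0")
    if image_format not in ("png", "jpeg"):
        raise ValueError('image_format must be "png" or "jpeg"')

    params = {
        "prompt": prompt,
        "width": width,
        "height": height,
        "num_inference_steps": num_inference_steps,
        "guidance": guidance,
        "seed": seed,
        "image_format": image_format,
    }
    # Only include the optional negative prompt when one is provided.
    if negative_prompt is not None:
        params["negative_prompt"] = negative_prompt
    return {"input": params}
```

Calling `flux_dev_input("A serene mountain landscape at sunset", width=1000)` raises a `ValueError` because 1000 is not divisible by 64.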

### Flux Schnell

Flux Schnell is optimized for speed and real-time applications:

```json theme={null}
{
  "input": {
    "prompt": "A quick sketch of a mountain",
    "width": 1024,
    "height": 1024,
    "num_inference_steps": 4,
    "guidance": 1.0,
    "seed": 123
  }
}
```

| Parameter             | Type    | Required | Default | Range           | Description                                                                                  |
| --------------------- | ------- | -------- | ------- | --------------- | -------------------------------------------------------------------------------------------- |
| `prompt`              | string  | Yes      | -       | -               | Text description of the desired image.                                                       |
| `negative_prompt`     | string  | No       | -       | -               | Elements to exclude from the image.                                                          |
| `width`               | integer | No       | 1024    | 256-1536        | Image width in pixels. Must be divisible by 64.                                              |
| `height`              | integer | No       | 1024    | 256-1536        | Image height in pixels. Must be divisible by 64.                                             |
| `num_inference_steps` | integer | No       | 4       | 1-8             | Number of denoising steps.                                                                   |
| `guidance`            | float   | No       | 7.5     | 0.0-10.0        | How closely to follow the prompt.                                                            |
| `seed`                | integer | No       | -1      | -               | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
| `image_format`        | string  | No       | "jpeg"  | "png" or "jpeg" | Output format.                                                                               |

<Warning>
  Flux Schnell is optimized for speed and works best with lower step counts. Using higher values may not improve quality significantly.
</Warning>

### IBM Granite-4.0-H-Small

IBM Granite-4.0-H-Small is a 32B parameter long-context instruct model.

```json theme={null}
{
  "input": {
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant. Please ensure responses are professional, accurate, and safe."
      },
      {
        "role": "user",
        "content": "What is Runpod?"
      }
    ],
    "sampling_params": {
      "max_tokens": 512,
      "temperature": 0.7,
      "seed": -1,
      "top_k": -1,
      "top_p": 1
    }
  }
}
```

| Parameter                     | Type    | Required | Default | Range   | Description                                                                        |
| ----------------------------- | ------- | -------- | ------- | ------- | ---------------------------------------------------------------------------------- |
| `messages`                    | array   | Yes      | -       | -       | Array of message objects with role and content.                                    |
| `sampling_params.max_tokens`  | integer | No       | 512     | -       | Maximum number of tokens to generate.                                              |
| `sampling_params.temperature` | float   | No       | 0.7     | 0.0-1.0 | Controls randomness in generation. Lower values make output more deterministic.    |
| `sampling_params.seed`        | integer | No       | -1      | -       | Seed for reproducible results. The default value (-1) will generate a random seed. |
| `sampling_params.top_k`       | integer | No       | -1      | -       | Restricts sampling to the top K most probable tokens.                              |
| `sampling_params.top_p`       | float   | No       | 1       | 0.0-1.0 | Nucleus sampling threshold.                                                        |
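
Unlike the image models, this endpoint nests its generation settings under `sampling_params`. A minimal sketch of assembling that structure is shown below, using the defaults and ranges from the table. The `granite_input` function is a hypothetical helper, not part of any Runpod SDK.

```python
from typing import Optional


def granite_input(
    user_message: str,
    system_message: Optional[str] = None,
    max_tokens: int = 512,
    temperature: float = 0.7,
    seed: int = -1,
    top_k: int = -1,
    top_p: float = 1.0,
) -> dict:
    """Build a chat request body with nested sampling parameters."""
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be between 0.0 and 1.0")

    messages = []
    # The system message is optional; the user message is always sent last.
    if system_message:
        messages.append({"role": "system", "content": system_message})
    messages.append({"role": "user", "content": user_message})

    return {
        "input": {
            "messages": messages,
            "sampling_params": {
                "max_tokens": max_tokens,
                "temperature": temperature,
                "seed": seed,
                "top_k": top_k,
                "top_p": top_p,
            },
        }
    }
```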

### Qwen3 32B AWQ

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.

| Parameter     | Type    | Required | Default | Range     | Description                                                                                        |
| ------------- | ------- | -------- | ------- | --------- | -------------------------------------------------------------------------------------------------- |
| `prompt`      | string  | Yes      | -       | -         | Prompt for text generation.                                                                        |
| `max_tokens`  | integer | No       | 512     | -         | Maximum number of tokens to output.                                                                |
| `temperature` | float   | No       | 0.7     | 0.0 - 1.0 | Randomness of the output. Lower temperature makes the output more predictable and deterministic.   |
| `top_p`       | float   | No       | -       | -         | Samples from the smallest set of words whose cumulative probability exceeds a given threshold (P). |
| `top_k`       | integer | No       | -       | 1-8       | Restricts sampling to the top K most probable words.                                               |
| `stop`        | string  | No       | -       | -         | Stops generation if the given string is encountered.                                               |

The Qwen3 endpoint is also fully compatible with vLLM and the OpenAI API, allowing you to use any of the parameters available in these frameworks. For more details, see [Send vLLM requests](/serverless/vllm/vllm-requests) and the [OpenAI API compatibility guide](/serverless/vllm/openai-compatibility).

Here are some examples of how to use the Qwen3 32B AWQ model with the OpenAI API:

<AccordionGroup>
  <Accordion title="OpenAI API request example">
    ```python theme={null}
    import os

    from openai import OpenAI

    PUBLIC_ENDPOINT_ID = "qwen3-32b-awq"
    model_name = "Qwen/Qwen3-32B-AWQ"

    # Read the API key from the environment (the variable name is an assumption).
    RUNPOD_API_KEY = os.environ["RUNPOD_API_KEY"]

    client = OpenAI(
        api_key=RUNPOD_API_KEY,
        base_url=f"https://api.runpod.ai/v2/{PUBLIC_ENDPOINT_ID}/openai/v1",
    )
    messages = [
        {
            "role": "system",
            "content": "You are a pirate chatbot who always responds in pirate speak!",
        },
        {
            "role": "user",
            "content": "Give me a short introduction to LLMs.",
        },
    ]

    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        max_tokens=525,
    )

    # Print the generated reply.
    print(response.choices[0].message.content)
    ```
  </Accordion>

  <Accordion title="OpenAI API streaming example">
    You can stream responses from the OpenAI API using the `stream` and `stream_options` parameters:

    ```python theme={null}
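    import os

    from openai import OpenAI

    PUBLIC_ENDPOINT_ID = "qwen3-32b-awq"
    model_name = "Qwen/Qwen3-32B-AWQ"

    # Read the API key from the environment (the variable name is an assumption).
    RUNPOD_API_KEY = os.environ["RUNPOD_API_KEY"]

    client = OpenAI(
        api_key=RUNPOD_API_KEY,
        base_url=f"https://api.runpod.ai/v2/{PUBLIC_ENDPOINT_ID}/openai/v1",
    )
    messages = [
        {
            "role": "system",
            "content": "You are a pirate chatbot who always responds in pirate speak!",
        },
        {
            "role": "user",
            "content": "Give me a short introduction to LLMs.",
        },
    ]

    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        max_tokens=525,
        stream=True,
        stream_options={"include_usage": True},
    )

    # Print each piece of text as it arrives.
    for chunk in response:
        # With include_usage, the final chunk carries usage data and no choices.
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)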
    ```
  </Accordion>

  <Accordion title="Response format">
    ```json theme={null}
    {
      "delayTime": 25,
      "executionTime": 3153,
      "id": "sync-0f3288b5-58e8-46fd-ba73-53945f5e8982-u2",
      "output": [
        {
          "choices": [
            {
              "tokens": [
                "Large Language Models (LLMs) are AI systems trained to predict and understand human language. They learn patterns from vast amounts of text data, enabling them to generate responses, answer questions, and complete tasks in natural language. Key characteristics of LLMs include:\n1. Language Understanding\n- Can analyze and comprehend language structure, context, and nuances\n- Process both inputs and outputs in natural human language\n\n2. Pattern Recognition\n- Learn common phrases and relationships"
              ]
            }
          ],
          "cost": 0.0001,
          "usage": {
            "input": 10,
            "output": 100
          }
        }
      ],
      "status": "COMPLETED",
      "workerId": "pkej0t9bbyjrgy"
    }
    ```
  </Accordion>
</AccordionGroup>
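
The generated text in the response format above is nested under `output`, `choices`, and `tokens`. A minimal sketch for pulling it out is shown below. It assumes the response has already been parsed into a Python dictionary with the shape shown in the example, and the helper names are illustrative.

```python
def extract_text(response: dict) -> str:
    """Join all token strings from a completed response into one string."""
    if response.get("status") != "COMPLETED":
        raise RuntimeError(f"Job not completed: {response.get('status')}")
    parts = []
    for item in response.get("output", []):
        for choice in item.get("choices", []):
            parts.extend(choice.get("tokens", []))
    return "".join(parts)


def total_cost(response: dict) -> float:
    """Sum the "cost" field across all output items."""
    return sum(item.get("cost", 0.0) for item in response.get("output", []))


if __name__ == "__main__":
    sample = {
        "status": "COMPLETED",
        "output": [
            {
                "choices": [{"tokens": ["Large Language Models (LLMs) are AI systems."]}],
                "cost": 0.0001,
                "usage": {"input": 10, "output": 100},
            }
        ],
    }
    print(extract_text(sample))
    print(total_cost(sample))
```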

### Qwen Image

Qwen Image is an image generation foundation model with advanced text rendering capabilities.

```json theme={null}
{
  "input": {
    "prompt": "A fashion-forward woman sitting at cobblestone street in Paris",
    "negative_prompt": "",
    "size": "1328*1328",
    "seed": -1,
    "enable_safety_checker": true
  }
}
```

| Parameter               | Type    | Required | Default      | Description                                                |
| ----------------------- | ------- | -------- | ------------ | ---------------------------------------------------------- |
| `prompt`                | string  | Yes      | -            | Text description of the desired image.                     |
| `negative_prompt`       | string  | No       | -            | Elements to exclude from the image.                        |
| `size`                  | string  | No       | "1024\*1024" | Image dimensions.                                          |
| `seed`                  | integer | No       | -1           | Seed for reproducible results. -1 generates a random seed. |
| `enable_safety_checker` | boolean | No       | true         | Enable content safety checking.                            |

### Qwen Image LoRA

Qwen Image with LoRA support allows you to customize generation with fine-tuned LoRA models.

```json theme={null}
{
  "input": {
    "prompt": "Real life Anime in a cozy kitchen",
    "loras": [
      {
        "path": "https://huggingface.co/flymy-ai/qwen-image-anime-irl-lora/resolve/main/flymy_anime_irl.safetensors",
        "scale": 1
      }
    ],
    "size": "1024*1024",
    "seed": -1,
    "enable_safety_checker": true
  }
}
```

| Parameter               | Type    | Required | Default      | Description                                                |
| ----------------------- | ------- | -------- | ------------ | ---------------------------------------------------------- |
| `prompt`                | string  | Yes      | -            | Text description of the desired image.                     |
| `loras`                 | array   | No       | \[]          | Array of LoRA configurations to apply.                     |
| `loras[].path`          | string  | Yes      | -            | URL or path to the LoRA model file.                        |
| `loras[].scale`         | number  | Yes      | -            | Scale factor for the LoRA influence (typically 0-1).       |
| `size`                  | string  | No       | "1024\*1024" | Image dimensions.                                          |
| `seed`                  | integer | No       | -1           | Seed for reproducible results. -1 generates a random seed. |
| `enable_safety_checker` | boolean | No       | true         | Enable content safety checking.                            |

### Seedream 3.0

Seedream 3.0 is a native high-resolution bilingual image generation model supporting both Chinese and English prompts.

```json theme={null}
{
  "input": {
    "prompt": "Hyper-realistic photograph of a deep-sea diver",
    "seed": -1,
    "guidance": 2,
    "size": "1024x1024"
  }
}
```

| Parameter  | Type    | Required | Default     | Description                                                |
| ---------- | ------- | -------- | ----------- | ---------------------------------------------------------- |
| `prompt`   | string  | Yes      | -           | Text description of the desired image.                     |
| `seed`     | integer | No       | -1          | Seed for reproducible results. -1 generates a random seed. |
| `guidance` | number  | No       | 2           | Guidance scale for generation control.                     |
| `size`     | string  | No       | "1024x1024" | Image dimensions.                                          |
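
Note that the examples on this page write image dimensions in two ways: Qwen Image and Qwen Image LoRA use `*` as the separator (for example, `"1024*1024"`), while Seedream 3.0 uses `x` (`"1024x1024"`). The sketch below keeps that difference in one place. The `SIZE_SEPARATORS` mapping is inferred only from the examples shown here and is not an official specification.

```python
# Separator per endpoint ID, inferred from the request examples on this page.
SIZE_SEPARATORS = {
    "qwen-image-t2i": "*",
    "qwen-image-t2i-lora": "*",
    "seedream-3-0-t2i": "x",
}


def format_size(endpoint_id: str, width: int, height: int) -> str:
    """Format a size string using the separator seen in that endpoint's example."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    try:
        separator = SIZE_SEPARATORS[endpoint_id]
    except KeyError:
        raise ValueError(f"No known size format for endpoint {endpoint_id!r}") from None
    return f"{width}{separator}{height}"
```

For example, `format_size("seedream-3-0-t2i", 1024, 1024)` returns `"1024x1024"`, while the same call for `"qwen-image-t2i"` returns `"1024*1024"`.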

### Seedream 4.0 T2I

Seedream 4.0 is a new-generation image creation model that integrates both generation and editing capabilities.

```json theme={null}
{
  "input": {
    "prompt": "American retro 1950s illustration style",
    "negative_prompt": "",
    "size": "2048*2048",
    "seed": -1,
    "enable_safety_checker": true
  }
}
```

| Parameter               | Type    | Required | Default      | Description                                                |
| ----------------------- | ------- | -------- | ------------ | ---------------------------------------------------------- |
| `prompt`                | string  | Yes      | -            | Text description of the desired image.                     |
| `negative_prompt`       | string  | No       | -            | Elements to exclude from the image.                        |
| `size`                  | string  | No       | "1024\*1024" | Image dimensions.                                          |
| `seed`                  | integer | No       | -1           | Seed for reproducible results. -1 generates a random seed. |
| `enable_safety_checker` | boolean | No       | true         | Enable content safety checking.                            |

### Nano Banana Edit

Google's Nano Banana Edit is a state-of-the-art image editing model that combines multiple source images.

```json theme={null}
{
  "input": {
    "prompt": "Combine these four source images into a single realistic 3D character figure scene",
    "images": [
      "https://image.runpod.ai/uploads/0bz_xzhuLq/a2166199-5bd5-496b-b9ab-a8bae3f73bdc.jpg",
      "https://image.runpod.ai/uploads/Yw86rhY6xi/2ff8435f-f416-4096-9a4d-2f8c838b2d53.jpg"
    ],
    "enable_safety_checker": true
  }
}
```

| Parameter               | Type    | Required | Default | Description                                                 |
| ----------------------- | ------- | -------- | ------- | ----------------------------------------------------------- |
| `prompt`                | string  | Yes      | -       | Editing instructions describing the desired transformation. |
| `images`                | array   | Yes      | -       | Array of image URLs to edit or combine.                     |
| `enable_safety_checker` | boolean | No       | true    | Enable content safety checking.                             |

### Qwen Image Edit

Qwen Image Edit extends Qwen Image's text rendering capabilities to image editing tasks, enabling precise text editing.

```json theme={null}
{
  "input": {
    "prompt": "change the trench coat and high heels color to light grey",
    "negative_prompt": "",
    "seed": -1,
    "image": "https://image.runpod.ai/asset/qwen/qwen-image-edit.png",
    "output_format": "png",
    "enable_safety_checker": true
  }
}
```

| Parameter               | Type    | Required | Default | Description                                                |
| ----------------------- | ------- | -------- | ------- | ---------------------------------------------------------- |
| `prompt`                | string  | Yes      | -       | Editing instructions describing the desired changes.       |
| `negative_prompt`       | string  | No       | -       | Elements to exclude from the edited image.                 |
| `seed`                  | integer | No       | -1      | Seed for reproducible results. -1 generates a random seed. |
| `image`                 | string  | Yes      | -       | URL of the image to edit.                                  |
| `output_format`         | string  | No       | "jpeg"  | Output format ("png" or "jpeg").                           |
| `enable_safety_checker` | boolean | No       | true    | Enable content safety checking.                            |

### Seedream 4.0 Edit

Seedream 4.0 Edit provides advanced image editing capabilities with the same unified architecture as Seedream 4.0 T2I.

```json theme={null}
{
  "input": {
    "prompt": "Dress the model in the clothes and hat",
    "images": [
      "https://image.runpod.ai/uploads/WiTaxr1AYF/2c15cbc9-9b03-4d59-bd60-ff3fa024b145.jpg"
    ],
    "size": "1024*1024",
    "enable_safety_checker": true
  }
}
```

| Parameter               | Type    | Required | Default      | Description                                                 |
| ----------------------- | ------- | -------- | ------------ | ----------------------------------------------------------- |
| `prompt`                | string  | Yes      | -            | Editing instructions describing the desired transformation. |
| `images`                | array   | Yes      | -            | Array of image URLs to edit or combine.                     |
| `size`                  | string  | No       | "1024\*1024" | Output image dimensions.                                    |
| `enable_safety_checker` | boolean | No       | true         | Enable content safety checking.                             |

### InfiniteTalk

InfiniteTalk is an audio-driven video generation model that creates talking or singing videos from a single image and audio input.

```json theme={null}
{
  "input": {
    "prompt": "a cartoon computer talking",
    "image": "https://image.runpod.ai/assets/meigen-ai/poddy.jpg",
    "audio": "https://image.runpod.ai/assets/meigen-ai/audio.wav",
    "size": "480p",
    "enable_safety_checker": true
  }
}
```

| Parameter               | Type    | Required | Default | Description                                                   |
| ----------------------- | ------- | -------- | ------- | ------------------------------------------------------------- |
| `prompt`                | string  | Yes      | -       | Text description of the desired video.                        |
| `image`                 | string  | Yes      | -       | URL of the source image to animate.                           |
| `audio`                 | string  | Yes      | -       | URL of the audio file to drive the animation.                 |
| `size`                  | enum    | Yes      | "480p"  | Output video resolution. Valid options are `480p` and `720p`. |
| `enable_safety_checker` | boolean | No       | true    | Enable content safety checking.                               |

### Kling v2.1 I2V Pro

Kling 2.1 Pro generates videos from static images with additional control parameters.

```json theme={null}
{
  "input": {
    "prompt": "A majestic magic dragon breathing fire over an ancient castle",
    "image": "https://image.runpod.ai/asset/kwaivgi/kling-v2-1-i2v-pro.png",
    "negative_prompt": "",
    "guidance_scale": 0.5,
    "duration": 5,
    "enable_safety_checker": true
  }
}
```

| Parameter               | Type    | Required | Default | Description                            |
| ----------------------- | ------- | -------- | ------- | -------------------------------------- |
| `prompt`                | string  | Yes      | -       | Text description of the desired video. |
| `image`                 | string  | Yes      | -       | URL of the source image to animate.    |
| `negative_prompt`       | string  | No       | -       | Elements to exclude from the video.    |
| `guidance_scale`        | float   | No       | 0.5     | How closely to follow the prompt.      |
| `duration`              | integer | No       | 5       | Video duration in seconds.             |
| `enable_safety_checker` | boolean | No       | true    | Enable content safety checking.        |

### Seedance 1.0 Pro

Seedance 1.0 Pro is a high-performance video generation model with multi-shot storytelling capabilities.

```json theme={null}
{
  "input": {
    "prompt": "A pristine Porsche 911 930 Turbo photographed in golden hour lighting",
    "duration": 5,
    "fps": 24,
    "size": "1920x1080",
    "image": ""
  }
}
```

| Parameter  | Type    | Required | Default     | Description                                              |
| ---------- | ------- | -------- | ----------- | -------------------------------------------------------- |
| `prompt`   | string  | Yes      | -           | Text description of the desired video scene.             |
| `duration` | integer | No       | 5           | Video duration in seconds.                               |
| `fps`      | integer | No       | 24          | Frames per second for the output video.                  |
| `size`     | string  | No       | "1920x1080" | Video dimensions.                                        |
| `image`    | string  | No       | ""          | Optional source image URL for image-to-video generation. |
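The sketch below summarizes what a Seedance request asks for, based only on the table above. It assumes that an empty `image` means text-to-video, and that multiplying `duration` by `fps` gives a nominal frame count; the model's actual output may differ. The helper name `describe_seedance_request` is hypothetical.

```python
def describe_seedance_request(inp: dict) -> dict:
    """Summarize a Seedance 1.0 Pro input using the defaults from the table."""
    # Size uses an "x" separator for this model, e.g. "1920x1080".
    width, height = (int(part) for part in inp.get("size", "1920x1080").split("x"))
    duration = inp.get("duration", 5)
    fps = inp.get("fps", 24)
    return {
        # Assumption: an empty or missing image selects text-to-video.
        "mode": "image-to-video" if inp.get("image") else "text-to-video",
        "width": width,
        "height": height,
        # Nominal count only: seconds multiplied by frames per second.
        "nominal_frames": duration * fps,
    }


summary = describe_seedance_request(
    {
        "prompt": "A pristine Porsche 911 930 Turbo photographed in golden hour lighting",
        "duration": 5,
        "fps": 24,
        "size": "1920x1080",
        "image": "",
    }
)
print(summary)
```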

### SORA 2 I2V

OpenAI's Sora 2 is a video and audio generation model.

```json theme={null}
{
  "input": {
    "prompt": "Action: The mech's headlamps flicker a few times. It then slowly and laboriously pushes itself up with a damaged mechanical arm, sparks flying from gaps in its armor. Ambient Sound: Distant, continuous explosions (low rumbles); the sizzle and crackle of short-circuiting electricity from within the mech; heavy, grinding metallic sounds as the mech rises; faint, garbled static from a damaged comms unit. Character Dialogue: (Processed mechanical voice, weary but firm) `No retreat.`",
    "image": "https://image.runpod.ai/assets/sora-2-i2v/example.jpeg",
    "duration": 4
  }
}
```

| Parameter  | Type    | Required | Default | Description                                                                                     |
| ---------- | ------- | -------- | ------- | ----------------------------------------------------------------------------------------------- |
| `prompt`   | string  | Yes      | -       | Text description of the desired video, including action, ambient sound, and character dialogue. |
| `image`    | string  | Yes      | -       | URL of the source image to animate.                                                             |
| `duration` | integer | Yes      | 4       | Video duration in seconds. Valid options: 4, 8, or 12.                                          |

### SORA 2 Pro I2V

OpenAI's Sora 2 Pro is a professional-grade video and audio generation model.

```json theme={null}
{
  "input": {
    "prompt": "Action: She opened her hands\n\nAmbient Sound: The soft crackling of the dying fire in the oven; a high-pitched, happy little ding sound from the timer; the warm, persistent sizzle of butter melting on a nearby stovetop.\n\nCharacter Dialogue: (Voice is high-pitched, bubbly, and enthusiastic) \"Welcome to my bakery!\"\n\n\n",
    "image": "https://image.runpod.ai/assets/sora-2-pro-i2v/example.jpeg",
    "size": "720p",
    "duration": 4
  }
}
```

| Parameter  | Type    | Required | Default | Description                                                                                     |
| ---------- | ------- | -------- | ------- | ----------------------------------------------------------------------------------------------- |
| `prompt`   | string  | Yes      | -       | Text description of the desired video, including action, ambient sound, and character dialogue. |
| `image`    | string  | Yes      | -       | URL of the source image to animate.                                                             |
| `size`     | string  | No       | "720p"  | Output video resolution.                                                                        |
| `duration` | integer | Yes      | 4       | Video duration in seconds. Valid options: 4, 8, or 12.                                          |
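Both Sora tables above restrict `duration` to 4, 8, or 12 seconds, so it can help to check the value before sending a request. This is a minimal client-side sketch; the function name `build_sora_input` is hypothetical, and `size` is included only when provided, since only the Sora 2 Pro table lists it.

```python
VALID_SORA_DURATIONS = (4, 8, 12)  # valid options listed in the tables above


def build_sora_input(prompt: str, image: str, duration: int, size=None) -> dict:
    """Build a request body for Sora 2 I2V or Sora 2 Pro I2V."""
    if not prompt or not image:
        raise ValueError("prompt and image are required")
    if duration not in VALID_SORA_DURATIONS:
        raise ValueError(f"duration must be one of {VALID_SORA_DURATIONS}, got {duration}")
    payload = {"prompt": prompt, "image": image, "duration": duration}
    if size is not None:
        # Only the Sora 2 Pro table documents a size parameter.
        payload["size"] = size
    return {"input": payload}


sora_body = build_sora_input(
    "Action: She opened her hands",
    "https://image.runpod.ai/assets/sora-2-pro-i2v/example.jpeg",
    duration=8,
    size="720p",
)
print(sora_body)
```

Calling `build_sora_input(..., duration=5)` raises a `ValueError` instead of producing a request the tables say is invalid.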

### Whisper V3 Large

Whisper V3 Large is a state-of-the-art automatic speech recognition model that transcribes audio to text.

```json theme={null}
{
  "input": {
    "prompt": "",
    "audio": "https://d1q70pf5vjeyhc.cloudfront.net/predictions/f981a3dca8204b14ab24151fa0532c26/1.mp3"
  }
}
```

| Parameter | Type   | Required | Default | Description                                        |
| --------- | ------ | -------- | ------- | -------------------------------------------------- |
| `prompt`  | string | No       | ""      | Optional context or prompt to guide transcription. |
| `audio`   | string | Yes      | -       | URL of the audio file to transcribe.               |
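The JSON bodies on this page are request payloads. The sketch below shows one way to wrap such a payload in an HTTP request using only the Python standard library. It assumes the `/runsync` route and Bearer-token `Authorization` header used by Runpod Serverless endpoints; `ENDPOINT_ID` and `YOUR_API_KEY` are placeholders, and `make_runsync_request` is a hypothetical helper. The request is built but not sent.

```python
import json
import os
import urllib.request


def make_runsync_request(endpoint_url: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Wrap a JSON payload in a POST request (assumed route: <endpoint>/runsync)."""
    return urllib.request.Request(
        endpoint_url.rstrip("/") + "/runsync",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = make_runsync_request(
    "https://api.runpod.ai/v2/ENDPOINT_ID/",  # placeholder endpoint URL
    {
        "input": {
            "prompt": "",
            "audio": "https://d1q70pf5vjeyhc.cloudfront.net/predictions/f981a3dca8204b14ab24151fa0532c26/1.mp3",
        }
    },
    api_key=os.environ.get("RUNPOD_API_KEY", "YOUR_API_KEY"),
)
print(request.get_method(), request.full_url)
# To send it, pass the request to urllib.request.urlopen(request).
```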

### Minimax Speech 02 HD

Minimax Speech 02 HD is a high-definition text-to-speech model with emotional control and voice customization.

```json theme={null}
{
  "input": {
    "prompt": "Welcome to our advanced text-to-speech system",
    "voice_id": "Wise_Woman",
    "speed": 1,
    "volume": 1,
    "pitch": 0,
    "emotion": "happy",
    "english_normalization": false,
    "default_audio_url": "https://d1q70pf5vjeyhc.cloudfront.net/predictions/f981a3dca8204b14ab24151fa0532c26/1.mp3"
  }
}
```

| Parameter               | Type    | Required | Default       | Description                               |
| ----------------------- | ------- | -------- | ------------- | ----------------------------------------- |
| `prompt`                | string  | Yes      | -             | Text to convert to speech.                |
| `voice_id`              | string  | No       | "Wise\_Woman" | Voice identifier for the desired voice.   |
| `speed`                 | number  | No       | 1             | Speech speed multiplier.                  |
| `volume`                | number  | No       | 1             | Volume level.                             |
| `pitch`                 | number  | No       | 0             | Pitch adjustment.                         |
| `emotion`               | string  | No       | "neutral"     | Emotion to convey (e.g., "happy", "sad"). |
| `english_normalization` | boolean | No       | false         | Enable English text normalization.        |
| `default_audio_url`     | string  | No       | ""            | Fallback audio URL.                       |
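Because every parameter except `prompt` is optional, a request only needs the values that differ from the defaults in the table above. The sketch below drops options that match those defaults, assuming the endpoint applies the documented default to any omitted field. The helper name `minimal_speech_input` is hypothetical.

```python
# Defaults copied from the Minimax Speech 02 HD parameter table.
MINIMAX_DEFAULTS = {
    "voice_id": "Wise_Woman",
    "speed": 1,
    "volume": 1,
    "pitch": 0,
    "emotion": "neutral",
    "english_normalization": False,
    "default_audio_url": "",
}


def minimal_speech_input(prompt: str, **options) -> dict:
    """Return a request body containing only non-default options."""
    if not prompt:
        raise ValueError("prompt is required")
    unknown = set(options) - set(MINIMAX_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    changed = {key: value for key, value in options.items() if value != MINIMAX_DEFAULTS[key]}
    return {"input": {"prompt": prompt, **changed}}


speech_body = minimal_speech_input(
    "Welcome to our advanced text-to-speech system",
    voice_id="Wise_Woman",  # same as the default, so it is omitted
    speed=1,                # same as the default, so it is omitted
    emotion="happy",        # differs from "neutral", so it is kept
)
print(speech_body)
```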

### Flux Kontext Dev

Flux Kontext Dev is a 12-billion-parameter model for editing images based on text instructions.

```json theme={null}
{
  "input": {
    "prompt": "Exact same bluebird, same angle and posture (wings folded, facing right), now perched on a small chunk of cloud floating in deep outer space",
    "negative_prompt": "",
    "seed": -1,
    "num_inference_steps": 28,
    "guidance": 2,
    "image": "https://image.runpod.ai/asset/black-forest-labs/black-forest-labs-flux-1-kontext-dev.png",
    "size": "1024*1024",
    "output_format": "png",
    "enable_safety_checker": true
  }
}
```

| Parameter               | Type    | Required | Default      | Range           | Description                                                                                  |
| ----------------------- | ------- | -------- | ------------ | --------------- | -------------------------------------------------------------------------------------------- |
| `prompt`                | string  | Yes      | -            | -               | Text instructions describing the desired edits to the image.                                 |
| `negative_prompt`       | string  | No       | ""           | -               | Elements to exclude from the edited image.                                                   |
| `image`                 | string  | Yes      | -            | -               | URL of the input image to edit.                                                              |
| `size`                  | string  | No       | "1024\*1024" | -               | Output image size in format "width\*height".                                                 |
| `num_inference_steps`   | integer | No       | 28           | 1-50            | Number of denoising steps.                                                                   |
| `guidance`              | float   | No       | 2            | 0.0-10.0        | How closely to follow the prompt.                                                            |
| `seed`                  | integer | No       | -1           | -               | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
| `output_format`         | string  | No       | "png"        | "png" or "jpeg" | Output image format.                                                                         |
| `enable_safety_checker` | boolean | No       | true         | -               | Whether to run safety checks on the output.                                                  |
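The Range column above can be enforced on the client before a request is sent. The following sketch checks a Flux Kontext Dev input against the documented ranges and the "width\*height" size format. The function name `validate_flux_kontext_input` is hypothetical, and the server may apply additional rules not listed here.

```python
import re


def validate_flux_kontext_input(inp: dict) -> list:
    """Return a list of problems found; an empty list means the input looks valid."""
    errors = []
    for key in ("prompt", "image"):
        if not inp.get(key):
            errors.append(f"{key} is required")
    if not 1 <= inp.get("num_inference_steps", 28) <= 50:
        errors.append("num_inference_steps must be between 1 and 50")
    if not 0.0 <= inp.get("guidance", 2) <= 10.0:
        errors.append("guidance must be between 0.0 and 10.0")
    if inp.get("output_format", "png") not in ("png", "jpeg"):
        errors.append('output_format must be "png" or "jpeg"')
    # Size is written as width*height, for example "1024*1024".
    if not re.fullmatch(r"\d+\*\d+", inp.get("size", "1024*1024")):
        errors.append('size must look like "width*height"')
    return errors


good = {
    "prompt": "Exact same bluebird, now perched on a small chunk of cloud",
    "image": "https://image.runpod.ai/asset/black-forest-labs/black-forest-labs-flux-1-kontext-dev.png",
    "num_inference_steps": 28,
    "guidance": 2,
    "size": "1024*1024",
    "output_format": "png",
}
bad = {**good, "num_inference_steps": 80, "output_format": "gif"}

print(validate_flux_kontext_input(good))
print(validate_flux_kontext_input(bad))
```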

### WAN 2.5

WAN 2.5 generates videos from static images.

```json theme={null}
{
  "input": {
    "prompt": "A stand-up comedian delivering a dad joke",
    "image": "https://image.runpod.ai/uploads/fccSIh7CTx/5abfc82d-44f4-4318-9518-7fdba0b285d9.png",
    "negative_prompt": "",
    "size": "1280*720",
    "duration": 5,
    "seed": -1,
    "enable_prompt_expansion": false,
    "enable_safety_checker": true
  }
}
```

| Parameter                 | Type    | Required | Default     | Description                                                |
| ------------------------- | ------- | -------- | ----------- | ---------------------------------------------------------- |
| `prompt`                  | string  | Yes      | -           | Text description of the desired video.                     |
| `image`                   | string  | Yes      | -           | URL of the source image to animate.                        |
| `negative_prompt`         | string  | No       | -           | Elements to exclude from the video.                        |
| `size`                    | string  | No       | "1280\*720" | Video dimensions.                                          |
| `duration`                | integer | No       | 5           | Video duration in seconds.                                 |
| `seed`                    | integer | No       | -1          | Seed for reproducible results. -1 generates a random seed. |
| `enable_prompt_expansion` | boolean | No       | false       | Automatically expand and enhance the prompt.               |
| `enable_safety_checker`   | boolean | No       | true        | Enable content safety checking.                            |
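A `seed` of -1 asks for a random seed, which makes a result hard to reproduce later. One option is to choose the seed on the client and record it alongside the request. The sketch below assumes the endpoint accepts non-negative 32-bit integers as seeds, which the table above does not confirm; `pin_seed` is a hypothetical helper.

```python
import random


def pin_seed(inp: dict, seed=None) -> dict:
    """Return a copy of the input with seed -1 replaced by an explicit value."""
    pinned = dict(inp)  # leave the caller's dictionary untouched
    if pinned.get("seed", -1) == -1:
        # Assumption: seeds in the range [0, 2**31) are accepted.
        pinned["seed"] = random.randrange(2**31) if seed is None else seed
    return pinned


request_input = {
    "prompt": "A stand-up comedian delivering a dad joke",
    "image": "https://image.runpod.ai/uploads/fccSIh7CTx/5abfc82d-44f4-4318-9518-7fdba0b285d9.png",
    "seed": -1,
}
pinned_input = pin_seed(request_input)
print(pinned_input["seed"])  # store this value to repeat the generation later
```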

### Wan 2.2 I2V 720p LoRA

Wan 2.2 is an open-source video generation model with LoRA support for customized camera movements and effects.

```json theme={null}
{
  "input": {
    "prompt": "orbit 180 around an astronaut on the moon",
    "image": "https://image.runpod.ai/asset/alibaba/wan-2-2-t2v-720-lora.png",
    "high_noise_loras": [
      {
        "path": "https://huggingface.co/ostris/wan22_i2v_14b_orbit_shot_lora/resolve/main/wan22_14b_i2v_orbit_high_noise.safetensors",
        "scale": 1
      }
    ],
    "low_noise_loras": [
      {
        "path": "https://huggingface.co/ostris/wan22_i2v_14b_orbit_shot_lora/resolve/main/wan22_14b_i2v_orbit_low_noise.safetensors",
        "scale": 1
      }
    ],
    "duration": 5,
    "seed": -1,
    "enable_safety_checker": true
  }
}
```

| Parameter                  | Type    | Required | Default | Description                                                |
| -------------------------- | ------- | -------- | ------- | ---------------------------------------------------------- |
| `prompt`                   | string  | Yes      | -       | Text description of the desired video motion.              |
| `image`                    | string  | Yes      | -       | URL of the source image to animate.                        |
| `high_noise_loras`         | array   | No       | \[]     | LoRA configurations for high-noise stages.                 |
| `high_noise_loras[].path`  | string  | Yes      | -       | URL or path to the LoRA model file.                        |
| `high_noise_loras[].scale` | number  | Yes      | -       | Scale factor for the LoRA influence.                       |
| `low_noise_loras`          | array   | No       | \[]     | LoRA configurations for low-noise stages.                  |
| `low_noise_loras[].path`   | string  | Yes      | -       | URL or path to the LoRA model file.                        |
| `low_noise_loras[].scale`  | number  | Yes      | -       | Scale factor for the LoRA influence.                       |
| `duration`                 | integer | No       | 5       | Video duration in seconds.                                 |
| `seed`                     | integer | No       | -1      | Seed for reproducible results. -1 generates a random seed. |
| `enable_safety_checker`    | boolean | No       | true    | Enable content safety checking.                            |
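The two LoRA arrays share the same item shape: each entry needs a `path` and a `scale`. The sketch below assembles them from `(path, scale)` pairs so that neither field can be left out. The helper names `lora_entry` and `build_wan_lora_input` are hypothetical.

```python
def lora_entry(path: str, scale: float) -> dict:
    """Build one LoRA item; both fields are required per the table above."""
    if not path:
        raise ValueError("path is required for every LoRA entry")
    return {"path": path, "scale": scale}


def build_wan_lora_input(prompt: str, image: str, high_noise=(), low_noise=(), **extra) -> dict:
    """Build a Wan 2.2 I2V 720p LoRA request body from (path, scale) pairs."""
    return {
        "input": {
            "prompt": prompt,
            "image": image,
            "high_noise_loras": [lora_entry(path, scale) for path, scale in high_noise],
            "low_noise_loras": [lora_entry(path, scale) for path, scale in low_noise],
            **extra,
        }
    }


BASE = "https://huggingface.co/ostris/wan22_i2v_14b_orbit_shot_lora/resolve/main/"
lora_body = build_wan_lora_input(
    "orbit 180 around an astronaut on the moon",
    "https://image.runpod.ai/asset/alibaba/wan-2-2-t2v-720-lora.png",
    high_noise=[(BASE + "wan22_14b_i2v_orbit_high_noise.safetensors", 1)],
    low_noise=[(BASE + "wan22_14b_i2v_orbit_low_noise.safetensors", 1)],
    duration=5,
)
print(lora_body["input"]["high_noise_loras"])
```

Omitting both pair lists yields empty arrays, which matches the documented default of `[]`.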

### Wan 2.2 I2V 720p

An open-source image-to-video generation model that creates 720p video content from static images.

```json theme={null}
{
  "input": {
    "prompt": "cinematic shot: slow-tracking camera glides parallel to a giant white origami boat as it gently drifts down a jade-green river",
    "image": "https://image.runpod.ai/asset/alibaba/wan-2-2-i2v-720.png",
    "num_inference_steps": 30,
    "guidance": 5,
    "negative_prompt": "",
    "size": "1280*720",
    "duration": 5,
    "flow_shift": 5,
    "seed": -1,
    "enable_prompt_optimization": false,
    "enable_safety_checker": true
  }
}
```

| Parameter                    | Type    | Required | Default     | Range    | Description                                                                                  |
| ---------------------------- | ------- | -------- | ----------- | -------- | -------------------------------------------------------------------------------------------- |
| `prompt`                     | string  | Yes      | -           | -        | Text description of the desired video motion and content.                                    |
| `image`                      | string  | Yes      | -           | -        | URL of the input image to animate.                                                           |
| `negative_prompt`            | string  | No       | ""          | -        | Elements to exclude from the generated video.                                                |
| `size`                       | string  | No       | "1280\*720" | -        | Video resolution in format "width\*height".                                                  |
| `num_inference_steps`        | integer | No       | 30          | 1-50     | Number of denoising steps.                                                                   |
| `guidance`                   | float   | No       | 5           | 0.0-10.0 | How closely to follow the prompt.                                                            |
| `duration`                   | integer | No       | 5           | -        | Video duration in seconds.                                                                   |
| `flow_shift`                 | integer | No       | 5           | -        | Controls the motion flow in the generated video.                                             |
| `seed`                       | integer | No       | -1          | -        | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
| `enable_prompt_optimization` | boolean | No       | false       | -        | Whether to automatically optimize the prompt.                                                |
| `enable_safety_checker`      | boolean | No       | true        | -        | Whether to run safety checks on the output.                                                  |

### Wan 2.2 T2V 720p

An open-source model for generating 720p videos from text prompts.

```json theme={null}
{
  "input": {
    "prompt": "A serene morning in an ancient forest, golden sunlight filtering through tall pine trees, creating dancing light patterns on the moss-covered ground",
    "num_inference_steps": 30,
    "guidance": 5,
    "negative_prompt": "",
    "size": "1280*720",
    "duration": 5,
    "flow_shift": 5,
    "seed": -1,
    "enable_prompt_optimization": false,
    "enable_safety_checker": true
  }
}
```

| Parameter                    | Type    | Required | Default     | Range    | Description                                                                                  |
| ---------------------------- | ------- | -------- | ----------- | -------- | -------------------------------------------------------------------------------------------- |
| `prompt`                     | string  | Yes      | -           | -        | Text description of the desired video content.                                               |
| `negative_prompt`            | string  | No       | ""          | -        | Elements to exclude from the generated video.                                                |
| `size`                       | string  | No       | "1280\*720" | -        | Video resolution in format "width\*height".                                                  |
| `num_inference_steps`        | integer | No       | 30          | 1-50     | Number of denoising steps.                                                                   |
| `guidance`                   | float   | No       | 5           | 0.0-10.0 | How closely to follow the prompt.                                                            |
| `duration`                   | integer | No       | 5           | -        | Video duration in seconds.                                                                   |
| `flow_shift`                 | integer | No       | 5           | -        | Controls the motion flow in the generated video.                                             |
| `seed`                       | integer | No       | -1          | -        | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
| `enable_prompt_optimization` | boolean | No       | false       | -        | Whether to automatically optimize the prompt.                                                |
| `enable_safety_checker`      | boolean | No       | true        | -        | Whether to run safety checks on the output.                                                  |

### Wan 2.1 I2V 720p

An open-source image-to-video generation model that converts static images into 720p videos.

```json theme={null}
{
  "input": {
    "prompt": "The family of three just took a selfie. They lean in together, smiling and relaxed. The daughter holds the phone and shows the screen",
    "image": "https://image.runpod.ai/asset/alibaba/wan-2-1-i2v-720.png",
    "num_inference_steps": 30,
    "guidance": 5,
    "negative_prompt": "",
    "size": "1280*720",
    "duration": 5,
    "flow_shift": 5,
    "seed": -1,
    "enable_prompt_optimization": false,
    "enable_safety_checker": true
  }
}
```

| Parameter                    | Type    | Required | Default     | Range    | Description                                                                                  |
| ---------------------------- | ------- | -------- | ----------- | -------- | -------------------------------------------------------------------------------------------- |
| `prompt`                     | string  | Yes      | -           | -        | Text description of the desired video motion and content.                                    |
| `image`                      | string  | Yes      | -           | -        | URL of the input image to animate.                                                           |
| `negative_prompt`            | string  | No       | ""          | -        | Elements to exclude from the generated video.                                                |
| `size`                       | string  | No       | "1280\*720" | -        | Video resolution in format "width\*height".                                                  |
| `num_inference_steps`        | integer | No       | 30          | 1-50     | Number of denoising steps.                                                                   |
| `guidance`                   | float   | No       | 5           | 0.0-10.0 | How closely to follow the prompt.                                                            |
| `duration`                   | integer | No       | 5           | -        | Video duration in seconds.                                                                   |
| `flow_shift`                 | integer | No       | 5           | -        | Controls the motion flow in the generated video.                                             |
| `seed`                       | integer | No       | -1          | -        | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
| `enable_prompt_optimization` | boolean | No       | false       | -        | Whether to automatically optimize the prompt.                                                |
| `enable_safety_checker`      | boolean | No       | true        | -        | Whether to run safety checks on the output.                                                  |

### Wan 2.1 T2V 720p

An open-source video generation model for creating 720p videos from text prompts.

```json theme={null}
{
  "input": {
    "prompt": "Steady rain falls on a bustling Tokyo street at night, neon signs casting vibrant pink and blue light that reflects and ripples across the wet black pavement",
    "num_inference_steps": 30,
    "guidance": 5,
    "negative_prompt": "",
    "size": "1280*720",
    "duration": 5,
    "flow_shift": 5,
    "seed": -1,
    "enable_prompt_optimization": false,
    "enable_safety_checker": true
  }
}
```

| Parameter                    | Type    | Required | Default     | Range    | Description                                                                                  |
| ---------------------------- | ------- | -------- | ----------- | -------- | -------------------------------------------------------------------------------------------- |
| `prompt`                     | string  | Yes      | -           | -        | Text description of the desired video content.                                               |
| `negative_prompt`            | string  | No       | ""          | -        | Elements to exclude from the generated video.                                                |
| `size`                       | string  | No       | "1280\*720" | -        | Video resolution in format "width\*height".                                                  |
| `num_inference_steps`        | integer | No       | 30          | 1-50     | Number of denoising steps.                                                                   |
| `guidance`                   | float   | No       | 5           | 0.0-10.0 | How closely to follow the prompt.                                                            |
| `duration`                   | integer | No       | 5           | -        | Video duration in seconds.                                                                   |
| `flow_shift`                 | integer | No       | 5           | -        | Controls the motion flow in the generated video.                                             |
| `seed`                       | integer | No       | -1          | -        | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
| `enable_prompt_optimization` | boolean | No       | false       | -        | Whether to automatically optimize the prompt.                                                |
| `enable_safety_checker`      | boolean | No       | true        | -        | Whether to run safety checks on the output.                                                  |
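The four Wan 720p tables above (Wan 2.2 and Wan 2.1, I2V and T2V) list the same optional parameters and defaults; the I2V variants additionally require `image`. The sketch below captures that in one builder. The names `build_wan_720p_input` and `parse_size` are hypothetical, and the defaults are copied from the tables rather than read from the API.

```python
# Optional parameters and defaults shared by the four Wan 720p tables.
WAN_720P_DEFAULTS = {
    "negative_prompt": "",
    "size": "1280*720",
    "num_inference_steps": 30,
    "guidance": 5,
    "duration": 5,
    "flow_shift": 5,
    "seed": -1,
    "enable_prompt_optimization": False,
    "enable_safety_checker": True,
}


def parse_size(size: str) -> tuple:
    """Split a "width*height" string into integers."""
    width, height = map(int, size.split("*"))
    return width, height


def build_wan_720p_input(prompt: str, image=None, **overrides) -> dict:
    """Build a body for a Wan 720p endpoint; pass image only for I2V models."""
    if not prompt:
        raise ValueError("prompt is required")
    unknown = set(overrides) - set(WAN_720P_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload = {"prompt": prompt, **WAN_720P_DEFAULTS, **overrides}
    if image is not None:
        payload["image"] = image
    return {"input": payload}


i2v_body = build_wan_720p_input(
    "The family of three just took a selfie.",
    image="https://image.runpod.ai/asset/alibaba/wan-2-1-i2v-720.png",
)
t2v_body = build_wan_720p_input("Steady rain falls on a bustling Tokyo street at night", guidance=6)
print(parse_size(i2v_body["input"]["size"]))
```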