# Sogni LLM API

OpenAI-compatible chat completions API with built-in Sogni Supernet tool integrations for AI image, video, and music generation.

**Base URL:** `https://api.sogni.ai`
**Staging URL:** `https://api-staging.sogni.ai`

---

## Authentication

All endpoints require authentication via one of:

| Method | Header | Format |
|--------|--------|--------|
| API Key (recommended) | `Authorization` | `Bearer <api-key>` |
| API Key (legacy) | `api-key` | `<api-key>` |
| JWT | `Authorization` | `Bearer <jwt-token>` |

```bash
# API key as Bearer token (recommended)
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY"

# Legacy header
curl https://api.sogni.ai/v1/chat/completions \
  -H "api-key: YOUR_API_KEY"
```

---

## Rate Limits

- **60 requests per 60 seconds** per authenticated user.
- Returns HTTP `429` with an OpenAI-compatible error when exceeded.
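
When you hit the limit, back off and retry. A minimal helper, as a sketch (the retry policy and helper name are illustrative, not part of the API; `send` is any zero-argument function that performs one request and returns a `requests`-style response object):

```python
import random
import time

def with_retry(send, max_attempts=4, base=1.0):
    """Call send() and retry on HTTP 429 with exponential backoff.

    Illustrative policy: sleep base*2**attempt seconds plus up to 250ms
    of jitter between attempts, then give up and return the last response.
    """
    for attempt in range(max_attempts):
        resp = send()
        if resp.status_code != 429:
            return resp
        time.sleep(base * 2 ** attempt + random.uniform(0, 0.25))
    return resp  # still 429 after max_attempts; caller inspects the error
```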

---

## Endpoints

### POST /v1/chat/completions

Create a chat completion. Fully compatible with the OpenAI Chat Completions API format.

#### Request Body

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `messages` | `array` | **Yes** | | Array of message objects (see [Message Format](#message-format)) |
| `model` | `string` | No | `qwen3.6-35b-a3b-gguf-iq4xs` | Model ID. Use `/v1/models` to list available models. |
| `stream` | `boolean` | No | `false` | Stream response as Server-Sent Events (SSE) |
| `max_tokens` | `integer` | No | `32768*` | Maximum tokens to generate. Clamped to the active model tier. |
| `temperature` | `number` | No | `0.7*` | Sampling temperature. Clamped to the active model tier. |
| `top_p` | `number` | No | `0.8*` | Nucleus sampling threshold. Clamped to the active model tier. |
| `top_k` | `integer` | No | `20*` | Top-k sampling cutoff. Clamped to the active model tier. |
| `min_p` | `number` | No | `0*` | Minimum probability filter. Clamped to the active model tier. |
| `repetition_penalty` | `number` | No | `1.0*` | Repetition penalty. Clamped to the active model tier. |
| `stop` | `string \| string[]` | No | | Stop sequence(s) |
| `frequency_penalty` | `number` | No | `0` | Frequency penalty (-2 to 2) |
| `presence_penalty` | `number` | No | `1.5*` | Presence penalty. Clamped to the active model tier. |
| `tools` | `array` | No | | Additional tool definitions (merged with Sogni tools) |
| `tool_choice` | `string \| object` | No | `auto` (when tools present) | `"auto"`, `"none"`, or `{"type":"function","function":{"name":"..."}}` |
| `sogni_tools` | `boolean` | No | `true` | Auto-inject Sogni tools (image/video/music generation). Set `false` to disable. |
| `sogni_tool_execution` | `boolean` | No | `true` (API key auth) | Auto-execute Sogni tool calls server-side. Set `false` to receive raw `tool_calls`. |
| `task_profile` | `string` | No | | Task routing hint: `"general"`, `"coding"`, or `"reasoning"` |
| `token_type` | `string` | No | `auto` | Billing token: `"spark"`, `"sogni"`, or `"auto"` (spark with sogni fallback) |
| `chat_template_kwargs` | `object` | No | | Per-request template/runtime flags. `{"enable_thinking": true}` enables thinking-capable Qwen3/Hermes behavior. |

The `token_type` field can also be set via the `X-Token-Type` HTTP header. The body field takes precedence over the header.

`*` The defaults above reflect the current default model, `qwen3.6-35b-a3b-gguf-iq4xs`. Explicit values are clamped to the active model tier.

#### Current Qwen3.6 Defaults

If you omit sampling fields, the server applies the current Qwen3.6 tier defaults:

- Default non-thinking preset: `max_tokens=32768`, `temperature=0.7`, `top_p=0.8`, `top_k=20`, `min_p=0`, `repetition_penalty=1.0`, `presence_penalty=1.5`
- Non-thinking + `task_profile: "reasoning"`: `temperature=1.0`, `top_p=1.0`, `top_k=40`, `min_p=0`, `repetition_penalty=1.0`, `presence_penalty=2.0`
- Thinking (`chat_template_kwargs.enable_thinking: true`): `temperature=1.0`, `top_p=0.95`, `top_k=20`, `min_p=0`, `repetition_penalty=1.0`, `presence_penalty=1.5`
- Thinking + `task_profile: "coding"`: `temperature=0.6`, `top_p=0.95`, `top_k=20`, `min_p=0`, `repetition_penalty=1.0`, `presence_penalty=0.0`
- When thinking is enabled and `task_profile` is `coding` or `reasoning`, omitted `max_tokens` defaults to `81920` instead of `32768`
- Current Qwen3.6 tier limits: `max_tokens 1-131072`, `temperature 0-2`, `top_p 0-1`, `top_k 1-128`, `min_p 0-1`, `repetition_penalty 0-2`, `frequency_penalty -2 to 2`, `presence_penalty -2 to 2`
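
The clamping behavior can be modeled client-side to predict what the server will actually use. A sketch using the tier limits above (illustrative only; the server's own clamping is authoritative):

```python
# Qwen3.6 tier limits, mirroring the table above.
QWEN36_LIMITS = {
    "max_tokens": (1, 131072),
    "temperature": (0, 2),
    "top_p": (0, 1),
    "top_k": (1, 128),
    "min_p": (0, 1),
    "repetition_penalty": (0, 2),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
}

def clamp_params(params: dict) -> dict:
    """Clamp any explicitly supplied sampling values into the tier ranges.

    Omitted fields are left omitted so the server applies its own defaults.
    """
    out = dict(params)
    for key, (lo, hi) in QWEN36_LIMITS.items():
        if key in out:
            out[key] = min(max(out[key], lo), hi)
    return out
```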

#### Message Format

Each message in the `messages` array has the following structure:

```json
{
  "role": "developer" | "system" | "user" | "assistant" | "tool",
  "content": "string or array of content parts",
  "tool_calls": [],
  "tool_call_id": "string"
}
```

**Content** can be:

- A plain string: `"Hello, how are you?"`
- An array of content parts (user messages only) for multimodal/vision input:

```json
[
  { "type": "text", "text": "What's in this image?" },
  { "type": "image_url", "image_url": { "url": "data:image/jpeg;base64,...", "detail": "auto" } }
]
```

The `detail` field on `image_url` is optional and accepts `"auto"`, `"low"`, or `"high"`.
`image_url.url` must be an inline base64-encoded JPEG or PNG `data:` URI; remote `http(s)` URLs are not allowed. Vision input limits:

- At most 20 vision images per request.
- Each image must be 10MB or smaller.
- Each image's longest side must be 1024px or less. This dimension cap applies only to the vision `image_url` path, not to media-generation tool image inputs.
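
These limits can be pre-checked client-side before sending a request. A minimal sketch (helper name is illustrative; it parses dimensions from the PNG IHDR chunk only and omits JPEG dimension parsing for brevity):

```python
import base64
import struct

def check_vision_image(data_uri: str, max_bytes=10 * 1024 * 1024, max_side=1024):
    """Validate a vision image_url data URI against the documented limits."""
    if not data_uri.startswith(("data:image/png;base64,", "data:image/jpeg;base64,")):
        raise ValueError("must be an inline base64 PNG or JPEG data: URI")
    raw = base64.b64decode(data_uri.split(",", 1)[1])
    if len(raw) > max_bytes:
        raise ValueError("image exceeds 10MB limit")
    if raw[:8] == b"\x89PNG\r\n\x1a\n":
        # PNG IHDR: bytes 16-24 hold big-endian width and height.
        width, height = struct.unpack(">II", raw[16:24])
        if max(width, height) > max_side:
            raise ValueError(f"longest side {max(width, height)}px exceeds {max_side}px")
    return raw
```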

**Compatibility notes:**
- The `"developer"` role (OpenAI o1+ format) is automatically mapped to `"system"`.
- When a `"developer"` message is present and `task_profile` is omitted, the server defaults it to `"coding"`.
- The legacy model ID `"qwen3.5-35b-a3b-gguf-q4km"` is automatically rewritten to `"qwen3.6-35b-a3b-gguf-iq4xs"`.
- `max_completion_tokens` (OpenAI SDK v1.26+) is automatically mapped to `max_tokens`.
- `temperature`, `top_p`, `top_k`, `min_p`, `repetition_penalty`, `frequency_penalty`, and `presence_penalty` are clamped to the selected model's tier constraints.
- Array content in non-user messages is flattened to a joined text string.

#### Response (non-streaming)

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1773353812,
  "model": "qwen3.6-35b-a3b-gguf-iq4xs",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 10,
    "total_tokens": 25
  }
}
```

#### Response (streaming)

When `stream: true`, the response is sent as Server-Sent Events (SSE):

```
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1773353812,"model":"qwen3.6-35b-a3b-gguf-iq4xs","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1773353812,"model":"qwen3.6-35b-a3b-gguf-iq4xs","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","created":1773353812,"model":"qwen3.6-35b-a3b-gguf-iq4xs","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```

Internal `<think>...</think>` reasoning blocks are automatically stripped from both streaming and non-streaming responses.
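
A minimal client-side parser for this stream, as a sketch (assumes decoded text lines, e.g. from `requests`' `iter_lines`; tool-call and usage deltas are ignored for brevity):

```python
import json

def read_sse_content(lines):
    """Accumulate assistant text from SSE lines like the ones above.

    Skips blank keep-alive lines, stops at the [DONE] sentinel, and
    joins the content fragments from each delta chunk.
    """
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```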

---

### GET /v1/models

List all available models.

#### Response

```json
{
  "object": "list",
  "data": [
    {
      "id": "qwen3.6-35b-a3b-gguf-iq4xs",
      "object": "model",
      "created": 1776384000,
      "owned_by": "qwen"
    }
  ]
}
```

---

### GET /v1/models/:model_id

Get details for a specific model.

#### Response

```json
{
  "id": "qwen3.6-35b-a3b-gguf-iq4xs",
  "object": "model",
  "created": 1776384000,
  "owned_by": "qwen"
}
```

Returns `404` if the model is not found.

---

## Sogni Tools (Auto-Injected)

By default, every chat completion request is augmented with six Sogni Supernet tools for AI media generation. The LLM decides when to call these tools based on the user's message. Set `sogni_tools: false` to disable auto-injection.

Current auto-injected tools:

- `sogni_generate_image`
- `sogni_edit_image`
- `sogni_generate_video`
- `sogni_sound_to_video`
- `sogni_video_to_video`
- `sogni_generate_music`

### How Tool Calling Works

Tool calling follows the standard OpenAI function calling protocol:

```
1. Client sends message     →  "Generate an image of a sunset"
2. API injects Sogni tools  →  sogni_generate_image, sogni_edit_image, sogni_generate_video, sogni_sound_to_video, sogni_video_to_video, sogni_generate_music
3. LLM returns tool_calls   →  { name: "sogni_generate_image", arguments: { prompt: "..." } }
4. API or client executes tool →  Server-side by default for API-key auth, or client-side in manual mode
5. Tool result is fed back   →  By the API in auto mode, or by the client in manual mode
6. LLM gives final answer   →  "Here's your sunset image: ..."
```

With API-key authentication, the API auto-executes Sogni tool calls by default and runs the follow-up LLM rounds server-side. Set `sogni_tool_execution: false` to receive raw `tool_calls` and handle them manually. JWT-authenticated requests do not have API-key-backed server-side tool execution, so manual tool handling still applies there. The [Sogni Client SDK](https://github.com/Sogni-AI/sogni-client) (`sogni-client`) also provides built-in helpers for manual/client-side execution.

### Injected Tool Schemas

Below is a representative subset of the `tools` payload the server appends to every request (unless `sogni_tools: false`). The source of truth is [`src/data/sogni-tools.json`](../src/data/sogni-tools.json). This is useful for debugging tool selection, reproducing calls deterministically, or building custom tool routers. The excerpt below shows the legacy core tools; the full payload also includes `sogni_edit_image`, `sogni_sound_to_video`, and `sogni_video_to_video`.

```json
[
  {
    "type": "function",
    "function": {
      "name": "sogni_generate_image",
      "description": "Generate an image using AI image generation on the Sogni Supernet. Returns a URL to the generated image. Use this tool EVERY TIME the user asks to create, generate, draw, or make an image or picture. Do NOT generate URLs yourself — you MUST call this tool.",
      "parameters": {
        "type": "object",
        "properties": {
          "prompt": {
            "type": "string",
            "description": "Detailed text description of the image to generate. Be specific about style, composition, lighting, colors, and subject matter."
          },
          "negative_prompt": {
            "type": "string",
            "description": "Things to avoid in the generated image (e.g., \"blurry, low quality, distorted\")."
          },
          "width": {
            "type": "number",
            "description": "Image width in pixels. Must be a multiple of 16. Default: 1024. Max: 2048."
          },
          "height": {
            "type": "number",
            "description": "Image height in pixels. Must be a multiple of 16. Default: 1024. Max: 2048."
          },
          "model": {
            "type": "string",
            "description": "Image generation model to use.",
            "enum": ["flux1-schnell-fp8", "flux2-dev_fp8", "chroma-v.46-flash_fp8", "z_image_turbo_bf16"]
          },
          "steps": {
            "type": "number",
            "description": "Number of inference steps. Higher = better quality but slower. Default depends on model (4-50)."
          },
          "seed": {
            "type": "number",
            "description": "Random seed for reproducible generation. Use -1 for random."
          }
        },
        "required": ["prompt"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "sogni_generate_video",
      "description": "Generate a short video using AI video generation on the Sogni Supernet. Returns a URL to the generated video. Use this tool EVERY TIME the user asks to create, generate, or make a video, clip, or animation. Do NOT generate URLs yourself — you MUST call this tool. Write the prompt as a cohesive mini-scene in present tense, describing motion, camera movement, lighting, and atmosphere in flowing prose.",
      "parameters": {
        "type": "object",
        "properties": {
          "prompt": {
            "type": "string",
            "description": "Detailed text description of the video to generate. Write it as a flowing present-tense scene: describe the subject, action, camera movement, lighting, and atmosphere. Clear camera-to-subject relationship improves motion consistency. Be specific and vivid."
          },
          "negative_prompt": {
            "type": "string",
            "description": "Things to avoid in the generated video (e.g., \"blurry, low quality, distorted, watermark\")."
          },
          "width": {
            "type": "number",
            "description": "Video width in pixels. Default: 1920. Standard resolutions: 1920x1088 (landscape), 1088x1920 (portrait), 1280x720."
          },
          "height": {
            "type": "number",
            "description": "Video height in pixels. Default: 1088. Must be a multiple of 16."
          },
          "duration": {
            "type": "number",
            "description": "Video duration in seconds. Range: 1-20. Default: 5."
          },
          "fps": {
            "type": "number",
            "description": "Frames per second. Default: 24. Range: 1-60."
          },
          "model": {
            "type": "string",
            "description": "Video generation model to use. Prefer LTX-2.3 models: 'ltx23-22b-fp8_t2v_distilled' for text-to-video, 'ltx23-22b-fp8_i2v_distilled' for image-to-video."
          },
          "seed": {
            "type": "number",
            "description": "Random seed for reproducible generation. Use -1 for random."
          }
        },
        "required": ["prompt"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "sogni_generate_music",
      "description": "Generate a music track using AI music generation on the Sogni Supernet. Returns a URL to the generated audio file. Use this tool EVERY TIME the user asks to create, generate, compose, or make music, a song, a beat, or audio. Do NOT generate URLs yourself — you MUST call this tool.",
      "parameters": {
        "type": "object",
        "properties": {
          "prompt": {
            "type": "string",
            "description": "Description of the music to generate. Include genre, mood, tempo, instruments, and style. Can also include lyrics wrapped in [verse], [chorus], etc. tags."
          },
          "duration": {
            "type": "number",
            "description": "Duration of the generated music in seconds. Range: 10-600. Default: 30."
          },
          "bpm": {
            "type": "number",
            "description": "Beats per minute. Range: 30-300. Default: 120."
          },
          "keyscale": {
            "type": "string",
            "description": "Musical key and scale (e.g., \"C major\", \"A minor\", \"F# minor\", \"Bb major\"). Default: \"C major\"."
          },
          "timesignature": {
            "type": "string",
            "description": "Time signature. \"4\" for 4/4, \"3\" for 3/4, \"2\" for 2/4. Default: \"4\".",
            "enum": ["4", "3", "2"]
          },
          "model": {
            "type": "string",
            "description": "Music generation model. \"ace_step_1.5_turbo\" is the default and preferred model — highest quality output. \"ace_step_1.5_sft\" is an experimental model with lower fidelity but best lyric handling support.",
            "enum": ["ace_step_1.5_turbo", "ace_step_1.5_sft"]
          },
          "output_format": {
            "type": "string",
            "description": "Audio output format. Default: \"mp3\".",
            "enum": ["mp3", "flac", "wav"]
          },
          "seed": {
            "type": "number",
            "description": "Random seed for reproducible generation. Use -1 for random."
          }
        },
        "required": ["prompt"]
      }
    }
  }
]
```

If you pass your own `tools` array, the server merges Sogni tools into it, skipping any whose `function.name` already exists in your array (your definitions take precedence). If you provide `tool_choice`, it is passed through unchanged; otherwise it defaults to `"auto"` when Sogni tools are injected.
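
This merge rule can be modeled client-side, e.g. to predict the final tool list the model will see. A sketch (it mirrors the described behavior, not the server's actual code):

```python
def merge_sogni_tools(request_tools, sogni_tools):
    """Append Sogni tools, skipping names the client already defined."""
    taken = {t["function"]["name"] for t in request_tools}
    return request_tools + [
        t for t in sogni_tools if t["function"]["name"] not in taken
    ]
```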

### Tool Call Response Format

When the LLM decides to call a tool, the response has `finish_reason: "tool_calls"` and includes the tool calls in the message:

```json
{
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "sogni_generate_image",
              "arguments": "{\"prompt\":\"A sunset over a mountain lake\",\"width\":1024,\"height\":1024}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
```

### Sending Tool Results Back

If you set `sogni_tool_execution: false`, or if you are using JWT auth without API-key-backed server-side execution, send tool results back manually to continue the conversation. Include the full message history: the original messages, the assistant's `tool_calls` response, and a `tool` message with the result:

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Generate an image of a sunset"},
      {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "sogni_generate_image",
              "arguments": "{\"prompt\":\"A golden sunset over a mountain lake\",\"width\":1024,\"height\":1024}"
            }
          }
        ]
      },
      {
        "role": "tool",
        "tool_call_id": "call_abc123",
        "content": "{\"status\":\"completed\",\"url\":\"https://cdn.sogni.ai/images/abc123.png\",\"width\":1024,\"height\":1024}"
      }
    ]
  }'
```

The LLM will then respond with a natural language message incorporating the tool result, e.g. *"Here's your sunset image! I generated a golden sunset over a mountain lake..."*

### Multi-Round Tool Calling

The LLM may request multiple tool calls in a single response, or require multiple rounds (e.g., generate an image, then a video). Implement a loop that continues until `finish_reason` is no longer `"tool_calls"`:

```
while finish_reason == "tool_calls":
    execute each tool_call
    append assistant message (with tool_calls) to messages
    append tool result messages to messages
    send new request with updated messages
```

The recommended maximum is **5 rounds** to prevent runaway loops.
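
A runnable sketch of this loop (the `chat` and `execute_tool` callables are assumptions you supply: `chat(messages)` performs one `/v1/chat/completions` request and returns the first choice, and `execute_tool(name, args)` runs one tool and returns a JSON-serializable result):

```python
import json

MAX_ROUNDS = 5  # recommended cap from above

def run_tool_loop(messages, chat, execute_tool):
    """Drive the tool-calling loop until a non-tool_calls finish_reason."""
    for _ in range(MAX_ROUNDS):
        choice = chat(messages)
        if choice["finish_reason"] != "tool_calls":
            return choice["message"]["content"]
        messages.append(choice["message"])  # assistant turn with tool_calls
        for call in choice["message"]["tool_calls"]:
            result = execute_tool(
                call["function"]["name"],
                json.loads(call["function"]["arguments"]),  # arguments is a JSON string
            )
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    raise RuntimeError("tool loop exceeded MAX_ROUNDS")
```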

---

### sogni_generate_image

Generate an image using AI image generation on the Sogni Supernet.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt` | `string` | **Yes** | Detailed text description of the image to generate. Be specific about style, composition, lighting, colors, and subject matter. |
| `negative_prompt` | `string` | No | Things to avoid in the generated image (e.g. "blurry, low quality, distorted"). |
| `width` | `number` | No | Image width in pixels. Must be a multiple of 16. Default: 1024. Max: 2048. |
| `height` | `number` | No | Image height in pixels. Must be a multiple of 16. Default: 1024. Max: 2048. |
| `model` | `string` | No | Image generation model. One of: `flux1-schnell-fp8`, `flux2-dev_fp8`, `chroma-v.46-flash_fp8`, `z_image_turbo_bf16`. |
| `steps` | `number` | No | Number of inference steps. Higher = better quality but slower. Default depends on model (4-50). |
| `seed` | `number` | No | Random seed for reproducible generation. Use -1 for random. |

**Available image models:**

| Model | Description |
|-------|-------------|
| `flux1-schnell-fp8` | FLUX.1 Schnell — fast, general-purpose |
| `flux2-dev_fp8` | FLUX.2 Dev — higher quality, slower |
| `chroma-v.46-flash_fp8` | Chroma Flash — fast with vivid colors |
| `z_image_turbo_bf16` | Turbo — fastest generation |

**Example — trigger image generation:**

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Create a photorealistic image of a red fox sitting in a snowy forest at dawn"}
    ]
  }'
```

**Response:**

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1773353812,
  "model": "qwen3.6-35b-a3b-gguf-iq4xs",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "id": "call_img_001",
            "type": "function",
            "function": {
              "name": "sogni_generate_image",
              "arguments": "{\"prompt\":\"A photorealistic red fox sitting in a snowy pine forest at dawn, soft golden light filtering through the trees, snowflakes gently falling, detailed fur texture, cinematic composition\",\"width\":1024,\"height\":1024,\"model\":\"flux1-schnell-fp8\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 1559,
    "completion_tokens": 95,
    "total_tokens": 1654
  }
}
```

**Complete round-trip — execute the tool call and get the final response:**

```bash
# Step 1: Parse the tool_call from the response above
# Step 2: Execute the image generation via the Sogni Projects API or SDK
# Step 3: Send the result back to the LLM:

curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Create a photorealistic image of a red fox sitting in a snowy forest at dawn"},
      {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "id": "call_img_001",
            "type": "function",
            "function": {
              "name": "sogni_generate_image",
              "arguments": "{\"prompt\":\"A photorealistic red fox sitting in a snowy pine forest at dawn, soft golden light filtering through the trees, snowflakes gently falling, detailed fur texture, cinematic composition\",\"width\":1024,\"height\":1024,\"model\":\"flux1-schnell-fp8\"}"
            }
          }
        ]
      },
      {
        "role": "tool",
        "tool_call_id": "call_img_001",
        "content": "{\"status\":\"completed\",\"url\":\"https://cdn.sogni.ai/images/fox-snow.png\",\"width\":1024,\"height\":1024,\"model\":\"flux1-schnell-fp8\",\"seed\":42}"
      }
    ]
  }'
```

The LLM will respond with a natural language message describing the generated image and including the URL.

---

### sogni_edit_image

Generate an edited or reference-guided image using one or more input images on the Sogni Supernet.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt` | `string` | **Yes** | Describe the desired edit or new image while clearly stating what should be preserved from the provided reference images. |
| `source_image_url` | `string` | No | Primary image to edit or use as the main identity/composition reference. Supports inline base64-encoded PNG or JPEG `data:` URIs only; remote `http(s)` URLs are not allowed. |
| `reference_image_urls` | `string[]` | No | Additional reference images for identity, pose, clothing, style, or background guidance. Supports inline base64-encoded PNG or JPEG `data:` URIs only; remote `http(s)` URLs are not allowed. Combined with `source_image_url`, up to 6 images total are used. |
| `negative_prompt` | `string` | No | Things to avoid in the edited image. |
| `width` | `number` | No | Output image width in pixels. Must be a multiple of 16. |
| `height` | `number` | No | Output image height in pixels. Must be a multiple of 16. |
| `model` | `string` | No | Edit-capable image model. Current public schema includes `qwen_image_edit_2511_fp8_lightning`, `qwen_image_edit_2511_fp8`, `flux2_dev_fp8`, and `flux1-dev-kontext_fp8_scaled`. |
| `number_of_variations` | `number` | No | Number of edited image variations to generate. Range: 1-16. Default: 1. |
| `seed` | `number` | No | Random seed for reproducible generation. Use `-1` for random. |

**Notes:**

- Use this when the user wants image editing, likeness preservation, or multi-reference generation.
- Image inputs are explicit tool arguments. There is no hidden chat-session image index in the public API.
- Despite the `_url` field names, media-bearing tool inputs must be inline base64-encoded `data:` URIs.
- Tool image inputs accept PNG or JPEG only.
- With `number_of_variations > 1`, the API may return multiple result URLs.

---

### sogni_generate_video

Generate a short video using AI video generation on the Sogni Supernet.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt` | `string` | **Yes** | Detailed scene description in flowing present tense. Describe the subject, action, camera movement, lighting, and atmosphere. A clear camera-to-subject relationship improves motion consistency. |
| `negative_prompt` | `string` | No | Things to avoid (e.g. "blurry, low quality, distorted, watermark"). |
| `reference_image_url` | `string` | No | Optional starting image for image-to-video generation. Supports inline base64-encoded PNG or JPEG `data:` URIs only; remote `http(s)` URLs are not allowed. |
| `reference_image_end_url` | `string` | No | Optional ending image for keyframe interpolation. Supports inline base64-encoded PNG or JPEG `data:` URIs only; remote `http(s)` URLs are not allowed. |
| `reference_audio_identity_url` | `string` | No | Optional voice identity clip for LTX-2.3 text-to-video or image-to-video workflows. Supports inline base64-encoded MP3, M4A, or WAV `data:` URIs only; remote `http(s)` URLs are not allowed. |
| `audio_identity_strength` | `number` | No | How strongly to apply `reference_audio_identity_url`. Range: 0-10. |
| `first_frame_strength` | `number` | No | How strictly to match the starting frame when using `reference_image_end_url`. Range: 0-1. |
| `last_frame_strength` | `number` | No | How strictly to match the ending frame when using `reference_image_end_url`. Range: 0-1. |
| `width` | `number` | No | Video width in pixels. Default: 1920. Standard: 1920x1088 (landscape), 1088x1920 (portrait), 1280x720. |
| `height` | `number` | No | Video height in pixels. Default: 1088. Must be a multiple of 16. |
| `duration` | `number` | No | Video duration in seconds. Range: 1-20. Default: 5. |
| `fps` | `number` | No | Frames per second. Range: 1-60. Default: 24. |
| `model` | `string` | No | Video generation model. Prefer LTX-2.3 models: `ltx23-22b-fp8_t2v_distilled` (text-to-video) or `ltx23-22b-fp8_i2v_distilled` (image-to-video). |
| `number_of_variations` | `number` | No | Number of video variations to generate. Range: 1-16. Default: 1. |
| `seed` | `number` | No | Random seed for reproducible generation. Use -1 for random. |

**Prompt writing tips:**
- Write as a cohesive mini-scene in present tense
- Describe motion explicitly: *"the camera slowly pans right as..."*
- Include lighting and atmosphere: *"warm golden hour light, dust particles floating"*
- Keep prompts under ~200 words for best results

**Example — trigger video generation:**

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Generate a 5-second video of ocean waves crashing on a rocky shore at sunset"}
    ]
  }'
```

**Response:**

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1773353900,
  "model": "qwen3.6-35b-a3b-gguf-iq4xs",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "id": "call_vid_001",
            "type": "function",
            "function": {
              "name": "sogni_generate_video",
              "arguments": "{\"prompt\":\"Powerful ocean waves crash against dark volcanic rocks along a rugged coastline. The camera holds steady at a low angle as white foam erupts upward, catching the warm amber and pink light of a setting sun. Golden hour light paints the mist with warm tones as water cascades back down the rocks. Cinematic, slow motion feel, dramatic natural lighting.\",\"width\":1920,\"height\":1088,\"duration\":5,\"fps\":24}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 1562,
    "completion_tokens": 130,
    "total_tokens": 1692
  }
}
```

**Complete round-trip:**

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Generate a 5-second video of ocean waves crashing on a rocky shore at sunset"},
      {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "id": "call_vid_001",
            "type": "function",
            "function": {
              "name": "sogni_generate_video",
              "arguments": "{\"prompt\":\"Powerful ocean waves crash against dark volcanic rocks along a rugged coastline...\",\"width\":1920,\"height\":1088,\"duration\":5,\"fps\":24}"
            }
          }
        ]
      },
      {
        "role": "tool",
        "tool_call_id": "call_vid_001",
        "content": "{\"status\":\"completed\",\"url\":\"https://cdn.sogni.ai/videos/waves-sunset.mp4\",\"width\":1920,\"height\":1088,\"duration\":5,\"fps\":24}"
      }
    ]
  }'
```

---

### sogni_sound_to_video

Generate an audio-synchronized video from an explicit input audio clip.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt` | `string` | **Yes** | Describe the visuals to generate while letting the supplied audio drive timing and rhythm. |
| `reference_audio_url` | `string` | **Yes** | Audio file used to drive the video. Supports inline base64-encoded MP3, M4A, or WAV `data:` URIs only; remote `http(s)` URLs are not allowed. |
| `reference_image_url` | `string` | No | Optional image to use as the subject or first frame. Supports inline base64-encoded PNG or JPEG `data:` URIs only; remote `http(s)` URLs are not allowed. |
| `audio_start` | `number` | No | Start offset in seconds into the input audio. Default: 0. |
| `duration` | `number` | No | Output video duration in seconds. Range: 1-20. Default: 5. |
| `width` | `number` | No | Output video width in pixels. |
| `height` | `number` | No | Output video height in pixels. |
| `model` | `string` | No | Audio-driven video model. Current public schema includes `ltx23-22b-fp8_ia2v_distilled`, `ltx23-22b-fp8_a2v_distilled`, and `wan_v2.2-14b-fp8_s2v_lightx2v`. |
| `number_of_variations` | `number` | No | Number of video variations to generate. Range: 1-16. Default: 1. |
| `seed` | `number` | No | Random seed for reproducible generation. Use `-1` for random. |

**Notes:**

- Use this for music videos, lip-sync style clips, and audio-reactive visuals.
- This public API expects explicit inline audio/image data URIs. It does not rely on chat-session audio history.

---

### sogni_video_to_video

Transform an existing video using video-to-video or motion-transfer workflows.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt` | `string` | **Yes** | Describe the target appearance or transformation in present tense. |
| `reference_video_url` | `string` | **Yes** | Source video to transform. Supports inline base64-encoded MP4 or MOV/QuickTime `data:` URIs only; remote `http(s)` URLs are not allowed. |
| `negative_prompt` | `string` | No | Things to avoid in the generated video. |
| `control_mode` | `string` | No | One of `animate-move`, `animate-replace`, `canny`, `pose`, `depth`, or `detailer`. |
| `reference_image_url` | `string` | No | Optional reference image for animate workflows or pose-guided appearance control. Required for `animate-move` and `animate-replace`. Supports inline base64-encoded PNG or JPEG `data:` URIs only; remote `http(s)` URLs are not allowed. |
| `reference_audio_identity_url` | `string` | No | Optional voice identity clip for LTX-2.3 v2v workflows. Supports inline base64-encoded MP3, M4A, or WAV `data:` URIs only; remote `http(s)` URLs are not allowed. |
| `audio_identity_strength` | `number` | No | How strongly to apply `reference_audio_identity_url`. Range: 0-10. |
| `video_start` | `number` | No | Start offset in seconds into the source video. |
| `duration` | `number` | No | Output video duration in seconds. Range: 1-20. Default: 5. |
| `width` | `number` | No | Output video width in pixels. |
| `height` | `number` | No | Output video height in pixels. |
| `detailer_strength` | `number` | No | Optional detailer LoRA strength for LTX-2.3 control workflows. Range: 0-1. |
| `model` | `string` | No | Video-to-video model. Current public schema includes `ltx23-22b-fp8_v2v_distilled`, `wan_v2.2-14b-fp8_animate-move_lightx2v`, and `wan_v2.2-14b-fp8_animate-replace_lightx2v`. |
| `number_of_variations` | `number` | No | Number of video variations to generate. Range: 1-16. Default: 1. |
| `seed` | `number` | No | Random seed for reproducible generation. Use `-1` for random. |

**Notes:**

- `animate-move` and `animate-replace` are WAN animate workflows.
- `canny`, `pose`, `depth`, and `detailer` are LTX-2.3 v2v ControlNet workflows.
- This public API expects explicit inline video/image/audio data URIs instead of app-local file indices.

---

### sogni_generate_music

Generate a music track or song using AI music generation on the Sogni Supernet.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt` | `string` | **Yes** | Description of the music to generate. Include genre, mood, tempo, instruments, and style. |
| `lyrics` | `string` | No | Song lyrics to sing. Omit for instrumental music. |
| `language` | `string` | No | Lyrics language code, such as `en` or `es`. |
| `duration` | `number` | No | Duration in seconds. Range: 10-600. Default: 30. |
| `bpm` | `number` | No | Beats per minute. Range: 30-300. Default: 120. |
| `keyscale` | `string` | No | Musical key and scale (e.g. "C major", "A minor", "F# minor", "Bb major"). Default: "C major". |
| `timesignature` | `string` | No | Time signature: `"4"`, `"3"`, `"2"`, or `"6"`. Default: `"4"`. |
| `composer_mode` | `boolean` | No | Enable AI composer mode for richer arrangements. |
| `prompt_strength` | `number` | No | How closely the model should follow the prompt. Higher values increase prompt adherence. |
| `creativity` | `number` | No | Composition variation / temperature. Higher values are more creative. |
| `model` | `string` | No | Music generation model (see below). |
| `output_format` | `string` | No | Audio output format: `"mp3"` (default), `"flac"`, or `"wav"`. |
| `number_of_variations` | `number` | No | Number of audio variations to generate. Range: 1-16. Default: 1. |
| `seed` | `number` | No | Random seed for reproducible generation. Use `-1` for random. |

**Available music models:**

| Model | Description |
|-------|-------------|
| `ace_step_1.5_turbo` | Default and preferred model; highest-quality output |
| `ace_step_1.5_sft` | Experimental model with lower fidelity but best lyric handling support |

**Lyrics format:**

Wrap lyrics in song structure tags for models that support them (`ace_step_1.5_sft`):

```
[verse]
Walking through the city lights
Neon signs paint the night

[chorus]
We're alive, we're on fire
Dancing higher and higher

[bridge]
The music carries us away
```
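
Programmatically, this format is just tagged sections joined by blank lines, so it is easy to assemble from structured data. A small illustrative helper (the function is your own code, not part of the API):

```python
def format_lyrics(sections: dict) -> str:
    # Wrap each section's text in a [tag] header line, with blank lines
    # between sections, matching the structure-tag format shown above
    return "\n\n".join(f"[{tag}]\n{text}" for tag, text in sections.items())

lyrics = format_lyrics({
    "verse": "Walking through the city lights\nNeon signs paint the night",
    "chorus": "We're alive, we're on fire",
})
```

The resulting string can be passed directly as the `lyrics` parameter.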

**Example — trigger music generation:**

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Compose a 30-second upbeat jazz track with piano and saxophone, perfect for a coffee shop"}
    ]
  }'
```

**Response:**

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1773354000,
  "model": "qwen3.6-35b-a3b-gguf-iq4xs",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "id": "call_mus_001",
            "type": "function",
            "function": {
              "name": "sogni_generate_music",
              "arguments": "{\"prompt\":\"Upbeat smooth jazz instrumental, warm piano chords with a mellow saxophone melody, walking bass line, light brushed drums, coffee shop ambiance, feel-good morning vibes\",\"duration\":30,\"bpm\":120,\"keyscale\":\"Bb major\",\"model\":\"ace_step_1.5_turbo\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 1565,
    "completion_tokens": 110,
    "total_tokens": 1675
  }
}
```

**Complete round-trip:**

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Compose a 30-second upbeat jazz track with piano and saxophone"},
      {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "id": "call_mus_001",
            "type": "function",
            "function": {
              "name": "sogni_generate_music",
              "arguments": "{\"prompt\":\"Upbeat smooth jazz instrumental, warm piano chords with a mellow saxophone melody...\",\"duration\":30,\"bpm\":120,\"keyscale\":\"Bb major\",\"model\":\"ace_step_1.5_turbo\"}"
            }
          }
        ]
      },
      {
        "role": "tool",
        "tool_call_id": "call_mus_001",
        "content": "{\"status\":\"completed\",\"url\":\"https://cdn.sogni.ai/audio/jazz-coffee.mp3\",\"duration\":30,\"format\":\"mp3\"}"
      }
    ]
  }'
```

---

### Disabling Sogni Tools

Pass `sogni_tools: false` to prevent auto-injection. This is useful for plain text conversations or when you want full control over the tools available to the model:

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Explain how diffusion models work"}
    ],
    "sogni_tools": false
  }'
```

### Custom Tools Alongside Sogni Tools

Your custom tool definitions are merged with the auto-injected Sogni tools. If a custom tool has the same name as a Sogni tool, your version takes precedence.

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is the weather in San Francisco?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
          }
        }
      }
    ]
  }'
```

In this case, the model will have 7 tools available: `get_weather` (yours) plus the 6 auto-injected Sogni tools: `sogni_generate_image`, `sogni_edit_image`, `sogni_generate_video`, `sogni_sound_to_video`, `sogni_video_to_video`, and `sogni_generate_music`.
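
When the model calls your custom tool, you execute it yourself and send the result back as a `role: "tool"` message referencing the call's `id`, just as in the round-trip examples above. A hedged sketch of that dispatch step; `handle_tool_calls` and the `get_weather` handler are your own code, not part of the API:

```python
import json

def handle_tool_calls(tool_calls, handlers):
    """Run each tool call against a local handler and build the follow-up
    role: "tool" messages to append to the conversation."""
    messages = []
    for call in tool_calls:
        # Arguments arrive as a JSON-encoded string inside the tool call
        args = json.loads(call["function"]["arguments"])
        result = handlers[call["function"]["name"]](**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return messages
```

Append the returned messages after the assistant's `tool_calls` message and resend the conversation to get the model's final answer.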

### Forcing a Specific Tool

Use `tool_choice` to force the model to call a specific tool:

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "A cyberpunk city at night with neon signs and flying cars"}
    ],
    "tool_choice": {"type": "function", "function": {"name": "sogni_generate_image"}}
  }'
```

### Sogni Client SDK (Recommended)

For the best developer experience, use the [Sogni Client SDK](https://github.com/Sogni-AI/sogni-client), which provides built-in tool execution, progress tracking, and multi-round tool calling:

**Automatic tool execution (non-streaming):**

```javascript
import { SogniClient } from 'sogni-client';

const sogni = await SogniClient.createInstance({
  appId: 'my-app',
  network: 'fast',
  apiKey: 'YOUR_API_KEY',
});

const result = await sogni.chat.completions.create({
  model: 'qwen3.6-35b-a3b-gguf-iq4xs',
  messages: [
    { role: 'user', content: 'Generate an image of a sunset over mountains' },
  ],
  autoExecuteTools: true,   // Automatically execute Sogni tool calls
  maxToolRounds: 5,         // Max tool calling rounds
  onToolProgress: (toolCall, progress) => {
    console.log(`${toolCall.function.name}: ${progress.status} (${progress.percent}%)`);
  },
});

console.log(result.content);       // Final LLM response with image URL
console.log(result.toolHistory);   // Full tool execution history
```

> **Note:** `autoExecuteTools` is only supported for non-streaming requests. Combining `autoExecuteTools: true` with `stream: true` will throw an error. Use manual tool execution (shown below) for streaming.

**Manual tool execution (streaming):**

```javascript
const stream = await sogni.chat.completions.create({
  model: 'qwen3.6-35b-a3b-gguf-iq4xs',
  messages: [
    { role: 'user', content: 'Create a video of ocean waves at sunset' },
  ],
  stream: true,
});

for await (const chunk of stream) {
  if (chunk.content) process.stdout.write(chunk.content);
}

const result = stream.finalResult;
if (result && result.tool_calls?.length > 0) {
  const results = await sogni.chat.tools.executeAll(result.tool_calls, {
    onToolProgress: (tc, p) => console.log(`${tc.function.name}: ${p.status}`),
  });
  // results contains URLs and metadata for generated media
}
```

---

## Error Format

All errors are returned in OpenAI-compatible format:

```json
{
  "error": {
    "message": "Descriptive error message",
    "type": "error_type",
    "param": null,
    "code": "error_code"
  }
}
```

| HTTP Status | Type | Code | Description |
|-------------|------|------|-------------|
| 400 | `invalid_request_error` | `invalid_request_error` | Validation error (missing messages, etc.) |
| 401 | `authentication_error` | `authentication_error` | Missing or invalid authentication |
| 402 | `insufficient_quota` | `insufficient_quota` | Insufficient Spark/Sogni balance |
| 404 | `invalid_request_error` | `model_not_found` | Requested model does not exist |
| 429 | `rate_limit_error` | `rate_limit_exceeded` | Rate limit exceeded |
| 502 | `server_error` | `server_error` | Stream error |
| 503 | `server_error` | `server_error` | No LLM workers available |
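
Because every failure shares this envelope, client-side handling can branch on the status code alone: retry the transient codes (429, 502, 503) with backoff and surface the rest. A minimal sketch; the helper names are illustrative, not part of any SDK:

```python
import json

def should_retry(status: int) -> bool:
    # Rate limits and server-side failures are transient; other 4xx
    # request errors will not succeed on retry
    return status in (429, 502, 503)

def describe_error(status: int, body: str) -> str:
    # Pull message and code out of the OpenAI-compatible error envelope
    err = json.loads(body).get("error", {})
    return f"{status} {err.get('code')}: {err.get('message')}"

def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0):
    # Exponential backoff schedule: base, 2*base, 4*base, ... capped at `cap`
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

Sleeping for each value in `backoff_delays(n)` between retries keeps you comfortably under the 60-requests-per-minute limit.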

---

## cURL Examples

### Basic chat completion

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Explain quantum computing in one paragraph"}
    ]
  }'
```

### Coding assistant request

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.6-35b-a3b-gguf-iq4xs",
    "messages": [
      {"role": "developer", "content": "You are a careful coding assistant."},
      {"role": "user", "content": "Write a TypeScript function that validates an Ethereum address."}
    ],
    "chat_template_kwargs": { "enable_thinking": true }
  }'
```

When a `developer` message is present and `task_profile` is omitted, the API defaults `task_profile` to `coding`.

### With explicit model and parameters

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.6-35b-a3b-gguf-iq4xs",
    "messages": [
      {"role": "system", "content": "You are a helpful coding assistant."},
      {"role": "user", "content": "Write a Python function to check if a number is prime"}
    ],
    "max_tokens": 500,
    "temperature": 0.7,
    "top_k": 20,
    "task_profile": "coding",
    "chat_template_kwargs": { "enable_thinking": true }
  }'
```

### Streaming response

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "messages": [
      {"role": "user", "content": "Write a short poem about the ocean"}
    ],
    "stream": true
  }'
```
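
With `stream: true`, the body arrives as SSE `data:` lines, each carrying a chunk object with `choices[].delta`. A sketch of client-side parsing, assuming the standard OpenAI-compatible chunk shape and `data: [DONE]` terminator:

```python
import json

def iter_sse_content(lines):
    # Yield the text deltas from OpenAI-compatible SSE event lines
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content"):
            yield delta["content"]
```

Feed it the response's decoded lines (e.g. `resp.iter_lines()` from an HTTP client) and join the yielded pieces to reconstruct the full completion.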

### Vision / multimodal (image input)

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What do you see in this image?"},
          {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,...", "detail": "high"}}
        ]
      }
    ],
    "sogni_tools": false
  }'
```

### Multi-turn conversation

```bash
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a creative writing assistant."},
      {"role": "user", "content": "Write the opening line of a mystery novel."},
      {"role": "assistant", "content": "The letter arrived three days after the funeral, written in handwriting that could only belong to the dead man himself."},
      {"role": "user", "content": "Continue the story for two more paragraphs."}
    ]
  }'
```

### Specify billing token type

```bash
# Via request body
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello"}
    ],
    "token_type": "sogni",
    "sogni_tools": false
  }'

# Via header
curl https://api.sogni.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Token-Type: spark" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello"}
    ],
    "sogni_tools": false
  }'
```

### List available models

```bash
curl https://api.sogni.ai/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
```

### Get a specific model

```bash
curl https://api.sogni.ai/v1/models/qwen3.6-35b-a3b-gguf-iq4xs \
  -H "Authorization: Bearer YOUR_API_KEY"
```

---

## OpenAI SDK Compatibility

The API is compatible with the OpenAI Python and Node.js SDKs. Point the `base_url` to the Sogni API:

Standard OpenAI fields such as `temperature`, `top_p`, `presence_penalty`, and `max_tokens` can be passed normally. Sogni-specific fields such as `top_k`, `min_p`, `repetition_penalty`, `task_profile`, `sogni_tools`, `sogni_tool_execution`, `token_type`, and `chat_template_kwargs` go through `extra_body` in the Python SDK; the Node.js SDK has no `extra_body` option, so pass them as top-level request fields (the SDK forwards unrecognized fields in the request body).

### Python

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.sogni.ai/v1",
)

response = client.chat.completions.create(
    model="qwen3.6-35b-a3b-gguf-iq4xs",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
    temperature=1.0,
    top_p=0.95,
    extra_body={
        "top_k": 20,
        "task_profile": "reasoning",
        "chat_template_kwargs": {"enable_thinking": True},
    },
)

print(response.choices[0].message.content)
```

### Node.js

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.sogni.ai/v1',
});

const response = await client.chat.completions.create({
  model: 'qwen3.6-35b-a3b-gguf-iq4xs',
  messages: [
    { role: 'user', content: 'Hello!' },
  ],
  temperature: 1.0,
  top_p: 0.95,
  // Sogni-specific fields go at the top level; the Node SDK forwards
  // unrecognized fields in the request body
  top_k: 20,
  task_profile: 'reasoning',
  chat_template_kwargs: { enable_thinking: true },
});

console.log(response.choices[0].message.content);
```

### Python with streaming

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.sogni.ai/v1",
)

stream = client.chat.completions.create(
    model="qwen3.6-35b-a3b-gguf-iq4xs",
    messages=[
        {"role": "user", "content": "Write a haiku about programming"}
    ],
    stream=True,
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
```

### Python with vision

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.sogni.ai/v1",
)

response = client.chat.completions.create(
    model="qwen3.6-35b-a3b-gguf-iq4xs",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image"},
                {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}},
            ],
        }
    ],
    extra_body={"sogni_tools": False},
)

print(response.choices[0].message.content)
```
