# Ollama Chat API
Endpoint: `POST /api/chat`
## Request
```json
{
  "model": "llama3.2",
  "messages": [
    { "role": "system", "content": "string" },
    { "role": "user", "content": "string" },
    { "role": "assistant", "content": "string" },
    { "role": "tool", "content": "string" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "string",
        "description": "string",
        "parameters": {
          "type": "object",
          "properties": { "param": { "type": "string", "description": "..." } },
          "required": ["param"]
        }
      }
    }
  ],
  "think": false,
  "format": "json",
  "stream": true,
  "keep_alive": "5m",
  "options": {
    "temperature": 0.7,
    "num_ctx": 131072,
    "num_predict": 4096
  }
}
```
### Fields
| Field | Type | Required | Notes |
|--------------|---------|----------|------------------------------------------------|
| `model`      | string  | Yes      | Model name (e.g. `llama3.2`)                   |
| `messages` | array | Yes | Conversation history |
| `tools` | array | No | Function definitions (OpenAI-compatible format)|
| `think` | boolean | No | Enable chain-of-thought (thinking models only) |
| `format` | string | No | `"json"` or JSON schema for structured output |
| `stream` | boolean | No | Default: `true` |
| `keep_alive` | string | No | How long to keep model loaded. Default: `5m` |
| `options` | object | No | Model runtime parameters |
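A request body can be assembled from these fields in a few lines. The sketch below is a minimal helper (the function name `build_chat_request` is my own, not part of any client library); it sends only `model`, `messages`, and `stream`, omitting optional fields entirely rather than passing them as `null`.

```python
import json

def build_chat_request(model, messages, stream=True, tools=None, options=None):
    """Assemble a JSON body for POST /api/chat.

    Only `model` and `messages` are required; optional fields are
    left out of the payload when not provided.
    """
    payload = {"model": model, "messages": messages, "stream": stream}
    if tools is not None:
        payload["tools"] = tools
    if options is not None:
        payload["options"] = options
    return json.dumps(payload)
```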
## Streaming Response (NDJSON)
Each line is a standalone JSON object:
```json
{ "model": "...", "message": { "role": "assistant", "content": "partial text" }, "done": false }
```
The final line sets `done: true` and adds timing statistics (durations are reported in nanoseconds):
```json
{
  "model": "...",
  "message": { "role": "assistant", "content": "" },
  "done": true,
  "done_reason": "stop",
  "total_duration": 1234567890,
  "load_duration": 987654321,
  "prompt_eval_count": 50,
  "eval_count": 200,
  "eval_duration": 12345678
}
```
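Consuming the stream means concatenating the partial `message.content` chunks until a line with `done: true` arrives. A minimal parser, assuming the response body is available as an iterable of NDJSON lines (the function name `collect_stream` is illustrative):

```python
import json

def collect_stream(ndjson_lines):
    """Accumulate partial message.content chunks from an NDJSON stream.

    Returns the concatenated text plus the final stats object
    (the line where done is true).
    """
    parts = []
    for line in ndjson_lines:
        if not line.strip():
            continue  # tolerate blank keep-alive lines
        obj = json.loads(line)
        parts.append(obj.get("message", {}).get("content") or "")
        if obj.get("done"):
            return "".join(parts), obj
    raise ValueError("stream ended without a done=true line")
```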
### Tool Call Response
When the model decides to use a tool, `message.tool_calls` is set (content is empty/null):
```json
{
  "model": "...",
  "message": {
    "role": "assistant",
    "content": "",
    "tool_calls": [
      {
        "function": {
          "name": "get_weather",
          "arguments": { "location": "Berlin", "unit": "celsius" }
        }
      }
    ]
  },
  "done": false
}
```
Note: `tool_calls[].function.arguments` is an **object** (already parsed JSON), not a string.
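Because `arguments` arrives pre-parsed, a dispatcher can pass it straight to a local function as keyword arguments, with no `json.loads` step. A sketch, assuming a hypothetical registry `TOOLS` mapping tool names to callables:

```python
# Hypothetical registry: tool name -> local Python callable.
TOOLS = {
    "get_weather": lambda location, unit="celsius": f"20 {unit} in {location}",
}

def run_tool_calls(message):
    """Execute each tool call in an assistant message.

    Returns the role-tool messages to append to the conversation.
    `arguments` is already a dict, so it is unpacked directly.
    """
    results = []
    for call in message.get("tool_calls", []):
        fn = call["function"]
        result = TOOLS[fn["name"]](**fn["arguments"])
        results.append({"role": "tool", "content": str(result)})
    return results
```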
### done_reason Values
| Value | Meaning |
|---------------|----------------------------------|
| `stop` | Natural end of generation |
| `tool_calls` | Model triggered a tool call |
| `load` | Model was loaded |
| `unload` | Model was unloaded |
## Tool Result Message
After executing the tool, append the result to `messages` as a `tool`-role message and call the endpoint again:
```json
{
  "role": "tool",
  "content": "result text or JSON string"
}
```
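For the next request to make sense to the model, both the assistant's tool-call message and the tool result must appear in the history. A sketch of what `messages` looks like after one hypothetical round-trip:

```python
# Conversation history after one tool round-trip: the assistant's
# tool_calls message and the tool result are both appended before
# the next POST /api/chat request.
messages = [
    {"role": "user", "content": "Weather in Berlin?"},
    {"role": "assistant", "content": "",
     "tool_calls": [{"function": {"name": "get_weather",
                                  "arguments": {"location": "Berlin"}}}]},
    {"role": "tool", "content": "20 celsius in Berlin"},
]
```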
## Non-Streaming Response
When `stream` is `false`, the server returns one JSON object containing the complete `message.content` plus the same statistics fields as the final streaming line.
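A non-streaming call is the simplest way to use the endpoint. A stdlib-only sketch (the function name `chat_once` is illustrative; `http://localhost:11434` is Ollama's default address, adjust as needed):

```python
import json
import urllib.request

def chat_once(model, messages, host="http://localhost:11434"):
    """Send a non-streaming /api/chat request; return the single response object."""
    body = json.dumps({"model": model,
                       "messages": messages,
                       "stream": False}).encode("utf-8")
    req = urllib.request.Request(f"{host}/api/chat", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```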