Antrophic-Qwen3.6-Proxy/doc/ollama-api.md
2026-05-10 10:46:41 +02:00

Ollama Chat API

Endpoint: POST /api/chat

Request

{
  "model": "llama3.2",
  "messages": [
    { "role": "system",    "content": "string" },
    { "role": "user",      "content": "string" },
    { "role": "assistant", "content": "string" },
    { "role": "tool",      "content": "string" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "string",
        "description": "string",
        "parameters": {
          "type": "object",
          "properties": { "param": { "type": "string", "description": "..." } },
          "required": ["param"]
        }
      }
    }
  ],
  "think":     false,
  "format":    "json",
  "stream":    true,
  "keep_alive": "5m",
  "options": {
    "temperature": 0.7,
    "num_ctx":     131072,
    "num_predict": 4096
  }
}

Fields

| Field | Type | Required | Notes |
|---|---|---|---|
| model | string | Yes | Model name (e.g. qwen3.6:35b-a3b-q4_K_M) |
| messages | array | Yes | Conversation history |
| tools | array | No | Function definitions (OpenAI-compatible format) |
| think | boolean | No | Enable chain-of-thought (thinking models only) |
| format | string or object | No | "json", or a JSON schema object for structured output |
| stream | boolean | No | Default: true |
| keep_alive | string | No | How long to keep the model loaded. Default: 5m |
| options | object | No | Model runtime parameters |
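
As a rough illustration of the fields above, the following sketch assembles a request body and leaves out any optional field that was not supplied. The helper name `build_chat_request` and its keyword names are assumptions made for this example; they are not part of Ollama or of this project.

```python
import json


def build_chat_request(model, messages, *, tools=None, think=None,
                       fmt=None, stream=True, keep_alive=None, options=None):
    """Assemble a /api/chat body (hypothetical helper).

    Optional fields that are left as None are omitted from the body.
    """
    body = {"model": model, "messages": messages, "stream": stream}
    optional = {
        "tools": tools,
        "think": think,
        "format": fmt,          # "json" or a JSON schema object
        "keep_alive": keep_alive,
        "options": options,
    }
    # Compare against None so that an explicit False (e.g. think=False) is kept.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


payload = build_chat_request(
    "llama3.2",
    [{"role": "user", "content": "Hello"}],
    stream=False,
    options={"temperature": 0.7, "num_predict": 4096},
)
print(json.dumps(payload, indent=2))
```

Only `model` and `messages` are required, so the smallest valid call passes just those two arguments.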

Streaming Response (NDJSON)

Each line is a standalone JSON object:

{ "model": "...", "message": { "role": "assistant", "content": "partial text" }, "done": false }

Final line (done: true). It also carries timing and token statistics; durations are in nanoseconds:

{
  "model": "...",
  "message": { "role": "assistant", "content": "" },
  "done": true,
  "done_reason": "stop",
  "total_duration":      1234567890,
  "load_duration":       987654321,
  "prompt_eval_count":   50,
  "eval_count":          200,
  "eval_duration":       12345678
}
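
A minimal sketch of consuming such a stream, assuming the lines have already been read from the HTTP response as text. The function name `read_stream` and the sample data are made up for illustration. It joins the partial `content` values and stops at the first object with `done: true`.

```python
import json


def read_stream(lines):
    """Consume NDJSON lines and return (full_text, final_chunk).

    final_chunk is None if the stream ended without a done: true object.
    """
    parts = []
    final = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines between objects
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content") or "")
        if chunk.get("done"):
            final = chunk
            break
    return "".join(parts), final


# Illustrative sample data, not captured from a real server.
sample = [
    '{"model": "m", "message": {"role": "assistant", "content": "Hel"}, "done": false}',
    '{"model": "m", "message": {"role": "assistant", "content": "lo"}, "done": false}',
    '{"model": "m", "message": {"role": "assistant", "content": ""}, "done": true, '
    '"done_reason": "stop", "eval_count": 200, "eval_duration": 2000000000}',
]
text, final = read_stream(sample)
# eval_duration is in nanoseconds, so convert it to seconds first.
tokens_per_second = final["eval_count"] / (final["eval_duration"] / 1e9)
print(text, final["done_reason"], tokens_per_second)
```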

Tool Call Response

When the model decides to call a tool, message.tool_calls is set and content is empty or null:

{
  "model": "...",
  "message": {
    "role": "assistant",
    "content": "",
    "tool_calls": [
      {
        "function": {
          "name": "get_weather",
          "arguments": { "location": "Berlin", "unit": "celsius" }
        }
      }
    ]
  },
  "done": false
}

Note: tool_calls[].function.arguments is an object (already parsed JSON), not a string.
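
Because the arguments arrive as an object, they can be passed straight to a local function without a second JSON parse. The sketch below assumes a hypothetical tool registry (`TOOLS`) and a stand-in `get_weather` implementation that returns canned data. Neither comes from Ollama or from this project.

```python
def get_weather(location, unit="celsius"):
    """Hypothetical local tool that returns canned data for illustration."""
    return {"location": location, "unit": unit, "temperature": 21}


# Hypothetical registry mapping tool names to local callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(message, registry):
    """Run every tool requested in an assistant message.

    Returns a list of (name, result) pairs in the order the calls appear.
    """
    results = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        name, arguments = function["name"], function["arguments"]
        if name not in registry:
            results.append((name, {"error": f"unknown tool: {name}"}))
            continue
        # arguments is already a dict, so it can be unpacked directly.
        results.append((name, registry[name](**arguments)))
    return results


message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {"function": {"name": "get_weather",
                      "arguments": {"location": "Berlin", "unit": "celsius"}}}
    ],
}
results = run_tool_calls(message, TOOLS)
print(results)
```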

done_reason Values

| Value | Meaning |
|---|---|
| stop | Natural end of generation |
| tool_calls | Model triggered a tool call |
| load | Model was loaded |
| unload | Model was unloaded |

Tool Result Message

After receiving a tool call, run the tool and send its result back as a message with role tool:

{
  "role": "tool",
  "content": "result text or JSON string"
}
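
A small sketch of building that message, assuming the tool returned either a string or a JSON-serialisable value. The helper name `tool_result_message` is hypothetical. The example also assumes that the assistant message containing the tool call stays in the history ahead of the result, which is the usual convention for chat-style tool use.

```python
import json


def tool_result_message(result):
    """Wrap a tool result as a role "tool" message (hypothetical helper).

    Strings are passed through unchanged; other values are JSON-encoded.
    """
    content = result if isinstance(result, str) else json.dumps(result)
    return {"role": "tool", "content": content}


history = [
    {"role": "user", "content": "What is the weather in Berlin?"},
    {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {"function": {"name": "get_weather",
                          "arguments": {"location": "Berlin", "unit": "celsius"}}}
        ],
    },
]
history.append(tool_result_message({"temperature": 21, "unit": "celsius"}))
print(history[-1])
```

The extended `history` is then sent as `messages` in the next request so the model can use the tool output.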

Non-Streaming Response

With stream set to false, the response is a single JSON object with the same structure as the final streaming line, except that message.content holds the complete text.
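
A minimal sketch of a non-streaming call using only the Python standard library. The base URL `http://localhost:11434` is an assumption (the usual address of a local Ollama install), and the helper names `make_chat_request` and `chat_once` are invented for this example. Building the request is separated from sending it so the first part can be inspected without a running server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11434"  # assumption: default local Ollama address


def make_chat_request(payload, base_url=BASE_URL):
    """Build (but do not send) a POST /api/chat request with streaming disabled."""
    body = dict(payload, stream=False)  # copy the payload and force stream: false
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat_once(payload, base_url=BASE_URL, timeout=120):
    """Send the request and return the single response object as a dict."""
    request = make_chat_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


req = make_chat_request(
    {"model": "llama3.2", "messages": [{"role": "user", "content": "Hi"}]}
)
print(req.get_method(), req.full_url)
```

With a server running, `chat_once(payload)["message"]["content"]` would hold the full reply text.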