Ollama Chat API

Endpoint: POST /api/chat

Request

{
  "model": "llama3.2",
  "messages": [
    { "role": "system",    "content": "string" },
    { "role": "user",      "content": "string" },
    { "role": "assistant", "content": "string" },
    { "role": "tool",      "content": "string" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "string",
        "description": "string",
        "parameters": {
          "type": "object",
          "properties": { "param": { "type": "string", "description": "..." } },
          "required": ["param"]
        }
      }
    }
  ],
  "think":     false,
  "format":    "json",
  "stream":    true,
  "keep_alive": "5m",
  "options": {
    "temperature": 0.7,
    "num_ctx":     131072,
    "num_predict": 4096
  }
}
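
As a minimal sketch of how a body like the one above could be sent, the snippet below wraps a payload in a POST request using only the Python standard library. The base URL `http://localhost:11434` and the helper name `make_chat_request` are assumptions for illustration and do not come from this document; the request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumption: a local server on this address; adjust for your deployment.
BASE_URL = "http://localhost:11434"


def make_chat_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Wrap a chat payload in a POST request for /api/chat (nothing is sent yet)."""
    return urllib.request.Request(
        url=f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = make_chat_request({
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "stream": False,
})

# To actually send it (requires a running server):
# with urllib.request.urlopen(request) as response:
#     reply = json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call happens.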

Fields

Field	Type	Required	Notes
`model`	string	Yes	Model name (e.g. `qwen3.6:35b-a3b-q4_K_M`)
`messages`	array	Yes	Conversation history
`tools`	array	No	Function definitions (OpenAI-compatible format)
`think`	boolean	No	Enable chain-of-thought (thinking models only)
`format`	string or object	No	`"json"`, or a JSON schema object for structured output
`stream`	boolean	No	Default: `true`
`keep_alive`	string	No	How long to keep model loaded. Default: `5m`
`options`	object	No	Model runtime parameters
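
The required/optional split in the table could be enforced on the client side roughly as follows. This is a sketch, not part of the API: `build_chat_payload` is a hypothetical helper that rejects missing required fields and leaves out optional fields that were not supplied, so the server defaults apply.

```python
import json
from typing import Any, Optional


def build_chat_payload(
    model: str,
    messages: list[dict[str, Any]],
    *,
    tools: Optional[list[dict[str, Any]]] = None,
    stream: bool = True,
    keep_alive: Optional[str] = None,
    options: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a /api/chat body, omitting optional fields that were left unset."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    payload: dict[str, Any] = {"model": model, "messages": messages, "stream": stream}
    if tools is not None:
        payload["tools"] = tools
    if keep_alive is not None:
        payload["keep_alive"] = keep_alive
    if options is not None:
        payload["options"] = options
    return payload


payload = build_chat_payload(
    "llama3.2",
    [{"role": "user", "content": "Hello"}],
    stream=False,
    options={"temperature": 0.7},
)
body = json.dumps(payload).encode("utf-8")  # bytes ready to POST to /api/chat
```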

Streaming Response (NDJSON)

Each line is a standalone JSON object:

{ "model": "...", "message": { "role": "assistant", "content": "partial text" }, "done": false }

Final line (done is true; durations are in nanoseconds):

{
  "model": "...",
  "message": { "role": "assistant", "content": "" },
  "done": true,
  "done_reason": "stop",
  "total_duration":      1234567890,
  "load_duration":       987654321,
  "prompt_eval_count":   50,
  "eval_count":          200,
  "eval_duration":       12345678
}
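
A client might consume the stream along these lines: parse each line as JSON, append the partial content, and stop at the object whose done flag is true. This is a sketch; `collect_stream` is a hypothetical helper, and the sample lines are hand-written in the shape shown above rather than captured from a server.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[str]) -> tuple[str, dict]:
    """Join partial message content and return it with the final (done) object."""
    parts: list[str] = []
    final: dict = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between objects
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content") or "")
        if chunk.get("done"):
            final = chunk
            break
    return "".join(parts), final


# Hand-written sample lines, not real server output.
sample = [
    '{"model": "m", "message": {"role": "assistant", "content": "Hel"}, "done": false}',
    '{"model": "m", "message": {"role": "assistant", "content": "lo"}, "done": false}',
    '{"model": "m", "message": {"role": "assistant", "content": ""}, "done": true, "done_reason": "stop"}',
]
text, final = collect_stream(sample)
```

Because the function accepts any iterable of strings, it can be fed decoded lines from an HTTP response body as well as a list in a test.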

Tool Call Response

When the model decides to call a tool, message.tool_calls is populated and content is empty or null:

{
  "model": "...",
  "message": {
    "role": "assistant",
    "content": "",
    "tool_calls": [
      {
        "function": {
          "name": "get_weather",
          "arguments": { "location": "Berlin", "unit": "celsius" }
        }
      }
    ]
  },
  "done": false
}

Note: tool_calls[].function.arguments is an object (already parsed JSON), not a string.
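
Since the arguments arrive as an object, they can be passed directly to a local function as keyword arguments. The sketch below assumes a hypothetical registry `TOOLS` and a stub `get_weather` implementation; neither is part of the API, and the stub returns placeholder text rather than real weather data.

```python
from typing import Any, Callable


def get_weather(location: str, unit: str = "celsius") -> str:
    """Hypothetical local tool; a real one would query a weather service."""
    return f"stub weather for {location} in {unit}"


# Hypothetical registry mapping tool names to local callables.
TOOLS: dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def run_tool_calls(message: dict) -> list[str]:
    """Run every tool call in an assistant message and return the results as strings."""
    results: list[str] = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        handler = TOOLS.get(function["name"])
        if handler is None:
            results.append(f"error: unknown tool {function['name']}")
            continue
        # arguments is already a dict, so no json.loads step is needed here.
        results.append(str(handler(**function["arguments"])))
    return results


message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {"function": {"name": "get_weather",
                      "arguments": {"location": "Berlin", "unit": "celsius"}}}
    ],
}
results = run_tool_calls(message)
```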

done_reason Values

Value	Meaning
`stop`	Natural end of generation
`tool_calls`	Model triggered a tool call
`load`	Model was loaded
`unload`	Model was unloaded
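
One way a client could branch on these values is sketched below. The function name `next_action` and the action labels it returns are illustrative choices, not part of the API; treating load and unload as events to ignore is likewise an assumption about what a chat client would want.

```python
def next_action(final: dict) -> str:
    """Map a response object to a client-side action (labels are illustrative)."""
    if not final.get("done"):
        return "keep_reading"
    reason = final.get("done_reason")
    if reason == "stop":
        return "show_reply"
    if reason == "tool_calls":
        return "run_tools"
    if reason in ("load", "unload"):
        return "ignore"  # model lifecycle event; assumed to carry no reply
    return "unknown"  # any value not listed in the table above


action = next_action({"done": True, "done_reason": "tool_calls"})
```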

Tool Result Message

After receiving a tool call, run the tool and append its result to messages as a message with role tool:

{
  "role": "tool",
  "content": "result text or JSON string"
}
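
A sketch of how the result could be appended to the conversation before the next request: `tool_result_message` is a hypothetical helper that serializes non-string results to a JSON string, since content is a string. The conversation and the result values are made up for illustration.

```python
import json
from typing import Any


def tool_result_message(result: Any) -> dict[str, str]:
    """Build a role "tool" message; non-string results are serialized to JSON."""
    content = result if isinstance(result, str) else json.dumps(result)
    return {"role": "tool", "content": content}


# Made-up conversation: the assistant asked for a tool, and we reply with its result.
messages = [
    {"role": "user", "content": "What is the weather in Berlin?"},
    {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {"function": {"name": "get_weather",
                          "arguments": {"location": "Berlin", "unit": "celsius"}}}
        ],
    },
]
messages.append(tool_result_message({"temperature": 18, "unit": "celsius"}))
# messages can now be sent again in a new /api/chat request.
```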

Non-Streaming Response

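With stream set to false, the response is a single JSON object with all fields combined. It has the same structure as the final streaming line, with the complete text in message.content.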
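
As a sketch, a non-streaming response could be reduced to the reply text plus a throughput figure. The helper `summarize` is hypothetical, the calculation assumes eval_duration is reported in nanoseconds, and the sample response is hand-written rather than real output.

```python
import json


def summarize(response_text: str) -> dict:
    """Extract the reply and a tokens-per-second estimate from a full response."""
    data = json.loads(response_text)
    # Assumption: eval_duration is in nanoseconds.
    seconds = data.get("eval_duration", 0) / 1e9
    return {
        "content": data["message"]["content"],
        "done_reason": data.get("done_reason"),
        "tokens_per_second": data.get("eval_count", 0) / seconds if seconds else None,
    }


# Hand-written sample in the documented shape, not real server output.
sample = json.dumps({
    "model": "llama3.2",
    "message": {"role": "assistant", "content": "Hello there."},
    "done": True,
    "done_reason": "stop",
    "eval_count": 200,
    "eval_duration": 4_000_000_000,
})
summary = summarize(sample)
```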
