Files
diarization-ui/.env.example
wb 39250e6582 fix(diarization-ui): raise default num_predict to 16384
Thinking tokens count against num_predict. At 4096 the model was
running out mid-response after spending ~3000 tokens on thinking.
16384 gives enough headroom for thinking + full response.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 16:26:55 +02:00

6 lines
155 B
Plaintext

API_BASE=http://gx10.aquantico.lan:8093
OLLAMA_BASE_URL=http://gx10.aquantico.lan:11434
OLLAMA_MODEL=qwen3.5:9b
OLLAMA_NUM_PREDICT=16384
OLLAMA_THINK=true