fix(diarization-ui): prevent repetition loops in Ollama generation

Adds repeat_penalty=1.15 and repeat_last_n=128 to suppress token repetition loops (e.g. "tragen" -> "tragen" -> ...). Also caps output via num_predict (default 4096, configurable via OLLAMA_NUM_PREDICT env var) as a hard stop in case the model still gets stuck.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 16:04:12 +02:00
parent 91b8522916
commit aae53d91b1
2 changed files with 8 additions and 1 deletions
--- a/.env.example
+++ b/.env.example
@@ -1,3 +1,4 @@
 API_BASE=http://gx10.aquantico.lan:8093
 OLLAMA_BASE_URL=http://gx10.aquantico.lan:11434
 OLLAMA_MODEL=qwen3.5:9b
+OLLAMA_NUM_PREDICT=4096
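
For orientation, the settings named in the commit message could be assembled into an Ollama request roughly as sketched below. This is a minimal illustration and not the code changed by this commit: the helper names `build_ollama_options` and `build_generate_payload` are hypothetical, and falling back to 4096 when the variable is unset or not an integer is an assumption. The option keys `repeat_penalty`, `repeat_last_n`, and `num_predict` are standard Ollama generation options.

```python
import os

# Default output cap stated in the commit message.
DEFAULT_NUM_PREDICT = 4096


def build_ollama_options(env=None):
    """Return generation options that discourage repetition loops.

    Hypothetical helper: `env` is any mapping of environment variables and
    defaults to os.environ.
    """
    env = os.environ if env is None else env
    raw = env.get("OLLAMA_NUM_PREDICT", "")
    try:
        num_predict = int(raw)
    except ValueError:
        # Assumption: unset or malformed values fall back to the default.
        num_predict = DEFAULT_NUM_PREDICT
    return {
        "repeat_penalty": 1.15,  # penalize recently generated tokens
        "repeat_last_n": 128,    # how many recent tokens the penalty covers
        "num_predict": num_predict,  # hard stop on generated tokens
    }


def build_generate_payload(model, prompt, env=None):
    """Build a JSON-serializable body for a non-streaming generate request."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": build_ollama_options(env),
    }
```

A caller would typically POST the resulting dictionary as JSON to the generate endpoint under `OLLAMA_BASE_URL`. The repetition penalty lowers the chance of a loop, while `num_predict` bounds the cost if one still occurs.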