Adds repeat_penalty=1.15 and repeat_last_n=128 to suppress token repetition loops (e.g. "tragen" -> "tragen" -> ...).

Also caps output via num_predict (default 4096, configurable via the OLLAMA_NUM_PREDICT env var) as a hard stop in case the model still gets stuck.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
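For reference, the options described in this change could be assembled roughly as follows. This is a minimal sketch, not the code from the commit: the function name `build_options`, the constant `DEFAULT_NUM_PREDICT`, and the fallback to the default on an unparsable value are assumptions made for illustration. Only the option names and values (repeat_penalty, repeat_last_n, num_predict, OLLAMA_NUM_PREDICT) come from the message above.

```python
import os

# Default cap on generated tokens, as stated in the commit message.
DEFAULT_NUM_PREDICT = 4096


def build_options(env=None):
    """Return an options dict for an Ollama request (hypothetical helper)."""
    env = os.environ if env is None else env
    raw = env.get("OLLAMA_NUM_PREDICT", "")
    try:
        num_predict = int(raw)
    except ValueError:
        # Assumption: an unset or invalid value falls back to the default.
        num_predict = DEFAULT_NUM_PREDICT
    return {
        "repeat_penalty": 1.15,      # penalize recently repeated tokens
        "repeat_last_n": 128,        # look-back window for the penalty
        "num_predict": num_predict,  # hard stop on output length
    }
```

The resulting dict would typically be passed as the `options` field of an Ollama generate or chat request; how the actual change wires it in is not shown in the message.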