Cloud models are set to their maximum context length by default. If so, why would I need to set anything?

The setting only matters for models running locally, whose default context length is much smaller than their maximum.
Context length is the maximum number of tokens that the model has access to in memory. The default context length in Ollama is 4096 tokens. Tasks that require a large context, such as web search, agents, and coding tools (like Claude Code or OpenCode), should be set to at least 64,000 tokens; for Claude Code, recommendations as high as 128,000 tokens are common.
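For example, with the official `ollama` Python package, a single call can request a larger context window through the `num_ctx` option. This is a minimal sketch assuming the package is installed (`pip install ollama`), a local server is running, and the model named below is one you have pulled:

```python
# Sketch: request a 64k-token context window for one chat call via the
# official ollama Python client. The model name is illustrative; swap in
# any model available on your machine.
import ollama

response = ollama.chat(
    model="llama3.1",  # illustrative; use a model you have pulled
    messages=[{"role": "user", "content": "Summarize this repository."}],
    options={"num_ctx": 65536},  # 64k tokens, matching the recommendation above
)
print(response["message"]["content"])
```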
Note: Claude Code requires a large context window; we recommend at least 64k tokens. See the context length documentation for how to adjust it in Ollama.
How do I set the context length in Ollama?
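There are a few ways, all of which ultimately set the `num_ctx` parameter: interactively inside `ollama run` with `/set parameter num_ctx 65536`, server-wide with the `OLLAMA_CONTEXT_LENGTH` environment variable (in recent Ollama versions), in a Modelfile with `PARAMETER num_ctx 65536`, or per request through the REST API. As a sketch of the REST route, assuming a default server on `localhost:11434` and an illustrative model name:

```python
# Sketch: set the context length per request via Ollama's REST API.
# Assumes a default local server; the model name is illustrative.
import requests

response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3.1",  # use a model you have pulled
        "messages": [{"role": "user", "content": "Hello!"}],
        "options": {"num_ctx": 65536},  # context length in tokens
        "stream": False,  # return one JSON object instead of a stream
    },
)
print(response.json()["message"]["content"])
```

Keep in mind that a longer context window increases memory use, so pick the smallest value that fits your task.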