From 9fcc2445129fee6ea66f6bb81d5e4c1c3cef4718 Mon Sep 17 00:00:00 2001
From: Eden Chan
Date: Wed, 15 Oct 2025 16:51:02 -0700
Subject: [PATCH 1/2] docs(readme): add Cloud Models usage for Python and
 Cloud API example

- Add local offload flow (signin, pull, run)
- Add direct cloud API usage with auth
- List supported cloud model IDs
- Keep examples minimal; match existing style
---
 README.md | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/README.md b/README.md
index 9995f53b..7bdbd84c 100644
--- a/README.md
+++ b/README.md
@@ -50,6 +50,82 @@ for chunk in stream:
   print(chunk['message']['content'], end='', flush=True)
 ```
 
+## Cloud Models
+
+Run larger models by offloading to Ollama’s cloud while keeping your local workflow.
+
+- Supported models: `deepseek-v3.1:671b-cloud`, `gpt-oss:20b-cloud`, `gpt-oss:120b-cloud`, `kimi-k2:1t-cloud`, `qwen3-coder:480b-cloud`
+
+### Run via local Ollama
+
+1) Sign in (one-time):
+
+```
+ollama signin
+```
+
+2) Pull a cloud model:
+
+```
+ollama pull gpt-oss:120b-cloud
+```
+
+3) Use as usual (offloads automatically):
+
+```python
+from ollama import Client
+
+client = Client()
+
+messages = [
+  {
+    'role': 'user',
+    'content': 'Why is the sky blue?',
+  },
+]
+
+for part in client.chat('gpt-oss:120b-cloud', messages=messages, stream=True):
+  print(part['message']['content'], end='', flush=True)
+```
+
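+The async client streams the same way. A minimal sketch using `AsyncClient` (assumes the same signed-in local Ollama as above; the cloud model still offloads automatically):
+
+```python
+import asyncio
+
+from ollama import AsyncClient
+
+
+async def chat():
+  message = {'role': 'user', 'content': 'Why is the sky blue?'}
+  # Chunks stream back exactly as with the synchronous client.
+  async for part in await AsyncClient().chat('gpt-oss:120b-cloud', messages=[message], stream=True):
+    print(part['message']['content'], end='', flush=True)
+
+
+asyncio.run(chat())
+```
+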
+### Cloud API (ollama.com)
+
+Access cloud models directly by pointing the client at `https://ollama.com`.
+
+1) Create an API key, then set:
+
+```
+export OLLAMA_API_KEY=your_api_key
+```
+
+2) (Optional) List models available via the API:
+
+```
+curl https://ollama.com/api/tags
+```
+
+3) Generate a response via the cloud API:
+
+```python
+import os
+from ollama import Client
+
+client = Client(
+  host='https://ollama.com',
+  headers={'Authorization': 'Bearer ' + os.environ['OLLAMA_API_KEY']}
+)
+
+messages = [
+  {
+    'role': 'user',
+    'content': 'Why is the sky blue?',
+  },
+]
+
+for part in client.chat('gpt-oss:120b', messages=messages, stream=True):
+  print(part['message']['content'], end='', flush=True)
+```
+
 ## Custom client
 
 A custom client can be created by instantiating `Client` or `AsyncClient` from `ollama`.

From 615b3c944e8ad5a16a57d6507e338a904bb15b7b Mon Sep 17 00:00:00 2001
From: Parth Sareen
Date: Thu, 13 Nov 2025 13:41:49 -0800
Subject: [PATCH 2/2] Apply suggestions from code review

---
 README.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 7bdbd84c..482c34b8 100644
--- a/README.md
+++ b/README.md
@@ -54,7 +54,7 @@ Run larger models by offloading to Ollama’s cloud while keeping your local workflow.
 
-- Supported models: `deepseek-v3.1:671b-cloud`, `gpt-oss:20b-cloud`, `gpt-oss:120b-cloud`, `kimi-k2:1t-cloud`, `qwen3-coder:480b-cloud`
+- Supported models: `deepseek-v3.1:671b-cloud`, `gpt-oss:20b-cloud`, `gpt-oss:120b-cloud`, `kimi-k2:1t-cloud`, `qwen3-coder:480b-cloud`, `kimi-k2-thinking`. See [Ollama Models - Cloud](https://ollama.com/search?c=cloud) for more information.
 
 ### Run via local Ollama
@@ -70,7 +70,7 @@ ollama signin
 ollama pull gpt-oss:120b-cloud
 ```
 
-3) Use as usual (offloads automatically):
+3) Make a request:
 
 ```python
 from ollama import Client
@@ -85,14 +85,14 @@ messages = [
 ]
 
 for part in client.chat('gpt-oss:120b-cloud', messages=messages, stream=True):
-  print(part['message']['content'], end='', flush=True)
+  print(part.message.content, end='', flush=True)
 ```
 
 ### Cloud API (ollama.com)
 
 Access cloud models directly by pointing the client at `https://ollama.com`.
 
-1) Create an API key, then set:
+1) Create an API key from [ollama.com](https://ollama.com/settings/keys), then set:
 
 ```
 export OLLAMA_API_KEY=your_api_key
@@ -123,7 +123,7 @@ messages = [
 ]
 
 for part in client.chat('gpt-oss:120b', messages=messages, stream=True):
-  print(part['message']['content'], end='', flush=True)
+  print(part.message.content, end='', flush=True)
 ```
 
 ## Custom client
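
Note: the examples above all stream the response. Without `stream=True`, `chat()` returns a single `ChatResponse`; the following is a minimal sketch of a non-streaming call against the cloud API, assuming `OLLAMA_API_KEY` is exported as in the section above:

```python
import os

from ollama import Client

# Point the client at ollama.com and authenticate with the API key;
# os.environ[...] fails fast with a KeyError if the key is unset.
client = Client(
  host='https://ollama.com',
  headers={'Authorization': 'Bearer ' + os.environ['OLLAMA_API_KEY']},
)

# Without stream=True, chat() returns one complete response object.
response = client.chat('gpt-oss:120b', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}])
print(response.message.content)
```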