
Commit 907fc8d

avoid to run on GPU, update system message
1 parent 08ab957 commit 907fc8d

File tree

4 files changed (+15, -4 lines changed)


notebooks/llm-chatbot/README.md

Lines changed: 1 addition & 0 deletions
@@ -82,6 +82,7 @@ For more details, please refer to [model_card](https://huggingface.co/Qwen/Qwen2
 * **Qwen3-1.7/4B/8B/14B** - Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Building upon extensive advancements in training data, model architecture, and optimization techniques, Qwen3 delivers the following key improvements over the previously released Qwen2.5. You can find more info in [model card](https://huggingface.co/Qwen/Qwen3-8B).
 * **AFM-4.5B** - AFM-4.5B is a 4.5 billion parameter instruction-tuned model developed by Arcee.ai, designed for enterprise-grade performance across diverse deployment environments from cloud to edge. The base model was trained on a dataset of 8 trillion tokens, comprising 6.5 trillion tokens of general pretraining data followed by 1.5 trillion tokens of midtraining data with enhanced focus on mathematical reasoning and code generation. Following pretraining, the model underwent supervised fine-tuning on high-quality instruction datasets. The instruction-tuned model was further refined through reinforcement learning on verifiable rewards as well as for human preference. You can find more info in [model card](https://huggingface.co/arcee-ai/AFM-4.5B).
 * **gpt-oss-20b** - gpt-oss-20b is a 20 billion parameter open-weight model designed for powerful reasoning, agentic tasks, and versatile developer use cases. You can find more info in [model card](https://huggingface.co/openai/gpt-oss-20b).
+>**Note**: gpt-oss-20b model is not supported with OpenVINO GPU plugin.
 
 
 The image below illustrates the provided user instruction and model answer examples.

notebooks/llm-chatbot/llm-chatbot-generate-api.ipynb

Lines changed: 1 addition & 0 deletions
@@ -447,6 +447,7 @@
 " * dataset: **wikitext2**\n",
 "* **AFM-4.5B** - AFM-4.5B is a 4.5 billion parameter instruction-tuned model developed by Arcee.ai, designed for enterprise-grade performance across diverse deployment environments from cloud to edge. The base model was trained on a dataset of 8 trillion tokens, comprising 6.5 trillion tokens of general pretraining data followed by 1.5 trillion tokens of midtraining data with enhanced focus on mathematical reasoning and code generation. Following pretraining, the model underwent supervised fine-tuning on high-quality instruction datasets. The instruction-tuned model was further refined through reinforcement learning on verifiable rewards as well as for human preference. You can find more info in [model card](https://huggingface.co/arcee-ai/AFM-4.5B).\n",
 "* **gpt-oss-20b** - gpt-oss-20b is a 20 billion parameter open-weight model designed for powerful reasoning, agentic tasks, and versatile developer use cases. You can find more info in [model card](https://huggingface.co/openai/gpt-oss-20b).\n",
+">**Note**: gpt-oss-20b model is not supported with OpenVINO GPU plugin.\n",
 "</details>"
 ]
 },

notebooks/llm-chatbot/llm-chatbot.ipynb

Lines changed: 1 addition & 0 deletions
@@ -342,6 +342,7 @@
 " * dataset: **wikitext2**\n",
 "* **AFM-4.5B** - AFM-4.5B is a 4.5 billion parameter instruction-tuned model developed by Arcee.ai, designed for enterprise-grade performance across diverse deployment environments from cloud to edge. The base model was trained on a dataset of 8 trillion tokens, comprising 6.5 trillion tokens of general pretraining data followed by 1.5 trillion tokens of midtraining data with enhanced focus on mathematical reasoning and code generation. Following pretraining, the model underwent supervised fine-tuning on high-quality instruction datasets. The instruction-tuned model was further refined through reinforcement learning on verifiable rewards as well as for human preference. You can find more info in [model card](https://huggingface.co/arcee-ai/AFM-4.5B).\n",
 "* **gpt-oss-20b** - gpt-oss-20b is a 20 billion parameter open-weight model designed for powerful reasoning, agentic tasks, and versatile developer use cases. You can find more info in [model card](https://huggingface.co/openai/gpt-oss-20b).\n",
+">**Note**: gpt-oss-20b model is not supported with OpenVINO GPU plugin.\n",
 " </details>\n"
 ]
 },

utils/llm_config.py

Lines changed: 12 additions & 4 deletions
@@ -499,7 +499,12 @@ def qwen_completion_to_prompt(completion):
         "remote_code": False,
         "start_message": DEFAULT_SYSTEM_PROMPT,
     },
-    "gpt-oss-20b": {"model_id": "openai/gpt-oss-20b", "remote_code": False, "start_message": DEFAULT_SYSTEM_PROMPT},
+    "gpt-oss-20b": {
+        "model_id": "openai/gpt-oss-20b",
+        "remote_code": False,
+        "start_message": DEFAULT_SYSTEM_PROMPT + " You should not show your reasoning steps. Reasoning: low.",
+        "exclude_on_devices": ["AUTO"],
+    },
 },
 "Chinese": {
     "minicpm4-8b": {"model_id": "openbmb/MiniCPM4-8B", "remote_code": True, "start_message": DEFAULT_SYSTEM_PROMPT_CHINESE},
@@ -858,15 +863,18 @@ def get_llm_selection_widget(languages=list(SUPPORTED_LLM_MODELS), models=SUPPOR
 
     lang_dropdown = widgets.Dropdown(options=languages or [])
 
-    # Define dependent drop down
+    filter_models_by_device = lambda model_info: device not in model_info[1].get("exclude_on_devices", [])
 
-    model_dropdown = widgets.Dropdown(options=models)
+    # Define dependent drop down
+    supported_models = dict(filter(filter_models_by_device, models.items()))
+    model_dropdown = widgets.Dropdown(options=supported_models)
 
     def dropdown_handler(change):
         global default_language
         default_language = change.new
         # If statement checking on dropdown value and changing options of the dependent dropdown accordingly
-        model_dropdown.options = SUPPORTED_LLM_MODELS[change.new]
+        supported_models = SUPPORTED_LLM_MODELS[change.new]
+        model_dropdown.options = dict(filter(filter_models_by_device, supported_models.items()))
 
     lang_dropdown.observe(dropdown_handler, names="value")
     compression_dropdown = widgets.Dropdown(options=SUPPORTED_OPTIMIZATIONS if device != "NPU" else ["INT4-NPU", "FP16"])
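The filter is a one-liner over `dict.items()`: each `model_info` is a `(name, config)` pair, so `model_info[1]` is the config dict, and models without an `exclude_on_devices` key are always kept. A self-contained demonstration of the pattern, using made-up model entries:

```python
# Demo of the device filter added in get_llm_selection_widget;
# the model entries below are made up for illustration.
models = {
    "qwen3-8b": {"model_id": "Qwen/Qwen3-8B"},
    "gpt-oss-20b": {"model_id": "openai/gpt-oss-20b", "exclude_on_devices": ["AUTO"]},
}

device = "AUTO"
filter_models_by_device = lambda model_info: device not in model_info[1].get("exclude_on_devices", [])

print(list(dict(filter(filter_models_by_device, models.items()))))
# ['qwen3-8b'] -- gpt-oss-20b is hidden when the target device is AUTO

device = "CPU"
print(list(dict(filter(filter_models_by_device, models.items()))))
# ['qwen3-8b', 'gpt-oss-20b'] -- everything is offered on CPU
```

Because the same filter is re-applied inside `dropdown_handler`, switching the language dropdown re-filters the newly selected language's model list for the current device as well.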

0 commit comments
