feat: add tool calling support to m serve#850
feat: add tool calling support to m serve#850markstur wants to merge 1 commit intogenerative-computing:mainfrom
Conversation
Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
|
The PR description has been updated. Please fill out the template for your PR to be reviewed. |
|
@markstur Do you want review comments yet or still WIP? |
Comments would be great! It is draft because I need to do more review/test myself on the generated code. I don't want to waste your time but comments early would be very welcome. |
psschwei
left a comment
There was a problem hiding this comment.
Code Review: feat: add tool calling support to m serve
Good feature PR — the core plumbing is correct and the OpenAI-compatible response format looks right. A couple of bugs to fix before merge, plus some improvements.
Summary
The implementation correctly wires tool calling through the serve endpoint: tools maps to ModelOption.TOOLS, tool_choice passes through as-is, and the response extracts tool calls from ModelOutputThunk into the OpenAI format. The Pydantic models mirror the OpenAI types well, and tests cover the main paths.
Two bugs need fixing (see inline comments):
- Empty
tool_callsdict produces incorrectfinish_reason: "tool_calls"with an empty array - Client example's multi-turn loop duplicates the assistant message for each tool call
Other improvements (see inline comments):
- Unused loop variable
tool_name eval()in example code with# noqasuppressing the security lint for copy-pasters- Missing test for the empty dict edge case
hasattrcheck is always true forModelOutputThunk— defensive but masks upstream bugs
What's working well
- Pydantic models (
ToolCallFunction,ChatCompletionMessageToolCall) closely match OpenAI types _build_model_optionschange is clean —toolsremoved from exclusion set, mapped toModelOption.TOOLS- 8 well-structured tests covering single/multiple tool calls, finish reasons, model_options passthrough, complex args, usage info, and backward compat
- Existing test updated consistently from "excluded" to "passed"
| tool_calls = None | ||
| finish_reason: Literal[ | ||
| "stop", "length", "content_filter", "tool_calls", "function_call" | ||
| ] = "stop" | ||
| if ( | ||
| hasattr(output, "tool_calls") | ||
| and output.tool_calls is not None | ||
| and isinstance(output.tool_calls, dict) | ||
| ): | ||
| tool_calls = [] | ||
| for tool_name, model_tool_call in output.tool_calls.items(): | ||
| # Generate a unique ID for this tool call | ||
| tool_call_id = f"call_{uuid.uuid4().hex[:24]}" | ||
|
|
||
| # Serialize the arguments to JSON string | ||
| args_json = json.dumps(model_tool_call.args) | ||
|
|
||
| tool_calls.append( | ||
| ChatCompletionMessageToolCall( | ||
| id=tool_call_id, | ||
| type="function", | ||
| function=ToolCallFunction( | ||
| name=model_tool_call.name, arguments=args_json | ||
| ), | ||
| ) | ||
| ) | ||
| finish_reason = "tool_calls" |
There was a problem hiding this comment.
Empty tool_calls dict produces wrong finish_reason
When output.tool_calls is {}, the code passes the isinstance(dict) check, creates an empty tool_calls = [], and sets finish_reason = "tool_calls". Per the OpenAI API, this should be "stop" with no tool calls. Fix: check if tool_calls: after the loop before setting the finish reason.
| and isinstance(output.tool_calls, dict) | ||
| ): | ||
| tool_calls = [] | ||
| for tool_name, model_tool_call in output.tool_calls.items(): |
There was a problem hiding this comment.
tool_name is never used. Use .values() instead.
There was a problem hiding this comment.
Could we add a test for output.tool_calls = {}, which is the trigger for the bug noted earlier
| tool_result = "Tool result" | ||
|
|
||
| # Add tool response to conversation | ||
| messages.append( |
There was a problem hiding this comment.
nit: if someone was using N tools, this would append the assistant message N times. could move it out of the tool call loop to prevent that (though it's an example, so not a major problem)
Misc PR
Type of PR
Description
Successfully added tool calling support to
m serveCLI with proper type annotations. Here's what was implemented:Changes Made
1. Updated Models (
cli/serve/models.py)ToolCallFunctionmodel for function details in tool callsChatCompletionMessageToolCallmodel for tool call structureChatCompletionMessageto include optionaltool_callsfieldChoice.finish_reasonto support"tool_calls"value2. Modified Server Logic (
cli/serve/app.py)jsonandLiteralimports for proper typing_build_model_options()to pass throughtools(mapped toModelOption.TOOLS) andtool_choiceparametersmake_chat_endpoint()to:ModelOutputThunk.tool_callswith proper type checking (isinstance(dict))call_<24-char-hex>finish_reasonwith properLiteraltype annotation3. Comprehensive Tests (
test/cli/test_serve_tool_calling.py)4. Updated Existing Test (
test/cli/test_serve.py)test_tool_params_excluded_from_model_optionstotest_tool_params_passed_to_model_options5. Example Code
docs/examples/m_serve/m_serve_example_tool_calling.py: Complete server example withGetWeatherToolandCalculatorToolimplementationsdocs/examples/m_serve/client_tool_calling.py: Client demonstrating how to call the tool-enabled server with various scenariosKey Features
✅ OpenAI-Compatible: Follows OpenAI's tool calling API format
✅ Type-Safe: Proper
Literaltype annotations forfinish_reason✅ Robust Type Checking: Uses
isinstance(dict)to avoid Mock object issues✅ Automatic Tool Call Detection: Extracts tool calls from
ModelOutputThunk✅ Proper Finish Reasons: Returns
"tool_calls"when tools are invoked,"stop"otherwise✅ Unique Tool Call IDs: Generates unique IDs in format
call_<24-char-hex>✅ JSON Serialization: Properly serializes tool arguments to JSON strings
✅ Backward Compatible: Works with existing code that doesn't use tools
✅ Fully Tested: All 43 serve tests pass, including 8 new tool-specific tests
✅ Type Checked: Passes mypy type checking
Usage
Start server with tool support:
Call with tools from client:
The implementation properly handles tool calls from Mellea's
ModelOutputThunkand formats them according to OpenAI's API specification with full type safety.Testing