Tool calling

Connect the model to external functions, APIs, and data sources. OpenAI-compatible tool format with up to 128 tools per request.

How tool calling works

You define tools (functions) in your request. The model decides when to call them, generates the arguments, and returns a tool_calls response. Your code executes the function, returns the result, and the model incorporates it into its final answer.

1. You send a message with tool definitions. The model sees the function names, descriptions, and parameter schemas.

2. The model decides to call a tool. It returns finish_reason: "tool_calls" with the function name and JSON arguments.

3. Your code executes the function. You run the function with the provided arguments and get a result.

4. You return the result to the model. Send the tool result back as a message with role: "tool".

5. The model generates its final response. It incorporates the tool result into a natural language answer.

The model never executes functions itself. It only generates the call specification. Your code handles execution, which means you control security, validation, and error handling.
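Concretely, a tool-call turn in an OpenAI-compatible response body has roughly the shape below. The field values here are illustrative, not real output; note that arguments arrives as a JSON string, not a parsed object.

```python
import json

# Illustrative shape of a tool-call response (values are made up).
response_body = {
    "choices": [
        {
            "finish_reason": "tool_calls",
            "message": {
                "role": "assistant",
                "content": None,  # no text yet; the model is asking for a tool
                "tool_calls": [
                    {
                        "id": "call_abc123",  # echo this back as tool_call_id
                        "type": "function",
                        "function": {
                            "name": "get_stock_price",
                            # Arguments are a JSON-encoded string:
                            "arguments": "{\"ticker\": \"CBA.AX\"}",
                        },
                    }
                ],
            },
        }
    ]
}

call = response_body["choices"][0]["message"]["tool_calls"][0]
args = json.loads(call["function"]["arguments"])
print(args["ticker"])  # CBA.AX
```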

Defining tools

Tools are defined as an array of objects in the request body. Each tool has a type (always "function"), a name, a description, and a parameters schema in JSON Schema format.

tool_definition.py
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "description": "Get the current price for an ASX-listed stock",
            "parameters": {
                "type": "object",
                "properties": {
                    "ticker": {
                        "type": "string",
                        "description": "ASX ticker symbol, e.g. CBA.AX"
                    }
                },
                "required": ["ticker"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[
        {"role": "user", "content": "What is the CBA share price?"}
    ],
    tools=tools
)

Handling the response

When the model decides to call a tool, the response contains tool_calls instead of content. Execute the function, then return the result.

handle_tool_call.py
import json

# Step 1: Check if the model wants to call a tool
message = response.choices[0].message

if message.tool_calls:
    # Add the assistant message once, before any tool results
    messages.append(message)

    # Step 2: Execute each tool call
    for tool_call in message.tool_calls:
        if tool_call.function.name == "get_stock_price":
            args = json.loads(tool_call.function.arguments)
            result = get_stock_price(args["ticker"])  # Your function

            # Step 3: Add the tool result
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": str(result)
            })

    # Step 4: Send back for the final response
    final_response = client.chat.completions.create(
        model="deepseek-v4-flash",
        messages=messages,
        tools=tools
    )
    print(final_response.choices[0].message.content)

Strict mode

Set strict: true on a tool definition to guarantee that the model's output complies with your JSON Schema. When strict mode is enabled:

  • All properties must be listed in required
  • additionalProperties must be set to false on every object
  • The output is guaranteed to match your schema exactly
  • Use $defs and $ref for reusable schema modules and recursive structures
strict_mode.py
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "strict": True,  # Guarantees schema compliance
            "description": "Get the current price for an ASX-listed stock",
            "parameters": {
                "type": "object",
                "properties": {
                    "ticker": {
                        "type": "string",
                        "description": "ASX ticker symbol"
                    }
                },
                "required": ["ticker"],
                "additionalProperties": False  # Required for strict mode
            }
        }
    }
]
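For schema reuse, a strict-mode definition can factor shared fragments into $defs and reference them with $ref. The tool below (compare_stock_prices) is a hypothetical example, not part of the API; it reuses one ticker schema for two properties.

```python
# Hypothetical tool reusing a schema fragment via $defs/$ref.
# The tool name and fields are illustrative only.
tools = [
    {
        "type": "function",
        "function": {
            "name": "compare_stock_prices",
            "strict": True,
            "description": "Compare current prices of two ASX-listed stocks",
            "parameters": {
                "type": "object",
                "$defs": {
                    "ticker": {
                        "type": "string",
                        "description": "ASX ticker symbol, e.g. CBA.AX"
                    }
                },
                "properties": {
                    "first": {"$ref": "#/$defs/ticker"},
                    "second": {"$ref": "#/$defs/ticker"}
                },
                "required": ["first", "second"],  # strict mode: list every property
                "additionalProperties": False     # strict mode: no extra keys
            }
        }
    }
]
```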

Parallel calls and limits

The model can call multiple tools in a single response. When this happens, tool_calls contains multiple entries. Execute each one and return all results before sending the next request.
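A minimal sketch of that loop, with the SDK response objects faked via SimpleNamespace so it runs standalone. The key point is one role: "tool" message per tool_call.id, all appended before the follow-up request; the stubbed prices are made up.

```python
import json
from types import SimpleNamespace

def get_stock_price(ticker):
    # Stubbed lookup so the sketch is self-contained; a real
    # implementation would query a market-data source.
    return {"CBA.AX": 112.50, "BHP.AX": 44.10}[ticker]

# A fake assistant message with two parallel tool calls, mimicking the
# SDK's response shape (attribute access, JSON-string arguments).
message = SimpleNamespace(tool_calls=[
    SimpleNamespace(id="call_1", function=SimpleNamespace(
        name="get_stock_price", arguments='{"ticker": "CBA.AX"}')),
    SimpleNamespace(id="call_2", function=SimpleNamespace(
        name="get_stock_price", arguments='{"ticker": "BHP.AX"}')),
])

messages = []
if message.tool_calls:
    messages.append(message)  # the assistant turn that requested the calls
    for tool_call in message.tool_calls:  # may contain several entries
        args = json.loads(tool_call.function.arguments)
        result = get_stock_price(args["ticker"])
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,  # one result message per call id
            "content": str(result),
        })

# Every result is now attached; the next create() call can produce
# the final answer.
print(len(messages))  # 3: one assistant turn + two tool results
```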

Limits

Max tools per request    128
Parallel calls           Supported
Strict mode              Supported (beta)
Schema reuse ($ref)      Supported

Tool calling with thinking mode

Thinking mode supports tool calls. The model can reason through a problem, decide to call tools, process the results, and continue reasoning before generating a final answer.

Important: When using tool calls within thinking mode, you must pass reasoning_content back to the API in all subsequent requests for turns that involved tool calls. Failure to do so returns a 400 error. For turns without tool calls, reasoning_content can be omitted.

See the thinking modes guide for full details on reasoning_content handling in multi-turn conversations.
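One way to keep that rule straight is a small helper that builds the next-request message from an assistant turn, echoing reasoning_content only when the turn issued tool calls. The helper name and the fake turn objects below are illustrative; the exact message shape depends on your SDK version.

```python
from types import SimpleNamespace

def assistant_turn_to_message(message):
    """Convert an SDK assistant turn into a dict for the next request.

    In thinking mode, turns that issued tool calls must echo
    reasoning_content back (omitting it returns a 400, per this guide).
    Turns without tool calls may omit it.
    """
    out = {"role": "assistant", "content": message.content}
    if getattr(message, "tool_calls", None):
        out["tool_calls"] = message.tool_calls
        out["reasoning_content"] = message.reasoning_content  # mandatory here
    return out

# Fake turns mimicking SDK response objects (attribute access).
tool_turn = SimpleNamespace(
    content=None,
    tool_calls=[{"id": "call_1"}],
    reasoning_content="Need the live price before answering.",
)
plain_turn = SimpleNamespace(content="CBA closed at $112.50.", tool_calls=None)

print("reasoning_content" in assistant_turn_to_message(tool_turn))   # True
print("reasoning_content" in assistant_turn_to_message(plain_turn))  # False
```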

Ready to connect your tools?

Start with the quickstart guide, then add tool definitions to your requests.