Thursday, April 24, 2025

Tool Calling in LLM Applications Explained

Tool calling is one of the most important features in modern LLM applications. It enables Large Language Models to interact with external systems and perform actions beyond mere text generation. Of course, we could use Anthropic's Model Context Protocol (MCP), but for the sake of simplicity and brevity, the example code below does not use MCP.


What is Tool Calling?

Tool calling (also known as function calling) allows an LLM to request that specific external functions be executed with well-defined parameters. Instead of only generating text, the model emits a structured request that your application can act on.
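In simplified form, such a structured request might look like this (the exact format varies between APIs and is shown in detail later in this post):

{
  "name": "get_weather",
  "arguments": {"location": "Berlin, Germany"}
}

The application runs the named function and hands the result back to the model, which then formulates the final answer for the user.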


Advantages of Tool Calling

- Enhanced capabilities: LLMs can access current data

- More precise answers: Through access to specialized tools

- Automation: Enables execution of actions based on user requests


A Simple Example with Ollama

Here's a complete, easy-to-install example for tool calling with Ollama and Python:


import json

import requests

from datetime import datetime


# Function to get weather data

def get_weather(location):

    # In a real application, you would call a weather API here

    # This is a simplified simulation

    weather_data = {

        "location": location,

        "temperature": 22,

        "condition": "sunny",

        "humidity": 45,

        "timestamp": datetime.now().strftime("%Y-%m-%d %H:%M:%S")

    }

    return weather_data


# Function to call Ollama

def call_ollama(messages, tools=None):

    url = "http://localhost:11434/api/chat"

    

    payload = {

        "model": "llama3",  # or any other model you have in Ollama

        "messages": messages,

        "stream": False

    }

    

    if tools:

        payload["tools"] = tools

    

    response = requests.post(url, json=payload)

    return response.json()


# Main function to process user input

def process_user_input(user_input):

    # Define the tool

    tools = [

        {

            "type": "function",

            "function": {

                "name": "get_weather",

                "description": "Get current weather data for a specific location",

                "parameters": {

                    "type": "object",

                    "properties": {

                        "location": {

                            "type": "string",

                            "description": "The location to get weather data for, e.g. 'Berlin, Germany'"

                        }

                    },

                    "required": ["location"]

                }

            }

        }

    ]

    

    # First call to Ollama

    messages = [{"role": "user", "content": user_input}]

    response = call_ollama(messages, tools)

    

    # Check if a tool call was requested

    if "tool_calls" in response["message"]:

        tool_calls = response["message"]["tool_calls"]

        

        for tool_call in tool_calls:

            if tool_call["function"]["name"] == "get_weather":

                # Extract parameters and call function
                # (depending on the model/API version, arguments may be a JSON string or already a dict)

                raw_args = tool_call["function"]["arguments"]

                function_args = json.loads(raw_args) if isinstance(raw_args, str) else raw_args

                location = function_args.get("location")

                weather_result = get_weather(location)

                

                # Send result back to Ollama

                messages.append(response["message"])

                messages.append({

                    "role": "tool",

                    "tool_call_id": tool_call["id"],

                    "name": "get_weather",

                    "content": json.dumps(weather_result)

                })

                

                second_response = call_ollama(messages)

                return second_response["message"]["content"]

        

    # If no tool call, return normal response

    return response["message"]["content"]


# Example usage

if __name__ == "__main__":

    user_question = "What's the weather like in New York today?"

    response = process_user_input(user_question)

    print(response)


Installation and Setup

1. First, install Ollama:

   # For macOS: download from https://ollama.com/download or install via Homebrew

   brew install ollama

   

   # For Linux

   curl -fsSL https://ollama.com/install.sh | sh

   

   # For Windows, download from https://ollama.com/download

   


2. Pull a model that supports tool calling:

   ollama pull llama3.1


3. Install the required Python package:

   pip install requests


4. Save the code to a file (e.g., `ollama_tool_calling.py`)


5. Run the script:

   python ollama_tool_calling.py
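Optionally, before running the script, you can verify that the Ollama server is reachable (it listens on port 11434 by default, as noted below). A minimal check, assuming a default local installation:

import requests

try:
    # /api/tags lists the models available locally; any successful response means the server is up
    r = requests.get("http://localhost:11434/api/tags", timeout=5)
    r.raise_for_status()
    models = [m["name"] for m in r.json().get("models", [])]
    print("Ollama is running. Local models:", models)
except requests.RequestException as exc:
    print("Ollama does not appear to be running:", exc)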


How It Works

1. We define a tool (get_weather) with its parameters

2. The user asks a question about the weather

3. The LLM recognizes it should call the weather tool

4. We call the weather function with the parameters specified by the LLM

5. The result is sent back to the LLM

6. The LLM formulates a user-friendly response


Important Notes

- Ollama must be running on your local machine (it runs on port 11434 by default)

- Not all models in Ollama support tool calling; Llama 3.1 is one that does

- The example uses a simulated weather function; in a real application, you would integrate with an actual weather API, as sketched below
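For reference, here is a minimal sketch of a real `get_weather` using OpenWeatherMap's current-weather endpoint. The API key is assumed to be in an `OPENWEATHER_API_KEY` environment variable (a placeholder name, not part of the original example):

import os
import requests
from datetime import datetime

def get_weather(location):
    # Query OpenWeatherMap's current-weather endpoint (requires a free API key)
    api_key = os.environ["OPENWEATHER_API_KEY"]
    url = "https://api.openweathermap.org/data/2.5/weather"
    params = {"q": location, "appid": api_key, "units": "metric"}
    data = requests.get(url, params=params, timeout=10).json()

    # Keep the same return shape as the simulated function above
    return {
        "location": location,
        "temperature": data["main"]["temp"],             # Celsius, because of units=metric
        "condition": data["weather"][0]["description"],
        "humidity": data["main"]["humidity"],
        "timestamp": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
    }

Because the return value has the same shape as the simulated function, the rest of the example works unchanged.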


Extension Possibilities

This basic framework can be easily extended:

- Integration with real weather APIs like OpenWeatherMap (see the sketch above)

- Adding more tools for calendar, notes, calculations, etc. (one way to organize this is sketched after this list)

- Implementation of a user interface

- Adding persistent conversation history
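As a starting point for adding more tools, one possible structure (not part of the original example) is a registry that maps tool names to the Python functions implementing them, so the dispatch loop stays generic. The `get_current_time` tool below is purely illustrative, and the sketch assumes the `get_weather` function from the example above is in scope:

import json
from datetime import datetime

def get_current_time():
    # Illustrative second tool with no parameters
    return {"now": datetime.now().strftime("%Y-%m-%d %H:%M:%S")}

# Map tool names (as declared in the tools schema) to their implementations
TOOL_REGISTRY = {
    "get_weather": get_weather,        # defined earlier in this post
    "get_current_time": get_current_time,
}

def run_tool_calls(tool_calls, messages):
    # Generic dispatch: execute each requested tool and append its result to the conversation
    for tool_call in tool_calls:
        name = tool_call["function"]["name"]
        raw_args = tool_call["function"].get("arguments", {})
        args = json.loads(raw_args) if isinstance(raw_args, str) else raw_args
        result = TOOL_REGISTRY[name](**args)
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.get("id", ""),
            "name": name,
            "content": json.dumps(result),
        })
    return messages

Each new tool then only needs an entry in `TOOL_REGISTRY` plus a matching schema in the `tools` list sent to Ollama.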



How Tool Calling Works in the Ollama Example

In the example code, the decision to call a tool is made by the LLM itself, not explicitly by our application code. Let's break down how this works:


How the LLM Decides to Call a Tool


The LLM decides to call a tool based on:


1. The user's prompt

2. The available tools we've defined

3. The LLM's understanding of when to use those tools


When we send a request to the LLM, we include:

- The user's message

- A list of available tools with their descriptions and parameters


The LLM then determines if any of these tools would be helpful in responding to the user's query.


Example Prompt and Processing Flow

Let's use this example prompt:

"What's the current weather in Seattle?"


Here's how this is processed step by step:


1. Initial Request to Ollama


# User asks about weather

user_input = "What's the current weather in Seattle?"


# We prepare the messages array with the user's question

messages = [{"role": "user", "content": user_input}]


# We define the available tools

tools = [

    {

        "type": "function",

        "function": {

            "name": "get_weather",

            "description": "Get current weather data for a specific location",

            "parameters": {

                "type": "object",

                "properties": {

                    "location": {

                        "type": "string",

                        "description": "The location to get weather data for, e.g. 'Berlin, Germany'"

                    }

                },

                "required": ["location"]

            }

        }

    }

]


# We send this to Ollama

response = call_ollama(messages, tools)


2. LLM Analysis


When Ollama receives this request:

- It analyzes the user's question "What's the current weather in Seattle?"

- It sees we've provided a `get_weather` tool that can retrieve weather data

- It recognizes this is a perfect match - the user wants weather information, and we have a tool for that


3. Tool Call Response


The LLM decides to use the tool and returns a response like this:


{

  "message": {

    "role": "assistant",

    "content": "",

    "tool_calls": [

      {

        "id": "call_abc123",

        "type": "function",

        "function": {

          "name": "get_weather",

          "arguments": "{\"location\":\"Seattle\"}"

        }

      }

    ]

  }

}


Note that:

- The `content` field is empty because the LLM is requesting tool data before responding

- The `tool_calls` array contains the function to call with parameters


4. Application Processing


Our code detects the tool call:


if "tool_calls" in response["message"]:

    tool_calls = response["message"]["tool_calls"]

    

    for tool_call in tool_calls:

        if tool_call["function"]["name"] == "get_weather":

            # Extract parameters (arguments may be a JSON string or already a dict)

            raw_args = tool_call["function"]["arguments"]

            function_args = json.loads(raw_args) if isinstance(raw_args, str) else raw_args

            location = function_args.get("location")  # This will be "Seattle"

            

            # Call our function

            weather_result = get_weather(location)


5. Tool Execution and Second Request


We execute the tool and send the results back:


# Add the assistant's response to the conversation

messages.append(response["message"])


# Add the tool response

messages.append({

    "role": "tool",

    "tool_call_id": tool_call["id"],  # "call_abc123"

    "name": "get_weather",

    "content": json.dumps(weather_result)

})


# Get the final response

second_response = call_ollama(messages)



The `weather_result` might look like:

{

  "location": "Seattle",

  "temperature": 22,

  "condition": "sunny",

  "humidity": 45,

  "timestamp": "2025-04-24 14:11:23"

}



6. Final Response


The LLM now has the weather data and can generate a proper response:


{

  "message": {

    "role": "assistant",

    "content": "The current weather in Seattle is sunny with a temperature of 22°C and humidity at 45%. This information was retrieved at 2025-04-24 14:11:23."

  }

}


Key Points About This Process

1. Automatic Detection: The LLM itself decides when to use a tool based on the user's query and available tools.


2. Semantic Understanding: The LLM understands that "What's the current weather in Seattle?" requires weather information, and matches this to the `get_weather` tool.


3. Parameter Extraction: The LLM extracts "Seattle" as the location parameter without explicit parsing rules in our code.


4. Two-Step Process: Tool calling is a two-step process:

   - First request: LLM decides to use a tool

   - Second request: LLM formulates a final response using the tool's output


5. No Explicit Rules: We don't write rules like "if the user asks about weather, call the weather function" - the LLM handles this intelligence.


Other Examples That Would Trigger Tool Calls

- "How hot is it in Tokyo right now?"

- "Tell me about the weather conditions in Paris"

- "Should I bring an umbrella in Chicago today?"

- "What's the temperature like in Miami?"


All of these would be recognized by the LLM as weather-related queries that could be answered using the `get_weather` tool.


Examples That Would NOT Trigger Tool Calls

- "What's the capital of France?" (Not weather-related)

- "How do I bake a chocolate cake?" (Not weather-related)

- "Tell me a joke about weather" (About weather but not requesting current conditions)


The LLM would answer these directly without calling the tool.


Conclusion

Tool calling is a powerful feature that transforms LLMs from mere text generators into interactive assistants. With the above example, you can quickly get started with tool calling using a local LLM through Ollama, avoiding the need for API keys or cloud services while maintaining privacy and reducing costs.
