Chapter One: Introduction to the Model Context Protocol
What is the Model Context Protocol?
I have already written about the Model Context Protocol (MCP) in previous posts. I have now rewritten my tutorial to cover the new version of MCP, published on November 25, 2025, which includes a bunch of new features.
The Model Context Protocol, commonly referred to as MCP, is an open standard protocol that enables secure and structured communication between Large Language Model applications and external data sources, tools, and services. Think of MCP as a universal translator that allows AI systems to safely interact with the world beyond their training data.
In practical terms, MCP solves a fundamental challenge in AI development. While Large Language Models are incredibly powerful at understanding and generating text, they are inherently limited to the knowledge they were trained on. They cannot access real-time information, query databases, read files from your computer, or interact with external APIs unless there is a standardized way to connect them to these resources. This is exactly what MCP provides.
The protocol was developed by Anthropic and released publicly in late 2024. It has since gained significant traction in the AI development community because it addresses a critical need for standardization. Before MCP, every AI application had to implement its own custom methods for connecting to external tools and data sources. This led to fragmentation, security concerns, and significant duplication of effort across the industry.
MCP draws inspiration from the Language Server Protocol, which successfully standardized how programming tools communicate with code editors. Just as LSP allows any editor to work with any programming language through a common interface, MCP allows any AI application to work with any external tool or data source through a standardized protocol.
Understanding MCP Version 2025-11-25
This is the latest stable release, published on the one-year anniversary of MCP going public. The version number uses a date-based format in the pattern YYYY-MM-DD, which indicates the last date when backward-incompatible changes were made to the specification. This versioning approach makes it immediately clear which version you are working with and when it was finalized.
The 2025-11-25 specification introduces several powerful new capabilities that significantly enhance the protocol's functionality. First, it includes OpenID Connect Discovery support, which provides enhanced security mechanisms for authentication and authorization. This makes it easier to integrate MCP servers with enterprise identity management systems.
Second, the specification adds icons metadata for tools, resources, and prompts. This seemingly small feature has important implications for user interface design, allowing client applications to display visually rich representations of available capabilities. When you see a list of tools in Claude Desktop or another MCP host, these icons help you quickly identify what each tool does.
Third, the specification introduces incremental scope consent, which provides fine-grained control over permissions. Instead of granting blanket access to all capabilities, users can now approve specific permissions as they are needed. This follows the principle of least privilege and enhances security in enterprise deployments.
Fourth, and perhaps most exciting, is the addition of tool calling support during sampling. Sampling enables bidirectional communication in which MCP servers can request LLM completions from the host. In other words, your server-side tools can now ask the AI to help with tasks, creating powerful recursive workflows where tools and AI collaborate on complex problems.
Finally, the specification includes experimental Tasks support for handling asynchronous and long-running workflows. This is crucial for real-world applications where some operations might take seconds, minutes, or even hours to complete. Instead of blocking and waiting, the client can receive a task handle, check its status periodically, and retrieve results when ready.
These features collectively make MCP an even more robust protocol for building production-grade AI integrations that can handle the complexity and security requirements of enterprise environments.
The Architecture of MCP
To understand how MCP works, we need to examine its architecture, which consists of three primary components that work together to enable AI applications to interact with external capabilities.
The first component is the MCP Host. The host is the environment where your Large Language Model runs and where users interact with the AI. Examples of MCP hosts include Claude Desktop, Visual Studio Code with AI extensions, the Zed editor, and Sourcegraph Cody. The host is responsible for managing the overall user experience, maintaining the conversation context, and coordinating between the user, the AI model, and external capabilities provided by MCP servers. When you type a question into Claude Desktop and it needs to access external information, the host orchestrates this entire process.
The second component is the MCP Client. The client sits inside the host application and handles all the protocol-level communication details. When the host application starts, the client discovers available MCP servers, establishes connections with them, and requests metadata about what tools and resources each server provides. During operation, the client translates the AI model's intentions into structured JSON-RPC requests that servers can understand, sends these requests over the appropriate transport mechanism, and processes the responses. Think of the client as a diplomatic translator who ensures both sides speak the same language and follow the same rules of conversation.
The third component is the MCP Server. This is what we will build in this tutorial. The server exposes tools, resources, and prompts to the AI system. A tool is an invokable function that performs some action, such as searching the web, querying a database, or performing a calculation. A resource is read-only data that the AI can access, such as file contents, API documentation, or system information. A prompt is a reusable template that helps structure interactions with the AI for specific tasks. The server is where your custom business logic lives and where you integrate with existing systems. It might connect to databases, call external APIs, access file systems, or perform any other operation you want to make available to your AI.
How Communication Flows in MCP
Understanding the communication flow helps clarify how these components work together. When a user interacts with an MCP host application and makes a request, the host's Large Language Model analyzes the request and determines whether it needs external information or capabilities to provide a complete answer. If external capabilities are needed, the host instructs its MCP client to interact with the appropriate server.
The MCP client then sends a JSON-RPC request to the appropriate MCP server. JSON-RPC is a remote procedure call protocol that uses JSON for encoding messages. It provides a standardized way to invoke methods on remote systems and receive responses. The request specifies which tool to invoke and what parameters to pass.
The server receives this request, validates the parameters, performs the necessary operations, and sends back a JSON-RPC response containing the results. This might involve querying a database, calling an external API, reading files, or performing computations. The server handles all the complexity of these operations and returns a structured response.
The client receives this response and makes the information available to the Large Language Model running in the host. The LLM then incorporates this information into its reasoning process and generates a response to the user that includes the external information. From the user's perspective, this entire process happens seamlessly, as if the AI naturally had access to this information all along.
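To make this flow concrete, the sketch below shows the shape of a JSON-RPC 2.0 tool-call exchange. The method name `tools/call` and the `name`/`arguments`/`content` fields come from the MCP specification; the `get_weather` tool and its payload are hypothetical examples.

```python
import json

# A JSON-RPC 2.0 request asking the server to invoke a (hypothetical)
# "get_weather" tool. The id field lets the client match the response
# to the request that produced it.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_weather",
        "arguments": {"city": "Berlin"},
    },
}

# A matching response carrying the tool's result as a list of content items.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "Sunny, 21 degrees C"}],
    },
}

print(json.dumps(request, indent=2))
print(json.dumps(response, indent=2))
```

Note that the response echoes the request's `id`; this is how a client matches replies to in-flight requests when several are outstanding.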
Transport Methods: Connecting Locally and Remotely
MCP supports two fundamentally different transport methods for communication between clients and servers, each suited to different deployment scenarios.
The first transport method is STDIO, which stands for Standard Input and Output. In this mode, communication happens over standard input and output streams, which are the same mechanisms that command-line programs use to read input from the keyboard and write output to the screen. When using STDIO transport, the MCP server runs as a subprocess on the same machine as the client. The client launches the server process, and they communicate by writing JSON-RPC messages to each other's input streams.
STDIO transport is best suited for desktop applications, IDE extensions, and local development scenarios. It provides inherent security because the communication never leaves the local machine. There is no network exposure, no need for authentication mechanisms, and no risk of eavesdropping. Configuration is simple because you just specify the command to run the server and any arguments it needs. This is the most common transport method for personal productivity tools like Claude Desktop.
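The subprocess mechanics can be demonstrated with a toy sketch: the "server" below is a throwaway child process that reads one JSON message per line from stdin and answers on stdout, which mirrors the newline-delimited JSON framing of MCP's STDIO transport. It is an illustration of the transport idea only, not a real MCP server, and the `ping` method is made up.

```python
import json
import subprocess
import sys

# A toy "server": reads one JSON-RPC message per line from stdin and
# answers on stdout. Real MCP servers use the same newline-delimited
# JSON framing, but are launched and managed by the host application.
SERVER_CODE = r'''
import json, sys
for line in sys.stdin:
    msg = json.loads(line)
    reply = {"jsonrpc": "2.0", "id": msg["id"], "result": {"ok": True}}
    print(json.dumps(reply), flush=True)
'''

proc = subprocess.Popen(
    [sys.executable, "-c", SERVER_CODE],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

# The "client" writes a request to the child's stdin and reads the reply
# from the child's stdout.
request = {"jsonrpc": "2.0", "id": 1, "method": "ping"}
proc.stdin.write(json.dumps(request) + "\n")
proc.stdin.flush()
reply = json.loads(proc.stdout.readline())
print(reply)

proc.stdin.close()
proc.wait()
```

Closing the child's stdin ends its read loop, which is also how a host cleanly shuts down a STDIO server.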
The second transport method is Streamable HTTP, which replaced the earlier HTTP with Server-Sent Events (SSE) transport in the 2025-03-26 revision of the specification. In this mode, communication happens over HTTP or HTTPS, which means the server runs as a web service that can be accessed over the network. The client connects to the server using a URL and sends JSON-RPC messages via HTTP POST requests. The server can answer with a single JSON response or open a Server-Sent Events stream, which lets it push messages to the client in real time, as required for the bidirectional communication that MCP supports.
HTTP transport is best suited for team collaboration scenarios, cloud deployments, and enterprise integrations where multiple users or systems need to access the same MCP server. It enables centralized deployment where you can run one server instance that serves many clients. However, this convenience comes with additional security responsibilities. You must implement authentication to verify client identity, use HTTPS to encrypt data in transit, implement proper access controls to limit what each client can do, and protect against various network-based attacks. Configuration requires specifying the server URL and any authentication headers needed to access it.
Why MCP Matters for AI Development
The Model Context Protocol represents a significant step forward in AI application development for several important reasons. First, it provides standardization in an area that previously lacked it. Before MCP, every AI application implemented its own custom methods for connecting to external tools and data sources. This meant that a tool built for one AI application could not be used with another without significant rework. MCP changes this by providing a common protocol that any compliant application can use.
Second, MCP enhances security by providing a structured framework for AI interactions with external systems. The protocol includes mechanisms for authentication, authorization, input validation, and sandboxing. This is crucial because allowing an AI to execute arbitrary code or access arbitrary resources would be extremely dangerous. MCP provides the guardrails needed to make these interactions safe.
Third, MCP enables composability. You can build a library of MCP servers that provide different capabilities, and any MCP-compatible host can use them. This creates an ecosystem where developers can share and reuse tools, much like how npm packages work for JavaScript or pip packages work for Python. Instead of every developer building the same database connector or file reader, they can use existing, well-tested MCP servers.
Fourth, MCP separates concerns cleanly. The AI model focuses on understanding and generating natural language. The MCP server focuses on interacting with external systems and performing specific tasks. The MCP client handles the protocol-level communication details. This separation makes systems easier to build, test, and maintain.
Finally, MCP is open and vendor-neutral. While Anthropic developed it, the specification is publicly available and anyone can implement it.
This prevents vendor lock-in and ensures that the protocol can evolve based on community needs rather than the interests of a single company.
Chapter Two: Prerequisites and Environment Setup
Before we begin building MCP servers and clients, we need to set up a proper development environment with all the necessary tools and dependencies. This chapter will guide you through installing and configuring everything you need.
Installing Python
We will be using Python for this tutorial because it offers excellent integration with AI and machine learning libraries, has minimal boilerplate code, and allows for rapid development. The concepts we cover apply equally to TypeScript implementations, but Python's simplicity makes it ideal for learning.
You need Python version 3.10 or higher installed on your system. Python 3.10 introduced several features that the MCP SDK relies on, including improved type hints and pattern matching. To verify whether you have Python installed and which version you have, open a terminal or command prompt and run the command python --version or python3 --version depending on your operating system.
If you see a version number of 3.10 or higher, you are ready to proceed. If you do not have Python installed or have an older version, visit python.org and download the latest stable release. During installation on Windows, make sure to check the box that says "Add Python to PATH" so you can run Python from any directory in your terminal.
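If you prefer to check from inside Python rather than from the shell, a short script can verify the interpreter meets the SDK's minimum version:

```python
import sys

# Print the interpreter version and flag anything older than 3.10,
# the minimum version the MCP SDK supports.
print("Running Python", sys.version.split()[0])
if sys.version_info < (3, 10):
    print("Too old: please upgrade to Python 3.10 or newer")
else:
    print("Version OK for the MCP SDK")
```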
Installing uv Package Manager
We will use uv as our Python package manager instead of the traditional pip tool. uv is a modern package manager written in Rust that is significantly faster than pip and provides better dependency resolution. It also creates isolated virtual environments automatically, which prevents dependency conflicts between different projects.
To install uv, visit docs.astral.sh/uv and follow the installation instructions for your operating system. On macOS and Linux, you can typically install it with a single command using curl or wget. On Windows, you can use the PowerShell installer or download a standalone executable.
After installation, verify that uv is working by running uv --version in your terminal. You should see the version number displayed. If you prefer to use pip instead of uv, that is perfectly fine. The examples in this tutorial use uv syntax, but you can easily translate them to pip commands. Where we use uv add to install a package, you would use pip install. Where we use uv run to execute a script, you would activate the virtual environment and run the script directly.
Installing Ollama for Local Testing
Ollama is a tool that allows you to run Large Language Models locally on your machine. While it is not strictly required for building MCP servers, it is extremely useful for testing because you can see how your servers interact with an actual AI model without needing to use cloud services or API keys.
To install Ollama, visit ollama.com and download the installer for your operating system. The installation process is straightforward and similar to installing any other application. After installation, you need to pull at least one language model to your local machine. Open a terminal and run the command ollama pull llama3 to download the Llama 3 model, which is a capable open-source model suitable for testing.
You can also pull other models like Mistral or Mixtral depending on your hardware capabilities and preferences. Larger models generally provide better results but require more memory and processing power. Once you have pulled a model, you can start it by running ollama run llama3, which will give you an interactive chat interface where you can test the model.
Choosing a Text Editor or IDE
You will need a text editor or integrated development environment to write your code. Visual Studio Code is highly recommended because it has excellent Python support, including features like intelligent code completion, integrated debugging, and built-in terminal access. Recent versions also support MCP natively, so the servers we build can be used directly from its AI features.
To install Visual Studio Code, visit code.visualstudio.com and download the installer for your operating system. After installation, open VS Code and install the Python extension from the Extensions marketplace. This extension provides syntax highlighting, linting, debugging, and many other features that make Python development more productive.
If you prefer other editors like PyCharm, Sublime Text, or Vim, they will work perfectly fine as well. The important thing is that you have an environment where you can comfortably write and test Python code.
Understanding the Development Workflow
Before we start coding, it helps to understand the typical development workflow we will follow.
First, we will create a new directory for our project and initialize it as a Python project using uv. This creates the necessary configuration files and sets up dependency management.
Second, we will install the MCP SDK and any other libraries we need for our specific tools. The SDK provides all the foundational classes and utilities needed to build compliant MCP servers and clients.
Third, we will write our server code, defining the tools, resources, and prompts we want to expose. We will implement the business logic for each tool and set up the appropriate handlers.
Fourth, we will test our server using the MCP Inspector tool, which provides a visual interface for interacting with MCP servers and verifying they work correctly. This allows us to test individual tools and see exactly what requests and responses look like.
Fifth, we will integrate our server with an MCP host like Claude Desktop so we can see it working in a real AI application. This is where everything comes together and you can have natural language conversations that leverage your custom tools.
Finally, for production deployments, we will add security features, error handling, logging, and deploy the server to a remote environment where it can be accessed by multiple clients.
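Step five above typically means adding an entry to Claude Desktop's claude_desktop_config.json file. An illustrative entry might look like the following, where the server name and the directory path are placeholders you would adapt to your own project:

```json
{
  "mcpServers": {
    "example-server": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/mcp-server", "python", "server.py"]
    }
  }
}
```

We will walk through this integration in detail once the server is built.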
Chapter Three: Building Your First Local MCP Server
Now that we have our development environment set up, we can begin building our first MCP server. We will start with a local server that uses STDIO transport, which is the simplest deployment model and perfect for learning the fundamentals.
Creating the Project Structure
Open your terminal and navigate to a directory where you want to create your project. Execute the command mkdir mcp-tutorial to create a new directory for our work. Then navigate into this directory with cd mcp-tutorial. Now we will initialize a new Python project by running uv init mcp-server. This command creates a new subdirectory called mcp-server with the proper project structure.
Navigate into the mcp-server directory with cd mcp-server. If you list the contents of this directory, you will see that uv has created a pyproject.toml file, which is the modern Python standard for project configuration and dependency management. This file will track all the libraries our project depends on.
Installing the MCP SDK
Now we need to install the MCP SDK, which is the official Python library for building MCP servers and clients. Run the command uv add mcp in your terminal. This command does several things automatically. It downloads the MCP SDK from the Python Package Index, installs it in an isolated virtual environment specific to this project, and updates the pyproject.toml file to record this dependency.
The isolation is important because it means the packages installed for this project will not interfere with other Python projects on your system. Each project has its own independent set of dependencies, preventing version conflicts and making your projects more reproducible.
Installing Additional Dependencies
For our example server, we will build tools that demonstrate different capabilities. We need a few additional libraries beyond the MCP SDK. Run the command uv add httpx pydantic to install these libraries.
The httpx library is a modern HTTP client for Python that supports async operations. We will use it if we want our tools to make web requests. It is similar to the popular requests library but designed for asynchronous programming, which is important because MCP servers handle requests asynchronously.
The pydantic library provides powerful data validation and serialization capabilities. It is particularly important for MCP because the protocol uses JSON Schema to describe tool inputs, and pydantic models automatically generate compatible schemas. This means we can define our tool inputs as Python classes, and pydantic will handle all the validation and schema generation for us.
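As a quick illustration of that schema generation, the snippet below defines a hypothetical GreetInput model (not part of our server) and inspects the JSON Schema Pydantic produces for it. Fields with defaults are left out of the schema's required list, which is how optional tool parameters are expressed.

```python
from pydantic import BaseModel, Field

# A tiny model for a hypothetical greeting tool; model_json_schema()
# turns it into the JSON Schema that MCP clients show to the AI.
class GreetInput(BaseModel):
    """Input for a hypothetical greeting tool"""
    name: str = Field(description="Who to greet")
    excited: bool = Field(default=False, description="Add an exclamation mark")

schema = GreetInput.model_json_schema()
print(schema["properties"]["name"]["description"])  # Who to greet
print(schema["required"])  # ['name'] -- fields with defaults are optional
```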
Understanding the Server Foundation
Every MCP server needs to handle several core responsibilities. It must advertise what capabilities it provides when a client connects. It must accept incoming requests and validate that they are properly formatted and contain valid parameters. It must execute the requested operations, which might involve calling external APIs, querying databases, or performing computations. Finally, it must return properly formatted responses that comply with the MCP specification.
The MCP SDK provides classes that handle most of the protocol-level details automatically. This allows us to focus on implementing our actual business logic rather than worrying about JSON-RPC message formatting, protocol versioning, or connection management. The SDK takes care of these concerns, and we just need to implement handler functions for our specific tools and resources.
Creating the Server File
Create a new file called server.py in your mcp-server directory. This will be the main file containing our server implementation. Open this file in your text editor and we will build it step by step.
Step One: Importing Required Modules
At the top of server.py, we need to import all the modules and classes we will use. Add the following code:
import asyncio
import json
from typing import Any, Sequence

from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import (
    Tool,
    TextContent,
    ImageContent,
    EmbeddedResource,
)

import httpx
from pydantic import BaseModel, Field, field_validator
Let me explain what each of these imports provides. The asyncio module is Python's built-in library for asynchronous programming. MCP servers are asynchronous by design because they need to handle multiple requests concurrently without blocking. When one tool is waiting for a database query to complete, the server can process other requests in the meantime.
The json module provides functions for working with JSON data, which we will use when formatting certain responses. The typing module provides type hints that make our code more readable and enable better IDE support and error checking.
From the mcp.server module, we import the Server class, which is the core class that manages the server lifecycle and routes requests to the appropriate handlers. From mcp.server.stdio, we import the stdio_server function, which sets up communication over standard input and output streams.
From mcp.types, we import several data structure classes defined by the MCP specification. Tool represents a callable function that the AI can invoke. TextContent represents textual content that can be returned from tool calls. ImageContent represents image data, and EmbeddedResource represents embedded resources. These types ensure our server communicates using the exact format that MCP clients expect.
The httpx import gives us the HTTP client library for making web requests if our tools need to call external APIs. The pydantic imports provide the BaseModel class for defining data models, the Field function for adding metadata to model fields, and the field_validator decorator (Pydantic v2's replacement for the older validator) for implementing custom validation logic.
Step Two: Creating the Server Instance
After the imports, add this line to create the server instance:
server = Server("example-server")
This creates a new MCP server with the name "example-server". The name is used for identification in logs and debugging. When a client connects, it can see this name to understand which server it is talking to. You should choose a descriptive name that indicates what your server does, especially if you plan to run multiple MCP servers.
Step Three: Defining Input Models with Pydantic
Now we will define the input models for our tools using Pydantic. These models serve two purposes. First, they provide automatic validation of incoming parameters, ensuring that tools receive data in the expected format. Second, they automatically generate JSON Schema definitions that describe the expected inputs, which the AI uses to understand how to call our tools.
Add the following code to define our first input model:
class CalculatorInput(BaseModel):
    """Input for the calculator tool"""
    operation: str = Field(description="The operation to perform: add, subtract, multiply, divide")
    a: float = Field(description="First number")
    b: float = Field(description="Second number")
This model defines the input for a calculator tool. It has three fields. The operation field is a string that specifies which arithmetic operation to perform. The a and b fields are floating-point numbers that will be the operands. The Field function allows us to provide descriptions for each parameter, which help the AI understand what values to provide.
Now add the model for a web search tool:
class WebSearchInput(BaseModel):
    """Input for the web search tool"""
    query: str = Field(description="The search query")
    max_results: int = Field(default=5, description="Maximum number of results to return")
This model has two fields. The query field is the search string. The max_results field specifies how many results to return, with a default value of 5 if not specified. Default values are useful because they make parameters optional, allowing the AI to omit them when they are not critical.
Finally, add the model for a file reading tool:
class FileReadInput(BaseModel):
    """Input for the file read tool"""
    path: str = Field(description="Path to the file to read")

    @field_validator('path')
    @classmethod
    def validate_path(cls, v):
        """Validate path to prevent directory traversal attacks"""
        if '..' in v:
            raise ValueError("Path traversal not allowed")
        return v
This model demonstrates custom validation using Pydantic's field_validator decorator (called validator in Pydantic version 1). The validate_path method checks the path parameter to ensure it does not contain ".." sequences, which could be used for directory traversal attacks. This is a simple but important security measure that prevents malicious users from accessing files outside the intended directory.
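To see the validator in action outside the server, the standalone snippet below repeats the model (so it runs on its own) and shows that a traversal attempt is rejected before any tool code executes:

```python
from pydantic import BaseModel, Field, ValidationError, field_validator

# Standalone copy of the FileReadInput model from the server code,
# repeated here so this snippet is self-contained.
class FileReadInput(BaseModel):
    """Input for the file read tool"""
    path: str = Field(description="Path to the file to read")

    @field_validator('path')
    @classmethod
    def validate_path(cls, v):
        if '..' in v:
            raise ValueError("Path traversal not allowed")
        return v

# A normal path passes through untouched...
print(FileReadInput(path="notes/todo.txt").path)

# ...while a traversal attempt raises a ValidationError immediately.
try:
    FileReadInput(path="../etc/passwd")
except ValidationError as e:
    print("rejected:", e.errors()[0]["msg"])
```

Because the model is instantiated before the tool function runs, invalid input never reaches your business logic.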
Step Four: Implementing the List Tools Handler
Now we need to implement a handler that tells clients what tools are available. Add this code:
@server.list_tools()
async def list_tools() -> list[Tool]:
    """
    List all available tools.
    This handler is called when the client wants to know what tools are available.
    """
    return [
        Tool(
            name="calculator",
            description="Perform basic arithmetic operations (add, subtract, multiply, divide)",
            inputSchema=CalculatorInput.model_json_schema(),
        ),
        Tool(
            name="web_search",
            description="Search the web for information using a query string",
            inputSchema=WebSearchInput.model_json_schema(),
        ),
        Tool(
            name="read_file",
            description="Read the contents of a file from the local filesystem",
            inputSchema=FileReadInput.model_json_schema(),
        ),
    ]
The @server.list_tools() decorator registers this function as the handler for tool listing requests. When an MCP client connects to our server, one of the first things it does is call this handler to discover what tools are available.
The function returns a list of Tool objects. Each Tool has three important properties. The name is a unique identifier that clients use to invoke the tool. The description is a human-readable explanation that helps the AI understand when to use this tool. The inputSchema is a JSON Schema object that describes the expected parameters, which we generate automatically from our Pydantic models using the model_json_schema() method.
Step Five: Implementing the Calculator Tool Logic
Now we implement the actual business logic for the calculator tool. Add this function:
async def calculator_tool(operation: str, a: float, b: float) -> str:
    """Execute calculator operations"""
    operations = {
        "add": lambda x, y: x + y,
        "subtract": lambda x, y: x - y,
        "multiply": lambda x, y: x * y,
        "divide": lambda x, y: x / y if y != 0 else "Error: Division by zero",
    }
    if operation not in operations:
        return f"Error: Unknown operation '{operation}'. Supported operations: {', '.join(operations.keys())}"
    try:
        result = operations[operation](a, b)
        if isinstance(result, str):
            return result
        return f"Result: {a} {operation} {b} = {result}"
    except Exception as e:
        return f"Error: {str(e)}"
This function implements a simple calculator. It defines a dictionary mapping operation names to lambda functions that perform the actual calculations. For division, we check if the divisor is zero and return an error message instead of raising an exception.
The function first checks if the requested operation is valid. If not, it returns an error message listing the supported operations. Then it executes the operation inside a try-except block to catch any unexpected errors. If the operation succeeds, it returns a formatted string with the result.
Notice that this is an async function even though it does not use await. This is fine in Python. Making it async allows it to be called from other async contexts and keeps our code consistent. If we later need to add asynchronous operations like database queries, we can do so without changing the function signature.
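You can exercise such an async function outside the server with asyncio.run. The snippet below repeats a trimmed copy of calculator_tool so it is self-contained, then drives it to completion:

```python
import asyncio

# Trimmed standalone copy of the calculator_tool defined above,
# repeated here so this snippet runs on its own.
async def calculator_tool(operation: str, a: float, b: float) -> str:
    operations = {
        "add": lambda x, y: x + y,
        "subtract": lambda x, y: x - y,
        "multiply": lambda x, y: x * y,
        "divide": lambda x, y: x / y if y != 0 else "Error: Division by zero",
    }
    if operation not in operations:
        return f"Error: Unknown operation '{operation}'"
    result = operations[operation](a, b)
    if isinstance(result, str):
        return result
    return f"Result: {a} {operation} {b} = {result}"

# asyncio.run drives a coroutine to completion outside a server context.
print(asyncio.run(calculator_tool("add", 2.0, 3.0)))     # Result: 2.0 add 3.0 = 5.0
print(asyncio.run(calculator_tool("divide", 1.0, 0.0)))  # Error: Division by zero
```

This kind of quick smoke test is useful before wiring a tool into the server, since it isolates the business logic from the protocol machinery.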
Step Six: Implementing the Web Search Tool Logic
Add the web search tool implementation:
async def web_search_tool(query: str, max_results: int = 5) -> str:
    """
    Simulate a web search.
    In a real implementation, you would call an actual search API.
    """
    mock_results = [
        {
            "title": f"Result {i+1} for '{query}'",
            "url": f"https://example.com/result{i+1}",
            "snippet": f"This is a snippet of information related to {query}..."
        }
        for i in range(min(max_results, 5))
    ]
    formatted_results = []
    for idx, result in enumerate(mock_results, 1):
        formatted_results.append(
            f"{idx}. {result['title']}\n"
            f"   URL: {result['url']}\n"
            f"   {result['snippet']}\n"
        )
    return "\n".join(formatted_results)
This is a mock implementation that simulates web search results. In a real production server, you would integrate with an actual search API like Google Custom Search API, Bing Search API, or a specialized search service. The mock implementation generates fake results that demonstrate the format and structure of what a real search would return.
The function creates a list of mock results, each containing a title, URL, and snippet. It then formats these results into a readable text format with numbered entries. This formatting makes it easy for the AI to parse and present the information to users.
Step Seven: Implementing the File Read Tool Logic
Add the file reading tool implementation:
async def read_file_tool(path: str) -> str:
    """Read a file from the filesystem"""
    try:
        with open(path, 'r', encoding='utf-8') as f:
            content = f.read()
        return f"File contents of '{path}':\n\n{content}"
    except FileNotFoundError:
        return f"Error: File not found at path '{path}'"
    except PermissionError:
        return f"Error: Permission denied when trying to read '{path}'"
    except Exception as e:
        return f"Error reading file: {str(e)}"
This function reads a file from the local filesystem and returns its contents. It includes comprehensive error handling for common issues. If the file does not exist, it returns a FileNotFoundError message. If the process does not have permission to read the file, it returns a PermissionError message. For any other unexpected errors, it returns a generic error message.
The function uses a context manager (the with statement) to ensure the file is properly closed after reading, even if an error occurs. It specifies UTF-8 encoding explicitly, which is a good practice for handling text files that might contain international characters.
Step Eight: Implementing the Call Tool Handler
Now we need to implement the handler that routes tool execution requests to the appropriate tool function. Add this code:
@server.call_tool()
async def call_tool(name: str, arguments: dict[str, Any]) -> Sequence[TextContent]:
    """
    Handle tool execution requests.
    This is called when the AI wants to use one of our tools.
    """
    try:
        if name == "calculator":
            input_data = CalculatorInput(**arguments)
            result = await calculator_tool(
                input_data.operation,
                input_data.a,
                input_data.b
            )
        elif name == "web_search":
            input_data = WebSearchInput(**arguments)
            result = await web_search_tool(
                input_data.query,
                input_data.max_results
            )
        elif name == "read_file":
            input_data = FileReadInput(**arguments)
            result = await read_file_tool(input_data.path)
        else:
            result = f"Error: Unknown tool '{name}'"
        return [TextContent(type="text", text=result)]
    except Exception as e:
        error_message = f"Error executing tool '{name}': {str(e)}"
        return [TextContent(type="text", text=error_message)]
The @server.call_tool() decorator registers this function as the handler for tool execution requests. When the AI decides to use one of our tools, this function is called with the tool name and arguments.
The function uses a series of if-elif statements to route the request to the appropriate tool. For each tool, it first validates the arguments by creating an instance of the corresponding Pydantic model. If the arguments are invalid, Pydantic will raise a validation error that gets caught by the outer exception handler.
After validation, the function calls the appropriate tool function with the validated parameters. It awaits the result since our tool functions are async. Finally, it wraps the result in a TextContent object and returns it as a sequence. The MCP protocol requires results to be returned as a sequence of content objects because some tools might return multiple pieces of content, such as both text and images.
The outer try-except block catches any errors that occur during tool execution and returns them as error messages. This ensures that errors are communicated back to the client rather than crashing the server.
Step Nine: Creating the Main Function
Add the main function that starts the server:
async def main():
    """Main entry point for the server"""
    async with stdio_server() as (read_stream, write_stream):
        await server.run(
            read_stream,
            write_stream,
            server.create_initialization_options()
        )
This function sets up the server to communicate over standard input and output streams. The stdio_server() function is an async context manager that creates read and write streams connected to stdin and stdout. The async with statement ensures these streams are properly cleaned up when the server shuts down.
Inside the context manager, we call server.run() with the read and write streams and initialization options. This starts the main server loop, which listens for incoming messages on the read stream, processes them, and sends responses on the write stream. The server will continue running until it receives a shutdown signal or the streams are closed.
Step Ten: Adding the Entry Point
Finally, add the standard Python entry point at the bottom of the file:
if __name__ == "__main__":
    asyncio.run(main())
This code checks if the script is being run directly (rather than imported as a module). If so, it uses asyncio.run() to start the async event loop and run our main function. This is the standard pattern for async Python applications.
Testing Your Server
Now that we have a complete server implementation, we can test it using the MCP Inspector tool. The Inspector is an official tool provided by the MCP project that gives you a visual interface for interacting with MCP servers.
To run the Inspector with your server, open a terminal in your mcp-server directory and run the command npx @modelcontextprotocol/inspector uv run server.py. This command does several things. It downloads and runs the MCP Inspector using npx, which is the Node.js package runner. It tells the Inspector to start your server using the command uv run server.py. The Inspector then opens a web interface in your browser where you can interact with your server.
In the Inspector interface, you should see a list of all three tools we defined: calculator, web_search, and read_file. You can click on any tool to see its input schema, which shows exactly what parameters it expects. You can then fill in values for these parameters and execute the tool to see the response.
Try executing the calculator tool with the operation "add", an a value of 15, and a b value of 27. You should see a response showing the result of 42. Try the web_search tool with a query like "Model Context Protocol" and max_results of 3. You should see mock search results formatted as a numbered list. Try the read_file tool with the path "server.py" to read the server code itself.
The Inspector is invaluable for debugging because it shows you the exact JSON-RPC messages being sent and received. If something is not working, you can see exactly what the client is sending and what the server is responding with.
Chapter Four: Building a Remote MCP Server with HTTP Transport
Now that we understand how to build a local MCP server using STDIO transport, we can explore building a remote server that uses HTTP transport. Remote servers are essential for team collaboration, cloud deployments, and enterprise scenarios where multiple clients need to access the same capabilities.
Understanding Remote Server Use Cases
Remote MCP servers enable several important scenarios that local servers cannot support. In team collaboration environments, multiple team members can access the same tools and resources without each person needing to set up and maintain their own server instance. When you have a tool that connects to a corporate database or internal API, it makes sense to run one centralized server rather than giving every developer database credentials.
For resource-intensive operations, remote servers allow you to offload heavy computations to powerful server infrastructure. If you have a tool that performs complex data analysis or runs machine learning models, you do not want to burden every developer's laptop with this work. Instead, you run the server on dedicated hardware with sufficient memory and processing power.
In enterprise integration scenarios, remote servers can sit behind corporate firewalls and provide controlled access to internal systems. The server can implement sophisticated authentication and authorization logic, ensuring that only authorized users can access sensitive data or perform privileged operations.
Remote servers also provide better scalability. A single server instance can handle requests from multiple concurrent clients. With proper load balancing, you can run multiple server instances to handle even higher loads. This is impossible with STDIO transport, where each client needs its own server process.
Finally, remote servers simplify maintenance. When you need to update a tool's implementation or fix a bug, you update the server once and all clients immediately benefit from the change. With local servers, you would need to distribute updates to every client and ensure they install them.
Security Considerations for Remote Deployment
When deploying MCP servers remotely, security becomes critically important. With STDIO transport, the server runs on the same machine as the client and communication never leaves that machine. With HTTP transport, requests and responses travel over the network, potentially across the internet, which introduces numerous security concerns.
The first and most fundamental security requirement is to always use HTTPS in production environments. HTTPS encrypts all data in transit using TLS, preventing eavesdropping and man-in-the-middle attacks. Without HTTPS, anyone who can intercept network traffic can see the requests being made, the responses being returned, and potentially any sensitive data they contain.
The second requirement is implementing proper authentication. You need to verify that clients are who they claim to be before allowing them to use your tools. This might involve API keys, OAuth tokens, mutual TLS certificates, or other authentication mechanisms. The choice depends on your security requirements and deployment environment.
The third requirement is input validation. You must rigorously validate all inputs to prevent injection attacks, path traversal attacks, and other exploits. Never trust data coming from clients, even authenticated ones. Use Pydantic models for validation, implement custom validators for complex rules, and sanitize inputs before using them in operations like file access or database queries.
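As a sketch of that advice, the model below extends the file-read input with a validator that rejects path traversal. The ALLOWED_ROOT directory and the SafeFileReadInput name are assumptions for illustration; the idea is to resolve the requested path and confirm it stays inside an allowed directory before any file access happens.

```python
from pathlib import Path
from pydantic import BaseModel, Field, validator

# Assumed sandbox directory for illustration; point this at your real data root.
ALLOWED_ROOT = Path("/srv/mcp-data").resolve()

class SafeFileReadInput(BaseModel):
    """File-read input that refuses paths escaping the sandbox."""
    path: str = Field(description="Path relative to the server's data directory")

    @validator("path")
    def reject_traversal(cls, v):
        # Resolve ".." segments and symlinks, then check the result is still
        # under ALLOWED_ROOT, defeating inputs like "../../etc/passwd".
        candidate = (ALLOWED_ROOT / v).resolve()
        if not candidate.is_relative_to(ALLOWED_ROOT):
            raise ValueError(f"Path escapes allowed directory: {v}")
        return v
```

Because the check runs inside the Pydantic model, any tool that validates its arguments this way rejects malicious paths before the tool function ever runs.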
The fourth requirement is rate limiting. Without rate limits, malicious or buggy clients could overwhelm your server with requests, causing denial of service for legitimate users. Implement per-client rate limits that restrict how many requests can be made within a time window.
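A minimal sketch of such a per-client limit, using an in-memory sliding window (the class name and numbers are illustrative; a deployment with multiple server instances would need a shared store such as Redis so the limits hold globally):

```python
import time
from collections import defaultdict, deque
from typing import Optional

class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per client within `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        """Record one request attempt; return False if the client is over its limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

In the FastAPI server, the authentication dependency could call allow() with the API key as the client identifier and raise HTTPException(status_code=429) when it returns False.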
The fifth requirement is comprehensive audit logging. You need to track who accessed what resources and when. This is essential for security monitoring, compliance requirements, and debugging issues. Log authentication attempts, tool executions, errors, and any suspicious activity.
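One lightweight way to implement such logging is to emit one JSON object per line, which is easy to grep and to ship into log-aggregation tooling. The helper below is a sketch; the logger name and field set are choices of this example, not anything the MCP specification mandates.

```python
import json
import logging
import time

audit_logger = logging.getLogger("mcp.audit")

def audit(event: str, client: str, **details) -> str:
    """Record one structured audit event and return the serialized line."""
    record = {"ts": time.time(), "event": event, "client": client, **details}
    line = json.dumps(record, sort_keys=True)
    audit_logger.info(line)
    return line
```

For example, calling audit("tools/call", env, tool=name) at the top of the tool handler records who invoked which tool and when.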
Finally, implement network segmentation and access controls. Your MCP server should only be able to access the resources it absolutely needs. If it only needs to read from a specific database, do not give it write permissions or access to other databases. Use firewalls and network policies to restrict what the server can communicate with.
Installing Additional Dependencies for Remote Servers
For building a remote MCP server, we need a web framework that can handle HTTP requests and Server-Sent Events. We will use FastAPI, which is a modern Python web framework with excellent async support and automatic API documentation generation.
In your terminal, navigate to your mcp-server directory and run the command uv add fastapi uvicorn sse-starlette python-multipart. This installs several packages. FastAPI is the web framework itself, providing routing, request handling, and response generation. Uvicorn is an ASGI server that runs FastAPI applications. ASGI stands for Asynchronous Server Gateway Interface, and it is the async equivalent of WSGI. The sse-starlette package provides Server-Sent Events support; MCP's HTTP transport uses SSE to stream messages from the server to the client, while client-to-server messages travel over ordinary HTTP POST requests. The python-multipart package handles file uploads and form data, which might be useful for certain tools.
Creating the Remote Server File
Create a new file called server_remote.py in your mcp-server directory. This will contain our HTTP-based server implementation. The structure will be similar to our local server, but with additional code for HTTP handling, authentication, and SSE communication.
Implementing the Remote Server
Open server_remote.py and start with the imports:
import asyncio
import json
import logging
from typing import Any, Sequence, Optional
from contextlib import asynccontextmanager
from fastapi import FastAPI, Request, HTTPException, Header, Depends
from fastapi.responses import StreamingResponse
from sse_starlette.sse import EventSourceResponse
from mcp.server import Server
from mcp.types import (
    Tool,
    TextContent,
    Resource,
)
from pydantic import BaseModel, Field, validator
import httpx
These imports include everything from our local server plus FastAPI components. The FastAPI class creates the web application. Request represents incoming HTTP requests. HTTPException is used to return HTTP error responses. Header extracts values from request headers, and Depends provides dependency injection, which we will use for authentication. EventSourceResponse handles Server-Sent Events.
Now add logging configuration:
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)
This configures Python's logging system to output informational messages with timestamps, logger names, and log levels. Proper logging is essential for production servers because it helps you understand what the server is doing and diagnose problems.
Create the MCP server instance:
mcp_server = Server("remote-example-server")
Now implement simple API key authentication. In a production system, you would use OAuth 2.1 or another robust authentication mechanism, but API keys are sufficient for demonstration:
VALID_API_KEYS = {
    "dev-key-12345": "development",
    "prod-key-67890": "production"
}

async def verify_api_key(x_api_key: Optional[str] = Header(None)) -> str:
    """Verify API key from request header"""
    if not x_api_key:
        raise HTTPException(status_code=401, detail="API key required")
    if x_api_key not in VALID_API_KEYS:
        raise HTTPException(status_code=403, detail="Invalid API key")
    return VALID_API_KEYS[x_api_key]
This function will be used as a dependency in our route handlers. It extracts the X-API-Key header from incoming requests and validates it against our dictionary of valid keys. If the key is missing, it returns a 401 Unauthorized response. If the key is invalid, it returns a 403 Forbidden response. If the key is valid, it returns the environment name associated with that key.
Now add the Pydantic input models. These are the same as in our local server, with one addition:
class CalculatorInput(BaseModel):
    """Input for the calculator tool"""
    operation: str = Field(description="The operation to perform: add, subtract, multiply, divide")
    a: float = Field(description="First number")
    b: float = Field(description="Second number")

class WebSearchInput(BaseModel):
    """Input for the web search tool"""
    query: str = Field(description="The search query")
    max_results: int = Field(default=5, description="Maximum number of results to return")

    @validator('max_results')
    def validate_max_results(cls, v):
        """Limit max results to prevent abuse"""
        if v > 20:
            raise ValueError("max_results cannot exceed 20")
        return v

class DataProcessInput(BaseModel):
    """Input for data processing tool (remote-only example)"""
    data: list[float] = Field(description="List of numbers to process")
    operation: str = Field(description="Operation: sum, average, min, max")
Notice the additional validator on WebSearchInput that limits max_results to 20. This prevents abuse where someone might request thousands of results, consuming excessive resources. The DataProcessInput model is new and demonstrates a tool that might only make sense on a remote server, where you can process large datasets efficiently.
Now implement the MCP server handlers. The list_tools handler is similar to before but includes our new data processing tool:
@mcp_server.list_tools()
async def list_tools() -> list[Tool]:
    """
    List all available tools.
    This handler is called when the client wants to know what tools are available.
    """
    return [
        Tool(
            name="calculator",
            description="Perform basic arithmetic operations (add, subtract, multiply, divide)",
            inputSchema=CalculatorInput.model_json_schema(),
        ),
        Tool(
            name="web_search",
            description="Search the web for information using a query string",
            inputSchema=WebSearchInput.model_json_schema(),
        ),
        Tool(
            name="data_process",
            description="Process numerical data with various operations (sum, average, min, max)",
            inputSchema=DataProcessInput.model_json_schema(),
        ),
    ]
Add resource handlers to demonstrate how resources work in a remote server:
@mcp_server.list_resources()
async def list_resources() -> list[Resource]:
    """List available resources"""
    return [
        Resource(
            uri="server://status",
            name="Server Status",
            description="Current server status and statistics",
            mimeType="application/json",
        ),
        Resource(
            uri="server://info",
            name="Server Information",
            description="Server configuration and capabilities",
            mimeType="application/json",
        ),
    ]

@mcp_server.read_resource()
async def read_resource(uri: str) -> str:
    """Read a resource by URI"""
    if uri == "server://status":
        import platform
        status = {
            "status": "running",
            "platform": platform.system(),
            "python_version": platform.python_version(),
            "uptime": "available via monitoring"
        }
        return json.dumps(status, indent=2)
    elif uri == "server://info":
        info = {
            "server_name": "remote-example-server",
            "version": "1.0.0",
            "mcp_version": "2025-11-25",
            "capabilities": ["tools", "resources"],
            "transport": "HTTP/SSE"
        }
        return json.dumps(info, indent=2)
    else:
        raise ValueError(f"Unknown resource: {uri}")
Resources provide read-only data that the AI can access. In this example, we provide server status and configuration information. In a real application, you might provide API documentation, configuration files, or other reference data.
Now implement the tool functions. The calculator and web search tools are the same as before, with added logging:
async def calculator_tool(operation: str, a: float, b: float) -> str:
    """Execute calculator operations"""
    logger.info(f"Calculator: {operation}({a}, {b})")
    operations = {
        "add": lambda x, y: x + y,
        "subtract": lambda x, y: x - y,
        "multiply": lambda x, y: x * y,
        "divide": lambda x, y: x / y if y != 0 else "Error: Division by zero",
    }
    if operation not in operations:
        return f"Error: Unknown operation '{operation}'. Supported operations: {', '.join(operations.keys())}"
    try:
        result = operations[operation](a, b)
        if isinstance(result, str):
            return result
        return f"Result: {a} {operation} {b} = {result}"
    except Exception as e:
        logger.error(f"Calculator error: {e}")
        return f"Error: {str(e)}"

async def web_search_tool(query: str, max_results: int = 5) -> str:
    """
    Simulate a web search.
    In a real implementation, you would call an actual search API.
    """
    logger.info(f"Web search: {query} (max: {max_results})")
    mock_results = [
        {
            "title": f"Result {i+1} for '{query}'",
            "url": f"https://example.com/result{i+1}",
            "snippet": f"This is a snippet of information related to {query}..."
        }
        for i in range(min(max_results, 5))
    ]
    formatted_results = []
    for idx, result in enumerate(mock_results, 1):
        formatted_results.append(
            f"{idx}. {result['title']}\n"
            f" URL: {result['url']}\n"
            f" {result['snippet']}\n"
        )
    return "\n".join(formatted_results)
Add the new data processing tool:
async def data_process_tool(data: list[float], operation: str) -> str:
    """Process numerical data"""
    logger.info(f"Data processing: {operation} on {len(data)} items")
    if not data:
        return "Error: No data provided"
    operations = {
        "sum": sum,
        "average": lambda x: sum(x) / len(x),
        "min": min,
        "max": max,
    }
    if operation not in operations:
        return f"Error: Unknown operation '{operation}'. Supported: {', '.join(operations.keys())}"
    try:
        result = operations[operation](data)
        return f"Result of {operation} on {len(data)} values: {result}"
    except Exception as e:
        logger.error(f"Data processing error: {e}")
        return f"Error: {str(e)}"
This tool processes lists of numbers with various operations. It demonstrates how remote servers can handle more complex data structures efficiently.
Now implement the call_tool handler with all three tools:
@mcp_server.call_tool()
async def call_tool(name: str, arguments: dict[str, Any]) -> Sequence[TextContent]:
    """
    Handle tool execution requests.
    This is called when the AI wants to use one of our tools.
    """
    logger.info(f"Tool called: {name}")
    try:
        if name == "calculator":
            input_data = CalculatorInput(**arguments)
            result = await calculator_tool(
                input_data.operation,
                input_data.a,
                input_data.b
            )
        elif name == "web_search":
            input_data = WebSearchInput(**arguments)
            result = await web_search_tool(
                input_data.query,
                input_data.max_results
            )
        elif name == "data_process":
            input_data = DataProcessInput(**arguments)
            result = await data_process_tool(
                input_data.data,
                input_data.operation
            )
        else:
            result = f"Error: Unknown tool '{name}'"
        return [TextContent(type="text", text=result)]
    except Exception as e:
        logger.error(f"Error executing tool '{name}': {e}", exc_info=True)
        error_message = f"Error executing tool '{name}': {str(e)}"
        return [TextContent(type="text", text=error_message)]
Now we create the FastAPI application. First, add a lifespan context manager for startup and shutdown logic:
@asynccontextmanager
async def lifespan(app: FastAPI):
    """Lifespan context manager for startup/shutdown"""
    logger.info("Starting MCP Remote Server")
    yield
    logger.info("Shutting down MCP Remote Server")

app = FastAPI(
    title="MCP Remote Server",
    description="Model Context Protocol server accessible via HTTP/SSE",
    version="1.0.0",
    lifespan=lifespan
)
The lifespan context manager runs when the application starts and stops. You can use it to initialize database connections, load configuration, or perform other setup tasks.
Add a root endpoint that provides server information:
@app.get("/")
async def root():
    """Root endpoint with server information"""
    return {
        "name": "MCP Remote Server",
        "version": "1.0.0",
        "mcp_version": "2025-11-25",
        "transport": "HTTP/SSE",
        "endpoints": {
            "sse": "/sse",
            "health": "/health"
        }
    }
Add a health check endpoint that requires authentication:
@app.get("/health")
async def health_check(env: str = Depends(verify_api_key)):
    """Health check endpoint"""
    return {
        "status": "healthy",
        "environment": env
    }
The Depends(verify_api_key) parameter tells FastAPI to call our authentication function before executing the handler. If authentication fails, the handler is never called.
Add the SSE endpoint for real-time communication:
@app.get("/sse")
async def sse_endpoint(
    request: Request,
    env: str = Depends(verify_api_key)
):
    """
    Server-Sent Events endpoint for MCP communication.
    This is where the MCP client connects to communicate with the server.
    """
    logger.info(f"SSE connection established from {env} environment")

    async def event_generator():
        """Generate SSE events for MCP communication"""
        try:
            yield {
                "event": "message",
                "data": json.dumps({
                    "jsonrpc": "2.0",
                    "method": "notifications/initialized",
                    "params": {}
                })
            }
            while True:
                if await request.is_disconnected():
                    logger.info("Client disconnected")
                    break
                await asyncio.sleep(1)
        except Exception as e:
            logger.error(f"SSE error: {e}", exc_info=True)

    return EventSourceResponse(event_generator())
This endpoint maintains a persistent connection with the client using Server-Sent Events. The event_generator function yields events that are sent to the client. Note that SSE only carries messages from server to client; in a full implementation, this stream would deliver server-initiated messages while the client sends its own requests through a separate HTTP endpoint. This simplified version demonstrates the basic structure.
Finally, add the message handling endpoint:
@app.post("/message")
async def handle_message(
    request: Request,
    env: str = Depends(verify_api_key)
):
    """
    Handle JSON-RPC messages from MCP clients.
    This endpoint receives tool calls and other MCP requests.
    """
    try:
        message = await request.json()
        logger.info(f"Received message: {message.get('method', 'unknown')}")
        method = message.get("method")
        params = message.get("params", {})
        msg_id = message.get("id")
        if method == "tools/list":
            tools = await list_tools()
            response = {
                "jsonrpc": "2.0",
                "id": msg_id,
                "result": {
                    "tools": [
                        {
                            "name": tool.name,
                            "description": tool.description,
                            "inputSchema": tool.inputSchema
                        }
                        for tool in tools
                    ]
                }
            }
        elif method == "tools/call":
            tool_name = params.get("name")
            arguments = params.get("arguments", {})
            result = await call_tool(tool_name, arguments)
            response = {
                "jsonrpc": "2.0",
                "id": msg_id,
                "result": {
                    "content": [
                        {
                            "type": content.type,
                            "text": content.text
                        }
                        for content in result
                    ]
                }
            }
        elif method == "resources/list":
            resources = await list_resources()
            response = {
                "jsonrpc": "2.0",
                "id": msg_id,
                "result": {
                    "resources": [
                        {
                            "uri": resource.uri,
                            "name": resource.name,
                            "description": resource.description,
                            "mimeType": resource.mimeType
                        }
                        for resource in resources
                    ]
                }
            }
        elif method == "resources/read":
            uri = params.get("uri")
            content = await read_resource(uri)
            response = {
                "jsonrpc": "2.0",
                "id": msg_id,
                "result": {
                    "contents": [
                        {
                            "uri": uri,
                            "mimeType": "application/json",
                            "text": content
                        }
                    ]
                }
            }
        else:
            response = {
                "jsonrpc": "2.0",
                "id": msg_id,
                "error": {
                    "code": -32601,
                    "message": f"Method not found: {method}"
                }
            }
        return response
    except Exception as e:
        logger.error(f"Error handling message: {e}", exc_info=True)
        return {
            "jsonrpc": "2.0",
            "id": message.get("id") if 'message' in locals() else None,
            "error": {
                "code": -32603,
                "message": f"Internal error: {str(e)}"
            }
        }
This endpoint handles JSON-RPC messages from clients. It routes requests to the appropriate MCP server handlers and formats the responses according to the JSON-RPC specification.
Add the entry point to run the server:
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(
        app,
        host="0.0.0.0",
        port=8000,
        log_level="info"
    )
This starts the Uvicorn server listening on all network interfaces on port 8000.
Running and Testing the Remote Server
To start your remote server, run the command uv run server_remote.py in your terminal. You should see log messages indicating that the server is starting and listening on port 8000. The server is now accessible at http://localhost:8000 from your local machine.
You can test the server using curl commands. First, try the health check endpoint:

curl -H "X-API-Key: dev-key-12345" http://localhost:8000/health

You should receive a JSON response indicating the server is healthy and showing the environment associated with your API key.

To list available tools, send a JSON-RPC request to the message endpoint:

curl -X POST http://localhost:8000/message -H "Content-Type: application/json" -H "X-API-Key: dev-key-12345" -d '{"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}'

You should receive a response containing all three tools with their descriptions and input schemas.

To call the calculator tool:

curl -X POST http://localhost:8000/message -H "Content-Type: application/json" -H "X-API-Key: dev-key-12345" -d '{"jsonrpc": "2.0", "id": 2, "method": "tools/call", "params": {"name": "calculator", "arguments": {"operation": "add", "a": 15, "b": 27}}}'

You should receive a response containing the calculation result.
These curl commands demonstrate that your server is working correctly and can handle authenticated requests. In the next chapter, we will build a Python client that can interact with this server programmatically.
Chapter Five: Building an MCP Client for Remote Servers
Now that we have a working remote server, we need to build a client that can communicate with it over HTTP. While MCP hosts like Claude Desktop have built-in clients, understanding how to build your own client is valuable for testing, automation, and integrating MCP capabilities into custom applications.
Understanding Client Responsibilities
An MCP client has several important responsibilities. First, it must establish and maintain a connection with the server. For STDIO transport, this means launching the server process and managing its lifecycle. For HTTP transport, this means making HTTP requests and potentially maintaining persistent connections for Server-Sent Events.
Second, the client must handle the JSON-RPC protocol correctly. Every request must have a unique identifier, the correct method name, and properly formatted parameters. Every response must be parsed to extract either the result or error information.
Third, the client must manage authentication. For remote servers, this typically means including authentication headers with every request. The client needs to store credentials securely and handle authentication failures gracefully.
Fourth, the client should handle errors robustly. Network requests can fail for many reasons including timeouts, connection errors, and server errors. The client needs to detect these failures, potentially retry requests, and provide meaningful error messages to the application using the client.
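That retry behaviour can be sketched as a small helper that re-runs an async operation with exponential backoff and jitter. The function name and defaults here are assumptions; in practice you would catch specific httpx exceptions such as timeouts and connection errors rather than every Exception.

```python
import asyncio
import random

async def with_retries(op, attempts: int = 3, base_delay: float = 0.5):
    """Run the coroutine function `op` with exponential backoff between failures.

    Re-raises the last error once all attempts are exhausted.
    """
    for attempt in range(attempts):
        try:
            return await op()
        except Exception:
            if attempt == attempts - 1:
                raise
            # Back off 0.5s, 1s, 2s, ... plus a little jitter to avoid
            # many clients retrying in lockstep.
            await asyncio.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

For example, await with_retries(client.list_tools) would retry a flaky listing call up to three times before giving up.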
Finally, the client should provide a clean, easy-to-use interface for the application. Instead of requiring the application to construct JSON-RPC messages manually, the client should expose simple methods like list_tools() and call_tool() that handle all the protocol details internally.
Creating the Remote Client File
Create a new file called client_remote.py in your mcp-server directory. This file will contain a class that encapsulates all the client functionality, making it easy to use from other Python code.
Implementing the Remote Client Class
Start by adding the necessary imports:
import asyncio
import json
import httpx
from typing import Any, Dict, Optional
The httpx library provides an async HTTP client that we will use to communicate with the server. The typing module provides type hints for better code documentation and IDE support.
Now create the RemoteMCPClient class:
class RemoteMCPClient:
    """
    Client for connecting to remote MCP servers via HTTP/SSE.
    """
    def __init__(self, base_url: str, api_key: str):
        """
        Initialize the remote MCP client.

        Args:
            base_url: Base URL of the MCP server (e.g., http://localhost:8000)
            api_key: API key for authentication
        """
        self.base_url = base_url.rstrip('/')
        self.api_key = api_key
        self.headers = {
            "X-API-Key": api_key,
            "Content-Type": "application/json"
        }
        self.message_id = 0
The constructor takes the server's base URL and an API key for authentication. It stores these values and prepares headers that will be included with every request. The message_id counter will be used to generate unique identifiers for JSON-RPC requests.
Add a helper method to generate message identifiers:
    def _get_next_id(self) -> int:
        """Get next message ID"""
        self.message_id += 1
        return self.message_id
This method increments the counter and returns the next identifier. Each JSON-RPC request needs a unique identifier so that responses can be matched to their corresponding requests.
Add the core method for sending requests:
    async def _send_request(self, method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
        """
        Send a JSON-RPC request to the server.

        Args:
            method: The MCP method to call
            params: Parameters for the method

        Returns:
            The response from the server
        """
        request_data = {
            "jsonrpc": "2.0",
            "id": self._get_next_id(),
            "method": method,
            "params": params or {}
        }
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.base_url}/message",
                json=request_data,
                headers=self.headers,
                timeout=30.0
            )
            response.raise_for_status()
            return response.json()
This method constructs a JSON-RPC request with the specified method and parameters, sends it to the server's message endpoint, and returns the parsed response. The async with statement ensures the HTTP client is properly closed after the request completes. The timeout parameter prevents requests from hanging indefinitely. The raise_for_status() call raises an exception if the server returns an HTTP error status.
Add a method for health checks:
    async def health_check(self) -> Dict[str, Any]:
        """Check server health"""
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{self.base_url}/health",
                headers=self.headers
            )
            response.raise_for_status()
            return response.json()
This method calls the server's health endpoint to verify it is running and responsive. This is useful for monitoring and for verifying connectivity before attempting more complex operations.
Add a method to list available tools:
    async def list_tools(self) -> list[Dict[str, Any]]:
        """List available tools from the server"""
        response = await self._send_request("tools/list")
        if "error" in response:
            raise Exception(f"Error listing tools: {response['error']}")
        return response.get("result", {}).get("tools", [])
This method sends a tools/list request to the server and extracts the tools array from the response. If the response contains an error, it raises an exception with the error details.
Add a method to call tools:
    async def call_tool(self, name: str, arguments: Dict[str, Any]) -> str:
        """
        Call a tool on the server.

        Args:
            name: Name of the tool to call
            arguments: Arguments for the tool

        Returns:
            The text result from the tool
        """
        response = await self._send_request(
            "tools/call",
            {
                "name": name,
                "arguments": arguments
            }
        )
        if "error" in response:
            raise Exception(f"Error calling tool: {response['error']}")
        content = response.get("result", {}).get("content", [])
        if content and len(content) > 0:
            return content[0].get("text", "")
        return ""
This method sends a tools/call request with the tool name and arguments. It extracts the text content from the first content item in the response. This simplifies the interface for callers who just want the text result without dealing with the content array structure.
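For comparison, a typical tools/call response wraps its output in a content array, which is why the method reads the text field of content[0]. The result text below is illustrative:

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "content": [
      {"type": "text", "text": "Result: 42"}
    ]
  }
}
```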
Add a method to list resources:
async def list_resources(self) -> list[Dict[str, Any]]:
    """List available resources from the server"""
    response = await self._send_request("resources/list")
    if "error" in response:
        raise Exception(f"Error listing resources: {response['error']}")
    return response.get("result", {}).get("resources", [])
This method lists all resources available on the server, similar to how list_tools works for tools.
Add a method to read resources:
async def read_resource(self, uri: str) -> str:
    """
    Read a resource from the server.

    Args:
        uri: URI of the resource to read

    Returns:
        The resource content
    """
    response = await self._send_request(
        "resources/read",
        {"uri": uri}
    )
    if "error" in response:
        raise Exception(f"Error reading resource: {response['error']}")
    contents = response.get("result", {}).get("contents", [])
    if contents and len(contents) > 0:
        return contents[0].get("text", "")
    return ""
This method reads a specific resource by its URI and returns its content.
Creating a Demonstration Program
Now add a main function that demonstrates using the client:
async def main():
    """
    Main client function demonstrating remote MCP server interaction.
    """
    client = RemoteMCPClient(
        base_url="http://localhost:8000",
        api_key="dev-key-12345"
    )
    try:
        print("=== Health Check ===")
        health = await client.health_check()
        print(f"Server status: {health}")
        print()

        print("=== Available Tools ===")
        tools = await client.list_tools()
        for tool in tools:
            print(f" - {tool['name']}: {tool['description']}")
        print()

        print("=== Testing Calculator Tool (15 + 27) ===")
        calc_result = await client.call_tool(
            "calculator",
            {
                "operation": "add",
                "a": 15,
                "b": 27
            }
        )
        print(f"Result: {calc_result}")
        print()

        print("=== Testing Web Search Tool ===")
        search_result = await client.call_tool(
            "web_search",
            {
                "query": "Model Context Protocol",
                "max_results": 3
            }
        )
        print(f"Results:\n{search_result}")
        print()

        print("=== Testing Data Processing Tool ===")
        data_result = await client.call_tool(
            "data_process",
            {
                "data": [10, 20, 30, 40, 50],
                "operation": "average"
            }
        )
        print(f"Result: {data_result}")
        print()

        print("=== Available Resources ===")
        resources = await client.list_resources()
        for resource in resources:
            print(f" - {resource['name']} ({resource['uri']})")
        print()

        print("=== Server Status Resource ===")
        status = await client.read_resource("server://status")
        print(status)
        print()

        print("=== Server Info Resource ===")
        info = await client.read_resource("server://info")
        print(info)
    except Exception as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    asyncio.run(main())
This main function creates a client instance, performs a health check, lists all available tools and resources, executes each tool with sample data, and reads the available resources. It demonstrates all the major client capabilities in a simple, readable format.
Running the Client
Before running the client, make sure your remote server is running. In one terminal window, start the server with the command uv run server_remote.py. You should see log messages indicating the server has started.
In a second terminal window, run the client with the command uv run client_remote.py. You should see output showing the health check result, the list of available tools, the results of calling each tool, the list of resources, and the contents of each resource.
The client demonstrates how easy it is to interact with a remote MCP server once you have a proper client library. The application code does not need to worry about JSON-RPC formatting, HTTP requests, or error handling because the client class handles all these details.
Chapter Six: Integrating with Claude Desktop
Claude Desktop is one of the most popular MCP hosts, and integrating your server with it allows you to use your custom tools directly in conversations with Claude. This chapter explains how to configure Claude Desktop to connect to both local and remote MCP servers.
Understanding Claude Desktop Configuration
Claude Desktop uses a JSON configuration file to specify which MCP servers to connect to. The configuration file location depends on your operating system. On macOS, the file is located at ~/Library/Application Support/Claude/claude_desktop_config.json. On Windows, it is at %APPDATA%\Claude\claude_desktop_config.json. On Linux, it is at ~/.config/Claude/claude_desktop_config.json.
If this file does not exist, you need to create it. The file contains a JSON object with an mcpServers property, which is itself an object where each key is a server name and each value is the server configuration.
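If you are creating the file from scratch, a minimal valid starting point is simply an empty mcpServers object, which you then populate with server entries as shown in the following sections:

```json
{
  "mcpServers": {}
}
```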
Configuring a Local Server with STDIO Transport
For a local server using STDIO transport, the configuration specifies the command to run the server and any arguments it needs. Create or edit your claude_desktop_config.json file and add the following configuration:
{
  "mcpServers": {
    "example-local": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/absolute/path/to/mcp-server",
        "server.py"
      ]
    }
  }
}
Replace /absolute/path/to/mcp-server with the actual absolute path to your mcp-server directory. You must use an absolute path, not a relative path, because Claude Desktop does not know what directory to resolve relative paths from.
The command property specifies the executable to run, which in this case is uv. The args array contains the arguments to pass to that executable. The run argument tells uv to run a Python script. The --directory argument specifies which directory contains the project. The final argument server.py is the script to run.
When Claude Desktop starts, it will launch this server as a subprocess. The server will remain running as long as Claude Desktop is running. When you close Claude Desktop, the server process is terminated automatically.
Configuring a Remote Server with HTTP Transport
For a remote server using HTTP transport, the configuration specifies the URL to connect to and any headers needed for authentication.
Add this configuration to your claude_desktop_config.json file:
{
  "mcpServers": {
    "example-remote": {
      "url": "http://localhost:8000/sse",
      "transport": "sse",
      "headers": {
        "X-API-Key": "dev-key-12345"
      }
    }
  }
}
The url property specifies the full URL of the server's SSE endpoint. The transport property indicates that this server uses Server-Sent Events for communication. The headers object contains HTTP headers that will be included with every request. In this example, we include the X-API-Key header for authentication.
For a production server using HTTPS, the configuration would look like this:
{
  "mcpServers": {
    "production-server": {
      "url": "https://mcp.yourcompany.com/sse",
      "transport": "sse",
      "headers": {
        "X-API-Key": "your-production-api-key",
        "X-Environment": "production"
      }
    }
  }
}
Notice the use of HTTPS instead of HTTP, which encrypts all communication between Claude Desktop and the server. You can include multiple headers if your server requires them for authentication, authorization, or other purposes.
Configuring Multiple Servers
Claude Desktop can connect to multiple MCP servers simultaneously. Each server's tools and resources will be available to Claude, and it will choose which server to use based on the task at hand. Here is an example configuration with multiple servers:
{
  "mcpServers": {
    "local-tools": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/local-server", "server.py"]
    },
    "remote-database": {
      "url": "https://db-mcp.company.com/sse",
      "transport": "sse",
      "headers": {
        "X-API-Key": "db-api-key"
      }
    },
    "remote-analytics": {
      "url": "https://analytics-mcp.company.com/sse",
      "transport": "sse",
      "headers": {
        "X-API-Key": "analytics-api-key"
      }
    }
  }
}
This configuration includes one local server and two remote servers. The local-tools server runs on your machine and might provide file access or local system tools. The remote-database server connects to a company database server and provides tools for querying data. The remote-analytics server connects to an analytics platform and provides tools for generating reports and visualizations.
When you have multiple servers configured, Claude will see all the tools from all servers and can use any of them as needed. This composability is one of the key benefits of the MCP architecture.
Restarting Claude Desktop
After modifying the configuration file, you must restart Claude Desktop for the changes to take effect. Close Claude Desktop completely, making sure it is not still running in the background. Then start Claude Desktop again.
When Claude Desktop starts, it will read the configuration file and attempt to connect to all configured servers. If a server fails to start or connect, Claude Desktop will log an error, but it will continue to function with the servers that did connect successfully.
Verifying the Integration
To verify that your server is connected and working, start a conversation with Claude and ask it to use one of your tools. For example, you might say "Please calculate 15 plus 27 using the calculator tool." If everything is configured correctly, Claude will recognize that it has access to a calculator tool, call it with the appropriate parameters, and include the result in its response.
You can also ask Claude "What tools do you have access to?" and it should list all the tools from all connected MCP servers, including your custom tools. This is a quick way to verify that the connection is working.
If Claude does not seem to have access to your tools, check the Claude Desktop logs for error messages. On macOS, you can view logs by opening Console.app and filtering for Claude. On Windows, logs are typically in the application data directory. The logs will show any errors that occurred while trying to connect to your servers.
Understanding Tool Selection
When you ask Claude to perform a task, it analyzes your request and determines which tools, if any, would be helpful. It considers the tool descriptions you provided in your list_tools handler. This is why clear, descriptive tool descriptions are important. They help Claude understand when each tool is appropriate.
Claude does not automatically use tools for every request. It only uses them when it determines they would provide value. For example, if you ask "What is 2 plus 2?" Claude will likely answer directly without using a calculator tool because this is a simple calculation it can perform mentally. But if you ask "What is the square root of 987654321?" it might use a calculator tool because this is a complex calculation.
You can explicitly request that Claude use a specific tool by mentioning it by name in your request. For example, "Use the calculator tool to add 15 and 27" will ensure Claude uses your tool rather than calculating the answer itself.
Chapter Seven: Production Deployment Strategies
Deploying MCP servers to production requires careful consideration of infrastructure, security, monitoring, and scalability. This chapter covers various deployment strategies and best practices for running MCP servers in production environments.
Deployment with Docker
Docker provides a consistent, reproducible environment for running applications. It packages your server code along with all its dependencies into a container image that can run anywhere Docker is supported.
Create a file named Dockerfile in your mcp-server directory with the following content:
FROM python:3.11-slim
WORKDIR /app
# curl is needed so the container health check can probe the /health endpoint
RUN apt-get update && apt-get install -y --no-install-recommends curl && rm -rf /var/lib/apt/lists/*
RUN pip install uv
COPY pyproject.toml .
COPY server_remote.py .
RUN uv sync
EXPOSE 8000
CMD ["uv", "run", "uvicorn", "server_remote:app", "--host", "0.0.0.0", "--port", "8000"]
This Dockerfile starts from a minimal Python 3.11 image, which keeps the container size small. It sets the working directory to /app, which is where all application files will be located. It installs the uv package manager using pip. It copies the project configuration and server code into the container. It runs uv sync to install all dependencies specified in pyproject.toml. It exposes port 8000, which is where the server will listen. Finally, it specifies the command to run when the container starts.
Create a docker-compose.yml file to make it easier to run the container:
version: '3.8'
services:
  mcp-server:
    build: .
    ports:
      - "8000:8000"
    environment:
      - LOG_LEVEL=info
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "-H", "X-API-Key: dev-key-12345", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
This Docker Compose configuration defines a service named mcp-server that builds from the Dockerfile in the current directory. It maps port 8000 from the container to port 8000 on the host, making the server accessible. It sets an environment variable for the log level. It configures the container to restart automatically unless explicitly stopped. It includes a health check that calls the server's health endpoint every 30 seconds to verify it is running correctly.
To build and run the container, execute the command docker-compose up -d. The -d flag runs the container in detached mode, meaning it runs in the background. You can view the logs with docker-compose logs -f. You can stop the container with docker-compose down.
Deployment with Kubernetes
Kubernetes provides orchestration for containerized applications, including features like automatic scaling, rolling updates, and self-healing. For production deployments that need high availability and scalability, Kubernetes is an excellent choice.
Create a file named k8s-deployment.yaml with the following content:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
  labels:
    app: mcp-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mcp-server
  template:
    metadata:
      labels:
        app: mcp-server
    spec:
      containers:
        - name: mcp-server
          image: your-registry/mcp-server:latest
          ports:
            - containerPort: 8000
          env:
            - name: LOG_LEVEL
              value: "info"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
              httpHeaders:
                - name: X-API-Key
                  value: "dev-key-12345"
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8000
              httpHeaders:
                - name: X-API-Key
                  value: "dev-key-12345"
            initialDelaySeconds: 5
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-server-service
spec:
  selector:
    app: mcp-server
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
  type: LoadBalancer
This Kubernetes configuration creates a Deployment with three replicas of your MCP server, providing redundancy and load distribution. Each pod runs a container from your server image. The configuration includes liveness and readiness probes that check the health endpoint to ensure pods are functioning correctly. If a pod fails the liveness probe, Kubernetes will restart it. If it fails the readiness probe, Kubernetes will stop sending traffic to it until it recovers.
The Service configuration creates a load balancer that distributes incoming traffic across all healthy pods. The service listens on port 80 and forwards traffic to port 8000 on the pods.
To deploy this configuration, first build your Docker image and push it to a container registry. Then apply the Kubernetes configuration with the command kubectl apply -f k8s-deployment.yaml. You can check the status of your deployment with kubectl get deployments and kubectl get pods. You can view logs from all pods with kubectl logs -l app=mcp-server.
Deployment to Cloud Platforms
Major cloud providers offer managed services that simplify deployment and operation of web applications. These services handle infrastructure management, scaling, and monitoring, allowing you to focus on your application code.
For AWS Elastic Beanstalk, create a file named Procfile with the following content:
web: uvicorn server_remote:app --host 0.0.0.0 --port 8000
Then deploy with the commands eb init -p python-3.11 mcp-server followed by eb create mcp-server-env and eb deploy. Elastic Beanstalk will automatically provision the necessary infrastructure, deploy your application, and provide a URL where it can be accessed.
For Google Cloud Run, you can deploy directly from your source code with the command gcloud run deploy mcp-server --source . --platform managed --region us-central1 --allow-unauthenticated. Cloud Run will build a container from your code, deploy it, and provide a URL. Cloud Run automatically scales based on incoming traffic, even scaling to zero when there are no requests.
For Azure Container Instances, first build and push your Docker image to Azure Container Registry. Then create a container instance with the command az container create --resource-group myResourceGroup --name mcp-server --image your-registry/mcp-server:latest --dns-name-label mcp-server --ports 8000. Azure will provision the container and provide a fully qualified domain name where it can be accessed.
Adding HTTPS with Nginx Reverse Proxy
In production, you should always use HTTPS to encrypt communication between clients and your server. One common approach is to run an Nginx reverse proxy in front of your MCP server. Nginx handles TLS termination and forwards requests to your server over HTTP on the local network.
Create a file named nginx.conf with the following content:
server {
    listen 80;
    server_name mcp.yourcompany.com;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name mcp.yourcompany.com;

    ssl_certificate /etc/ssl/certs/your-cert.pem;
    ssl_certificate_key /etc/ssl/private/your-key.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    location / {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Connection '';
        proxy_http_version 1.1;
        chunked_transfer_encoding off;
        proxy_buffering off;
        proxy_cache off;
    }
}
This configuration sets up two server blocks. The first listens on port 80 for HTTP requests and redirects them to HTTPS, ensuring all communication is encrypted. The second listens on port 443 for HTTPS requests, handles TLS termination using your certificate and private key, and forwards requests to your MCP server running on localhost:8000.
The proxy settings are important for Server-Sent Events to work correctly. The Connection header is set to empty to prevent Nginx from adding its own connection management. HTTP version 1.1 is required for SSE. Chunked transfer encoding, buffering, and caching are disabled to ensure events are delivered immediately.
You can obtain free TLS certificates from Let's Encrypt using Certbot. Install Certbot for your operating system, then run certbot --nginx -d mcp.yourcompany.com to automatically obtain a certificate and configure Nginx to use it.
Environment-Specific Configuration
Production servers should use environment variables for configuration rather than hardcoding values. This allows you to use different settings in development, staging, and production environments without changing code.
Modify your server_remote.py to read configuration from environment variables:
import logging
import os

API_KEYS = {}
for key, value in os.environ.items():
    if key.startswith("API_KEY_"):
        env_name = key.replace("API_KEY_", "").lower()
        API_KEYS[value] = env_name

LOG_LEVEL = os.environ.get("LOG_LEVEL", "info")
HOST = os.environ.get("HOST", "0.0.0.0")
PORT = int(os.environ.get("PORT", "8000"))

logging.basicConfig(
    level=getattr(logging, LOG_LEVEL.upper()),
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
Then in your deployment environment, set these variables. For Docker, you can use an environment file. Create a file named .env:
API_KEY_DEV=dev-key-12345
API_KEY_PROD=prod-key-67890
LOG_LEVEL=info
HOST=0.0.0.0
PORT=8000
Update your docker-compose.yml to use this file:
version: '3.8'
services:
  mcp-server:
    build: .
    ports:
      - "8000:8000"
    env_file:
      - .env
    restart: unless-stopped
For Kubernetes, create a ConfigMap and Secret to store configuration:
apiVersion: v1
kind: ConfigMap
metadata:
  name: mcp-server-config
data:
  LOG_LEVEL: "info"
  HOST: "0.0.0.0"
  PORT: "8000"
---
apiVersion: v1
kind: Secret
metadata:
  name: mcp-server-secrets
type: Opaque
stringData:
  API_KEY_DEV: "dev-key-12345"
  API_KEY_PROD: "prod-key-67890"
Reference these in your Deployment:
spec:
  containers:
    - name: mcp-server
      image: your-registry/mcp-server:latest
      envFrom:
        - configMapRef:
            name: mcp-server-config
        - secretRef:
            name: mcp-server-secrets
This approach keeps sensitive information like API keys out of your code and container images, improving security.
Monitoring and Observability
Production servers need comprehensive monitoring to detect and diagnose issues. At minimum, you should monitor server health, request rates, error rates, and response times.
Add Prometheus metrics to your server by installing the prometheus-fastapi-instrumentator package with uv add prometheus-fastapi-instrumentator. Then add this code to server_remote.py:
from prometheus_fastapi_instrumentator import Instrumentator
instrumentator = Instrumentator()
instrumentator.instrument(app).expose(app)
This automatically adds metrics endpoints and instruments your FastAPI application to track request counts, durations, and other useful metrics. The metrics are exposed at the /metrics endpoint in Prometheus format.
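As a rough illustration, fetching /metrics returns plain-text Prometheus exposition data along these lines; exact metric names and labels depend on the instrumentator version, and the values shown are invented:

```
# HELP http_requests_total Total number of requests by method, status and handler.
# TYPE http_requests_total counter
http_requests_total{handler="/message",method="POST",status="2xx"} 42.0
```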
Add structured logging to make logs easier to parse and analyze. Install python-json-logger with uv add python-json-logger and configure it:
from pythonjsonlogger import jsonlogger
logHandler = logging.StreamHandler()
formatter = jsonlogger.JsonFormatter()
logHandler.setFormatter(formatter)
logger.addHandler(logHandler)
This outputs logs in JSON format, which can be easily ingested by log aggregation systems like Elasticsearch, Splunk, or CloudWatch Logs.
Add distributed tracing to track requests across multiple services. Install OpenTelemetry with uv add opentelemetry-api opentelemetry-sdk opentelemetry-instrumentation-fastapi and configure it:
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(ConsoleSpanExporter())
)
FastAPIInstrumentor.instrument_app(app)
This creates traces for all incoming requests, allowing you to see exactly how long each operation takes and where time is being spent.
Chapter Eight: Advanced Security Practices
Security is paramount when deploying MCP servers, especially when they are accessible over the network. This chapter covers advanced security practices that go beyond basic authentication.
Implementing OAuth 2.1 Authentication
While API keys are simple, OAuth 2.1 provides more robust authentication with features like token expiration, refresh tokens, and fine-grained scopes. Install the required packages with uv add "python-jose[cryptography]" "passlib[bcrypt]" python-multipart. The extras pull in the cryptography and bcrypt backends that python-jose and passlib rely on.
Create a new file named auth.py:
from datetime import datetime, timedelta
from typing import Optional

from jose import JWTError, jwt
from passlib.context import CryptContext
from fastapi import Depends, HTTPException, status
from fastapi.security import OAuth2PasswordBearer

SECRET_KEY = "your-secret-key-here-change-in-production"
ALGORITHM = "HS256"
ACCESS_TOKEN_EXPIRE_MINUTES = 30

pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")

def verify_password(plain_password: str, hashed_password: str) -> bool:
    """Verify a password against its hash"""
    return pwd_context.verify(plain_password, hashed_password)

def get_password_hash(password: str) -> str:
    """Hash a password"""
    return pwd_context.hash(password)

def create_access_token(data: dict, expires_delta: Optional[timedelta] = None) -> str:
    """Create a JWT access token"""
    to_encode = data.copy()
    if expires_delta:
        expire = datetime.utcnow() + expires_delta
    else:
        expire = datetime.utcnow() + timedelta(minutes=15)
    to_encode.update({"exp": expire})
    encoded_jwt = jwt.encode(to_encode, SECRET_KEY, algorithm=ALGORITHM)
    return encoded_jwt

async def get_current_user(token: str = Depends(oauth2_scheme)) -> dict:
    """Validate token and return current user"""
    credentials_exception = HTTPException(
        status_code=status.HTTP_401_UNAUTHORIZED,
        detail="Could not validate credentials",
        headers={"WWW-Authenticate": "Bearer"},
    )
    try:
        payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
        username: str = payload.get("sub")
        if username is None:
            raise credentials_exception
        return {"username": username}
    except JWTError:
        raise credentials_exception
This module provides functions for password hashing, token creation, and token validation. The create_access_token function creates a JWT that expires after a specified time. The get_current_user function validates incoming tokens and extracts user information.
Add a token endpoint to server_remote.py:
from fastapi import Depends, HTTPException, status
from fastapi.security import OAuth2PasswordRequestForm
from auth import create_access_token, verify_password, get_current_user

USERS_DB = {
    "alice": {
        "username": "alice",
        "hashed_password": "$2b$12$EixZaYVK1fsbw1ZfbX3OXePaWxn96p36WQoeG6Lruj3vjPGga31lW",
        "scopes": ["tools:read", "tools:execute", "resources:read"]
    }
}

@app.post("/token")
async def login(form_data: OAuth2PasswordRequestForm = Depends()):
    """OAuth2 token endpoint"""
    user = USERS_DB.get(form_data.username)
    if not user or not verify_password(form_data.password, user["hashed_password"]):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Incorrect username or password",
            headers={"WWW-Authenticate": "Bearer"},
        )
    access_token = create_access_token(
        data={"sub": user["username"], "scopes": user["scopes"]}
    )
    return {"access_token": access_token, "token_type": "bearer"}
This endpoint accepts username and password credentials, validates them, and returns a JWT access token. Clients can then include this token in the Authorization header for subsequent requests.
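To demystify what the jose library is doing here, the following is a minimal, standard-library-only sketch of HS256 signing and verification, producing the same header.payload.signature structure that create_access_token returns. The helper names and the demo-secret key are invented for illustration; in the server itself you would keep using python-jose rather than hand-rolling this:

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_hs256(payload: dict, secret: str) -> str:
    # Serialize header and payload, then sign "header.payload" with HMAC-SHA256
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signature = b64url(hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest())
    return f"{header}.{body}.{signature}"

def verify_hs256(token: str, secret: str) -> dict:
    # Recompute the signature, compare in constant time, then check expiry
    header, body, signature = token.split(".")
    expected = b64url(hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(signature, expected):
        raise ValueError("Invalid signature")
    payload = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    if payload.get("exp", 0) < time.time():
        raise ValueError("Token expired")
    return payload

token = sign_hs256({"sub": "alice", "exp": int(time.time()) + 1800}, "demo-secret")
claims = verify_hs256(token, "demo-secret")
```

The key point is that the token is not encrypted, only signed: anyone can read the claims, but nobody without SECRET_KEY can forge or alter them.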
Update your route handlers to require authentication:
@app.post("/message")
async def handle_message(
    request: Request,
    current_user: dict = Depends(get_current_user)
):
    """Handle JSON-RPC messages with OAuth authentication"""
    logger.info(f"Message from user: {current_user['username']}")
    # ... rest of the handler
Implementing Rate Limiting
Rate limiting prevents abuse by restricting how many requests a client can make within a time window. Install slowapi with uv add slowapi.
Add rate limiting to server_remote.py:
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.util import get_remote_address
from slowapi.errors import RateLimitExceeded

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/message")
@limiter.limit("10/minute")
async def handle_message(
    request: Request,
    current_user: dict = Depends(get_current_user)
):
    """Handle messages with rate limiting"""
    # ... handler code
This limits each IP address to 10 requests per minute. You can customize the rate limit based on your needs. For authenticated endpoints, you might want to rate limit based on user identity rather than IP address:
def get_user_identifier(request: Request) -> str:
    """Get user identifier for rate limiting"""
    user = request.state.user if hasattr(request.state, "user") else None
    if user:
        return user.get("username", get_remote_address(request))
    return get_remote_address(request)

limiter = Limiter(key_func=get_user_identifier)
Input Sanitization and Validation
Beyond Pydantic validation, you should sanitize inputs to prevent injection attacks. For file paths, use pathlib to normalize and validate paths:
from pathlib import Path

ALLOWED_BASE_PATH = Path("/safe/directory")

def validate_file_path(path_str: str) -> Path:
    """Validate and sanitize file path"""
    try:
        requested_path = Path(path_str).resolve()
        if not requested_path.is_relative_to(ALLOWED_BASE_PATH):
            raise ValueError(f"Access denied: path must be within {ALLOWED_BASE_PATH}")
        if not requested_path.exists():
            raise ValueError(f"Path does not exist: {path_str}")
        if not requested_path.is_file():
            raise ValueError(f"Path is not a file: {path_str}")
        return requested_path
    except Exception as e:
        raise ValueError(f"Invalid path: {str(e)}")
Use this function in your file reading tool:
async def read_file_tool(path: str) -> str:
    """Read a file with strict path validation"""
    try:
        validated_path = validate_file_path(path)
        with open(validated_path, 'r', encoding='utf-8') as f:
            content = f.read()
        return f"File contents of '{path}':\n\n{content}"
    except ValueError as e:
        return f"Error: {str(e)}"
    except Exception as e:
        logger.error(f"Error reading file: {e}", exc_info=True)
        return f"Error reading file: {str(e)}"
For database queries, always use parameterized queries to prevent SQL injection. Never concatenate user input directly into SQL strings:
# WRONG - vulnerable to SQL injection
query = f"SELECT * FROM users WHERE username = '{username}'"
# CORRECT - uses parameterized query
query = "SELECT * FROM users WHERE username = ?"
cursor.execute(query, (username,))
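The effect of parameterization can be verified end to end with Python's built-in sqlite3 module. The table and the hostile input below are invented for illustration; the placeholder ensures the injection attempt is compared as a literal username rather than executed as SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "admin"))

# With a ? placeholder, hostile input is treated as data, not SQL
hostile = "alice' OR '1'='1"
attack_rows = conn.execute(
    "SELECT role FROM users WHERE username = ?", (hostile,)
).fetchall()

normal_rows = conn.execute(
    "SELECT role FROM users WHERE username = ?", ("alice",)
).fetchall()
```

Here attack_rows comes back empty because no user is literally named "alice' OR '1'='1", while the same string concatenated into the query text would have matched every row.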
Implementing Audit Logging
Comprehensive audit logging tracks who did what and when, which is essential for security monitoring and compliance. Create a structured logging system that captures all important events:
import json
from datetime import datetime

def audit_log(event_type: str, user: str, details: dict):
    """Log an audit event"""
    event = {
        "timestamp": datetime.utcnow().isoformat(),
        "event_type": event_type,
        "user": user,
        "details": details
    }
    logger.info(f"AUDIT: {json.dumps(event)}")
Use this function throughout your server:
@app.post("/message")
async def handle_message(
    request: Request,
    current_user: dict = Depends(get_current_user)
):
    """Handle messages with audit logging"""
    message = await request.json()
    method = message.get("method")
    audit_log(
        event_type="mcp_request",
        user=current_user["username"],
        details={
            "method": method,
            "params": message.get("params", {})
        }
    )
    # ... handle the request
    audit_log(
        event_type="mcp_response",
        user=current_user["username"],
        details={
            "method": method,
            "success": True
        }
    )
Store audit logs in a secure, tamper-proof location. Consider sending them to a dedicated logging service or SIEM system for analysis and long-term retention.
Network Security
Use network-level security controls to restrict access to your MCP server. If the server should only be accessible from specific IP addresses or networks, configure firewall rules to enforce this. For cloud deployments, use security groups or network policies to limit access.
For Kubernetes, create a NetworkPolicy that restricts which pods can communicate with your MCP server:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mcp-server-policy
spec:
  podSelector:
    matchLabels:
      app: mcp-server
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: mcp-client
      ports:
        - protocol: TCP
          port: 8000
This policy only allows pods with the label role: mcp-client to connect to your MCP server pods on port 8000.
For additional security, consider implementing mutual TLS, where both the client and server authenticate each other using certificates. This provides strong authentication and encryption without relying on passwords or API keys.
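In Nginx, mutual TLS can be layered onto the HTTPS server block from earlier with two additional directives inside the server block; the CA bundle path below is a placeholder for the certificate authority that issues your client certificates:

```
# Require clients to present a certificate signed by this CA
ssl_client_certificate /etc/ssl/certs/client-ca.pem;
ssl_verify_client on;
```

With ssl_verify_client on, Nginx rejects any connection whose client certificate is missing or not signed by the configured CA before the request ever reaches your MCP server.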
Chapter Nine: Testing and Quality Assurance
Thorough testing ensures your MCP server works correctly and continues to work as you make changes. This chapter covers unit testing, integration testing, and end-to-end testing strategies.
Unit Testing Tool Functions
Unit tests verify that individual functions work correctly in isolation. Create a file named test_server.py:
import pytest
from server import calculator_tool, web_search_tool, read_file_tool

@pytest.mark.asyncio
async def test_calculator_add():
    """Test calculator addition"""
    result = await calculator_tool("add", 10, 5)
    assert "15" in result
    assert "Result:" in result

@pytest.mark.asyncio
async def test_calculator_subtract():
    """Test calculator subtraction"""
    result = await calculator_tool("subtract", 10, 5)
    assert "5" in result

@pytest.mark.asyncio
async def test_calculator_multiply():
    """Test calculator multiplication"""
    result = await calculator_tool("multiply", 10, 5)
    assert "50" in result

@pytest.mark.asyncio
async def test_calculator_divide():
    """Test calculator division"""
    result = await calculator_tool("divide", 10, 5)
    assert "2" in result

@pytest.mark.asyncio
async def test_calculator_divide_by_zero():
    """Test calculator division by zero handling"""
    result = await calculator_tool("divide", 10, 0)
    assert "Error" in result
    assert "Division by zero" in result

@pytest.mark.asyncio
async def test_calculator_invalid_operation():
    """Test calculator with invalid operation"""
    result = await calculator_tool("modulo", 10, 5)
    assert "Error" in result
    assert "Unknown operation" in result

@pytest.mark.asyncio
async def test_web_search():
    """Test web search returns results"""
    result = await web_search_tool("test query", 3)
    assert "Result 1" in result
    assert "test query" in result
    assert "URL:" in result

@pytest.mark.asyncio
async def test_web_search_max_results():
    """Test web search respects max_results"""
    result = await web_search_tool("test", 2)
    assert "Result 1" in result
    assert "Result 2" in result

@pytest.mark.asyncio
async def test_read_file_not_found():
    """Test file reading with nonexistent file"""
    result = await read_file_tool("/nonexistent/file.txt")
    assert "Error" in result
    assert "not found" in result.lower()
Install pytest and pytest-asyncio with uv add --dev pytest pytest-asyncio. The --dev flag indicates these are development dependencies that should not be included in production deployments.
Run the tests with uv run pytest test_server.py. You should see output indicating which tests passed and failed. Aim for comprehensive test coverage that exercises both normal operation and error conditions.
Integration Testing
Integration tests verify that different components work together correctly. Create test_integration.py:
import pytest
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

@pytest.mark.asyncio
async def test_server_integration():
    """Test full client-server integration"""
    server_params = StdioServerParameters(
        command="uv",
        args=["run", "server.py"],
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            tools_result = await session.list_tools()
            assert len(tools_result.tools) == 3

            tool_names = [tool.name for tool in tools_result.tools]
            assert "calculator" in tool_names
            assert "web_search" in tool_names
            assert "read_file" in tool_names

            result = await session.call_tool(
                "calculator",
                arguments={"operation": "multiply", "a": 6, "b": 7}
            )
            assert len(result.content) > 0
            assert "42" in result.content[0].text

@pytest.mark.asyncio
async def test_tool_error_handling():
    """Test that tool errors are handled gracefully"""
    server_params = StdioServerParameters(
        command="uv",
        args=["run", "server.py"],
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "calculator",
                arguments={"operation": "divide", "a": 10, "b": 0}
            )
            assert len(result.content) > 0
            assert "Error" in result.content[0].text
These tests start an actual server process and communicate with it using the MCP client, verifying that the entire system works end-to-end.
Testing Remote Servers
For remote servers, create test_remote.py:
import pytest
import httpx
from client_remote import RemoteMCPClient

@pytest.mark.asyncio
async def test_remote_health_check():
    """Test remote server health check"""
    client = RemoteMCPClient(
        base_url="http://localhost:8000",
        api_key="dev-key-12345"
    )
    health = await client.health_check()
    assert health["status"] == "healthy"

@pytest.mark.asyncio
async def test_remote_list_tools():
    """Test listing tools from remote server"""
    client = RemoteMCPClient(
        base_url="http://localhost:8000",
        api_key="dev-key-12345"
    )
    tools = await client.list_tools()
    assert len(tools) == 3

    tool_names = [tool["name"] for tool in tools]
    assert "calculator" in tool_names
    assert "web_search" in tool_names
    assert "data_process" in tool_names

@pytest.mark.asyncio
async def test_remote_call_tool():
    """Test calling a tool on remote server"""
    client = RemoteMCPClient(
        base_url="http://localhost:8000",
        api_key="dev-key-12345"
    )
    result = await client.call_tool(
        "calculator",
        {"operation": "add", "a": 15, "b": 27}
    )
    assert "42" in result

@pytest.mark.asyncio
async def test_remote_authentication_required():
    """Test that authentication is required"""
    async with httpx.AsyncClient() as http_client:
        response = await http_client.get("http://localhost:8000/health")
        assert response.status_code == 401

@pytest.mark.asyncio
async def test_remote_invalid_api_key():
    """Test that invalid API keys are rejected"""
    async with httpx.AsyncClient() as http_client:
        response = await http_client.get(
            "http://localhost:8000/health",
            headers={"X-API-Key": "invalid-key"}
        )
        assert response.status_code == 403
Before running these tests, start your remote server in a separate terminal. Then run uv run pytest test_remote.py.
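Starting the server by hand works, but you can also let pytest manage it with a session-scoped fixture. The sketch below is an assumption-laden illustration: the server filename `server_remote.py` and the `dev-key-12345` key are placeholders for whatever names you used in earlier chapters, and the readiness probe uses only the standard library so it has no extra dependencies:

```python
import subprocess
import time
import urllib.error
import urllib.request

import pytest

def wait_until_healthy(url, headers, attempts=30, delay=0.5):
    """Poll an HTTP endpoint until it returns 200 or attempts run out."""
    for _ in range(attempts):
        try:
            req = urllib.request.Request(url, headers=headers)
            with urllib.request.urlopen(req, timeout=1.0) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not accepting connections yet
        time.sleep(delay)
    return False

@pytest.fixture(scope="session")
def remote_server():
    """Start the remote server once for the whole test session,
    then terminate it when the session ends."""
    proc = subprocess.Popen(["uv", "run", "server_remote.py"])
    ready = wait_until_healthy(
        "http://localhost:8000/health",
        headers={"X-API-Key": "dev-key-12345"},
    )
    if not ready:
        proc.terminate()
        pytest.fail("Remote server did not become healthy in time")
    yield
    proc.terminate()
    proc.wait(timeout=10)
```

To use it, add `remote_server` as a parameter to each test, for example `async def test_remote_health_check(remote_server):`, and pytest will start the server before the first test and tear it down afterwards.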
Continuous Integration
Set up continuous integration to run tests automatically whenever you push code changes. Create a file named .github/workflows/test.yml for GitHub Actions:
name: Test

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install uv
        run: pip install uv
      - name: Install dependencies
        run: uv sync
      - name: Run unit tests
        run: uv run pytest test_server.py -v
      - name: Run integration tests
        run: uv run pytest test_integration.py -v
This workflow runs on every push to the main branch and on every pull request. It sets up Python, installs dependencies, and runs your test suite. If any tests fail, the workflow fails, alerting you to the problem.
Chapter Ten: Summary and Next Steps
Congratulations on completing this comprehensive tutorial on the Model Context Protocol. You have learned how to build both local and remote MCP servers, create clients to interact with them, integrate with Claude Desktop, deploy to production environments, implement security best practices, and test your implementations thoroughly.
What You Have Accomplished
You now understand what the Model Context Protocol is and why it matters for AI application development. MCP provides a standardized way for AI systems to interact with external tools and data sources, enabling capabilities that would otherwise require custom integration work for every AI application.
You have built a local MCP server using STDIO transport that exposes tools for calculation, web search, and file reading. This server can run on your local machine and be used by desktop applications like Claude Desktop.
You have built a remote MCP server using HTTP transport with Server-Sent Events that can be accessed over the network. This server includes authentication, logging, and multiple tools including a data processing capability that demonstrates handling complex data structures.
You have created a Python client that can connect to remote MCP servers, list available tools and resources, execute tools with parameters, and handle errors gracefully. This client demonstrates how to interact with MCP servers programmatically.
You have configured Claude Desktop to connect to both local and remote servers, allowing you to use your custom tools in natural language conversations with Claude.
You have learned deployment strategies including Docker containerization, Kubernetes orchestration, and cloud platform deployment. You understand how to add HTTPS using Nginx reverse proxy and how to manage configuration using environment variables.
You have implemented security best practices including OAuth authentication, rate limiting, input validation, audit logging, and network security controls. These practices ensure your servers are safe to deploy in production environments.
You have created comprehensive test suites including unit tests for individual functions, integration tests for full client-server communication, and tests for remote server functionality. You understand how to set up continuous integration to run tests automatically.
Extending Your MCP Servers
Now that you understand the fundamentals, you can extend your servers with capabilities specific to your needs. Consider adding tools that connect to databases using libraries like SQLAlchemy or psycopg2. You could create tools that query your company's database and return results formatted for AI consumption.
Consider adding tools that integrate with external APIs. You could connect to weather services, stock market data providers, translation services, or any other API that provides useful information. The web search tool in this tutorial is a template you can adapt for real API integration.
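When adapting the template for a real API, it helps to inject the HTTP call as a parameter so the tool stays testable without network access. The sketch below is entirely hypothetical (the endpoint URL and response fields are made up); in production the `fetch` argument would wrap something like an `httpx` request:

```python
import asyncio
import json

async def weather_tool(city: str, fetch=None) -> str:
    """Hypothetical weather tool. `fetch` is an injected async callable
    that takes a URL and returns the response body, so tests can stub
    the network call; a real implementation would default to httpx."""
    if fetch is None:
        return "Error: no HTTP client configured in this sketch"
    # api.example.com is a placeholder, not a real weather service
    payload = await fetch(f"https://api.example.com/weather?q={city}")
    data = json.loads(payload)
    return f"{city}: {data['temp_c']}°C, {data['conditions']}"

# Stub standing in for the real network call during tests
async def fake_fetch(url: str) -> str:
    return json.dumps({"temp_c": 21, "conditions": "partly cloudy"})

print(asyncio.run(weather_tool("Lisbon", fetch=fake_fetch)))
# → Lisbon: 21°C, partly cloudy
```

This dependency-injection pattern also pays off in the unit tests from Chapter Nine: the same tool function runs against the stub locally and against the real API in production.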
Consider adding tools that perform complex computations or data analysis. You could integrate with pandas for data manipulation, scikit-learn for machine learning, or matplotlib for visualization generation. These tools could accept datasets as input and return analysis results or generated charts.
Consider adding resources that provide reference information. You could expose API documentation, configuration files, code examples, or knowledge base articles as resources that the AI can read and reference when answering questions.
Consider adding prompts that help structure specific workflows. You could create prompts for code review, documentation generation, data analysis, or any other task that benefits from a structured approach.
Exploring Advanced Features
The MCP specification includes several advanced features we did not cover in depth in this tutorial. The sampling capability allows servers to request LLM completions from the host, enabling recursive workflows where tools can ask the AI for help with subtasks. This opens up powerful possibilities for complex multi-step operations.
The tasks feature provides support for long-running asynchronous operations. Instead of blocking while waiting for a slow operation to complete, your tool can return a task identifier that the client can use to check status and retrieve results later. This is essential for operations that might take minutes or hours.
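The idea behind the pattern can be sketched in a few lines. Note that this illustrates only the general start/poll/fetch shape, not the MCP tasks wire format, and all names here are my own:

```python
import asyncio
import uuid

class TaskRegistry:
    """Minimal sketch of the long-running-task pattern: a slow tool
    returns a task id immediately; the caller polls for status and
    fetches the result once the work is done."""

    def __init__(self):
        self.tasks = {}

    def start(self, coro) -> str:
        task_id = str(uuid.uuid4())
        self.tasks[task_id] = asyncio.ensure_future(coro)
        return task_id

    def status(self, task_id) -> str:
        return "completed" if self.tasks[task_id].done() else "working"

    def result(self, task_id):
        task = self.tasks[task_id]
        return task.result() if task.done() else None

async def slow_report():
    await asyncio.sleep(0.1)  # stands in for a minutes-long job
    return "report ready"

async def main():
    registry = TaskRegistry()
    task_id = registry.start(slow_report())
    print(registry.status(task_id))   # "working" right after starting
    await asyncio.sleep(0.2)
    print(registry.status(task_id))   # "completed"
    print(registry.result(task_id))   # "report ready"

asyncio.run(main())
```

The caller gets the task id back immediately and is never blocked, which is exactly the property you want for operations that take minutes or hours.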
The incremental scope consent feature allows fine-grained permission management where users can approve specific capabilities as they are needed rather than granting blanket access upfront. This improves security by following the principle of least privilege.
The icons metadata feature allows you to provide visual representations of your tools, resources, and prompts. This enhances user interfaces by making it easier to identify capabilities at a glance.
Building for Your Organization
If you work at a large organization, consider how MCP servers could provide AI access to internal systems and data. You could build servers that connect to internal databases, expose internal APIs, access document repositories, or integrate with business intelligence tools.
Think about the workflows that knowledge workers in your organization perform regularly. Could any of these be enhanced by AI assistance? Could an MCP server provide the AI with access to the information and capabilities needed to help with these workflows?
Consider building a library of MCP servers that different teams can use. A database team might build servers for data access. A DevOps team might build servers for infrastructure management. A documentation team might build servers for accessing and searching documentation. Each team contributes servers that expose their domain expertise to AI systems.
Think about governance and security requirements. How will you manage authentication and authorization? How will you ensure compliance with data protection regulations? How will you audit AI access to sensitive systems? The security practices covered in this tutorial provide a foundation, but you may need additional controls for your specific environment.
Staying Current with MCP
The Model Context Protocol is actively evolving. The specification is updated periodically with new features and clarifications. The Python and TypeScript SDKs are regularly updated to support new specification versions and fix bugs.
Follow the official MCP GitHub repository to stay informed about updates. The repository includes the specification, SDK implementations, example servers, and documentation. Issues and pull requests provide insight into what features are being developed and what problems are being addressed.
Join the MCP community to learn from other developers and share your own experiences. Community forums, Discord servers, and other discussion platforms provide opportunities to ask questions, share solutions, and collaborate on common challenges.
Experiment with new features as they become available. The best way to understand new capabilities is to try them in your own projects. Build small proof-of-concept servers that explore new features before incorporating them into production systems.
Final Thoughts
The Model Context Protocol represents an important step toward making AI systems more capable and useful by giving them safe, structured access to external information and capabilities. By understanding how to build MCP servers and clients, you are well-positioned to create AI-enhanced applications that can interact with the real world.
The skills you have learned in this tutorial are broadly applicable. The patterns for building servers, handling authentication, validating inputs, and testing functionality apply to many types of applications beyond MCP. The experience of working with asynchronous Python, FastAPI, and modern deployment practices will serve you well in many contexts.
Most importantly, you now have the knowledge to experiment and innovate. The examples in this tutorial are starting points, not limitations. Use them as templates to build servers that solve real problems in your domain. Share what you build with others. Contribute back to the MCP ecosystem. Help shape the future of how AI systems interact with the world.
Thank you for working through this tutorial. I hope it has been valuable and that you feel confident building with the Model Context Protocol. Happy building!