Exposing an AI Agent as an MCP Server with MAF — Part 1: The Stdio Approach

What is MCP and Why Does It Matter?

The Model Context Protocol (MCP) is an open standard that defines how applications provide tools and contextual data to large language models. Think of it as a universal plug: once your agent speaks MCP, any MCP-compatible client can connect to it and use it as a tool. That includes GitHub Copilot in VS Code, Claude Desktop, other agents, or your own custom client.

Before MCP, every AI framework had its own proprietary way of wiring tools to models. MCP standardises this so that an agent built with the Microsoft Agent Framework (MAF) can be used as a tool by a LangChain agent, a VS Code extension can call your custom ServiceNow agent, and multiple clients can share one agent server without any custom integration code.

The protocol has two sides. An MCP Server exposes tools and handles requests — this is what agent.as_mcp_server() creates. An MCP Client discovers those tools and calls them — this is MCPStdioTool or MCPStreamableHTTPTool in MAF.
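
To make the two roles concrete, here is a rough sketch of the JSON-RPC 2.0 messages a client sends to discover and call tools. The method names `tools/list` and `tools/call` come from the MCP specification; the helper `make_request`, the request ids, and the argument name `task` are illustrative assumptions, not part of MAF's API.

```python
import json
from typing import Optional


def make_request(request_id: int, method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope of the kind MCP uses."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# Client -> server: ask which tools the server exposes.
list_tools = make_request(1, "tools/list")

# Client -> server: invoke one tool by name with JSON arguments.
# The argument name "task" is a placeholder chosen for illustration.
call_tool = make_request(
    2,
    "tools/call",
    {"name": "ServiceNowAgent", "arguments": {"task": "Search for VPN incidents"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

In practice MCPStdioTool and the MCP SDK build and parse these messages for you; the sketch only shows the shape of what travels between the two sides.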

This is a three-part series. Part 1 covers the Stdio approach for local development. Part 2 covers the HTTP/SSE approach for running client and server as fully independent processes. Part 3 compares as_mcp_server() against a raw MCP SDK implementation to explain the architectural difference.

The Example: A ServiceNow Incident Agent

Throughout this series we use the same ServiceNow agent with three tools — create, update, and search incidents. The tools are defined using @ai_function with Annotated type hints that embed parameter descriptions directly in the code:

from typing import Annotated
from agent_framework import ai_function

@ai_function
def create_incident(
    short_description: Annotated[str, "Brief summary of the issue (5–160 chars)."],
    urgency: Annotated[str, "Urgency: 1=High, 2=Medium, 3=Low."] = "2",
) -> Annotated[str, "Confirmation with incident number and details."]:
    """Creates a new incident in ServiceNow."""
    return (
        f"Incident created.\n"
        f"  Number:      INC0012345\n"
        f"  Description: {short_description}\n"
        f"  Urgency:     {urgency}\n"
        f"  Status:      New"
    )

@ai_function
def update_incident(
    incident_number: Annotated[str, "Incident ID e.g. INC0012345."],
    state: Annotated[str, "New state: In Progress | Resolved | Closed."],
    notes: Annotated[str, "Work notes to append (max 4000 chars)."] = "",
) -> Annotated[str, "Confirmation with incident number, new state, and notes."]:
    """Updates the state and notes of an existing ServiceNow incident."""
    return (
        f"Incident updated.\n"
        f"  Number: {incident_number}\n"
        f"  State:  {state}\n"
        f"  Notes:  {notes or 'None added'}"
    )

@ai_function
def search_incident(
    query: Annotated[str, "Keyword, number, or phrase to search (1–200 chars)."],
) -> Annotated[str, "Matching incidents with number, description, state, and urgency."]:
    """Searches ServiceNow incidents by keyword or number."""
    return (
        f"Results for '{query}':\n"
        f"  1. INC0012345 — Login issue       — New         — High\n"
        f"  2. INC0012346 — VPN not working   — In Progress — Medium\n"
        f"  3. INC0012347 — Printer offline   — Resolved    — Low"
    )
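
To see what the Annotated hints actually carry, the following standard-library sketch reads the description strings back out of a function signature. It is only an illustration of the mechanism, not MAF's implementation; `describe_parameters` and `create_incident_plain` are hypothetical names, and the function is deliberately left undecorated so the example runs without the framework installed.

```python
from typing import Annotated, get_args, get_origin, get_type_hints


def describe_parameters(func) -> dict:
    """Return {name: description} for every Annotated hint on func.

    The return annotation appears under the key "return".
    """
    hints = get_type_hints(func, include_extras=True)
    descriptions = {}
    for name, hint in hints.items():
        if get_origin(hint) is Annotated:
            _base_type, *metadata = get_args(hint)
            # Use the first string in the metadata as the description.
            descriptions[name] = next(
                (item for item in metadata if isinstance(item, str)), ""
            )
    return descriptions


def create_incident_plain(
    short_description: Annotated[str, "Brief summary of the issue."],
    urgency: Annotated[str, "Urgency: 1=High, 2=Medium, 3=Low."] = "2",
) -> Annotated[str, "Confirmation with incident number and details."]:
    """Undecorated stand-in for the create_incident tool."""
    return f"Incident created: {short_description} (urgency {urgency})"


print(describe_parameters(create_incident_plain))
```

A framework can turn metadata like this into the parameter descriptions a model sees in a tool schema, which is why the descriptions are worth writing carefully.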

The shared Azure OpenAI client helper is used in every example across all three parts:

import os
from openai import AsyncAzureOpenAI
from agent_framework.openai import OpenAIChatClient

def make_client() -> OpenAIChatClient:
    return OpenAIChatClient(
        model_id=os.environ.get("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4o"),
        async_client=AsyncAzureOpenAI(
            azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
            api_key=os.environ["AZURE_OPENAI_API_KEY"],
            api_version="2024-12-01-preview",
        ),
    )

What `agent.as_mcp_server()` Actually Does

Before diving into the transport approaches, it is worth understanding what this method does under the hood.

When you call agent.as_mcp_server(), MAF wraps your ChatAgent in an MCP-compliant server object. That server advertises the agent’s name and description as MCP metadata, and exposes a single tool that represents the entire agent. Clients call this tool with a natural language query. Internally, the agent’s LLM receives that query, reasons about it, calls whichever of its underlying tools (create_incident, search_incident, etc.) are needed, and returns the result.

The key insight is that the entire agent becomes one MCP tool. From the outside, a client sees a single callable named ServiceNowAgent. Internally, that callable is a full LLM-powered reasoning loop. This is architecturally different from a standard MCP server — Part 3 goes deep on exactly how and why.
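
The idea of "one tool on the outside, several tools on the inside" can be sketched without any framework. In this toy version a keyword check stands in for the LLM's reasoning, and every name (`ToyAgentTool`, `handle`, the two stub functions) is hypothetical; it is not how MAF implements as_mcp_server().

```python
from typing import Callable, Dict, List


def create_incident_stub(text: str) -> str:
    """Stand-in for the real create_incident tool."""
    return f"Incident created: {text}"


def search_incident_stub(text: str) -> str:
    """Stand-in for the real search_incident tool."""
    return f"Results for '{text}'"


class ToyAgentTool:
    """An agent presented to clients as a single tool (illustration only)."""

    def __init__(self, name: str, inner_tools: Dict[str, Callable[[str], str]]):
        self.name = name
        self._inner_tools = inner_tools

    def list_tools(self) -> List[str]:
        # Clients see exactly one tool: the agent itself.
        return [self.name]

    def handle(self, query: str) -> str:
        # A real agent asks an LLM which inner tool to use; a keyword
        # check plays that role here so the example stays runnable.
        chosen = "search" if "search" in query.lower() else "create"
        return self._inner_tools[chosen](query)


agent_tool = ToyAgentTool(
    "ServiceNowAgent",
    {"create": create_incident_stub, "search": search_incident_stub},
)
print(agent_tool.list_tools())
print(agent_tool.handle("Search for VPN incidents"))
```

The inner tools never appear in `list_tools()`, which mirrors the point above: the client only ever addresses the agent as a whole.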

Approach 1: Stdio Transport

Stdio (standard input/output) transport means the server and client communicate through stdin and stdout streams. The client spawns the server as a subprocess and pipes messages back and forth. They run on the same machine but as separate processes. This is the same pattern used by popular MCP servers like the GitHub MCP server and the filesystem MCP server.
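
The mechanics can be sketched with the standard library alone. The example below spawns a tiny echo process and exchanges one newline-delimited JSON message with it over stdin and stdout, which is the framing the MCP stdio transport uses. The child process is a stand-in and not an MCP server, and `SERVER_SOURCE` and `round_trip` are hypothetical names.

```python
import json
import subprocess
import sys

# A tiny stand-in server: reads one JSON message per line from stdin and
# writes one JSON reply per line to stdout. It is NOT a real MCP server.
SERVER_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"jsonrpc": "2.0", "id": request["id"], "result": {"echo": request["method"]}}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


def round_trip(method: str) -> dict:
    """Spawn the server, send one request via stdin, read the reply from stdout."""
    process = subprocess.Popen(
        [sys.executable, "-c", SERVER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        request = {"jsonrpc": "2.0", "id": 1, "method": method}
        process.stdin.write(json.dumps(request) + "\n")
        process.stdin.flush()
        return json.loads(process.stdout.readline())
    finally:
        process.stdin.close()  # EOF ends the server's read loop
        process.wait(timeout=10)
        process.stdout.close()


if __name__ == "__main__":
    print(round_trip("tools/list"))
```

Because the server only exists as a child of the client, closing the client's end of the pipe is enough to shut it down. That is also why a stdio server cannot outlive or be shared beyond the process that spawned it.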

Stdio is the right choice for local development and testing, for integrating with tools like VS Code Copilot Agents or Claude Desktop that manage MCP servers as subprocesses automatically, and for any scenario where you want a simple setup with no networking or open ports.
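
As an example of a client that manages the subprocess for you, Claude Desktop reads stdio server definitions from an `mcpServers` section of its JSON configuration file. The snippet below generates such an entry; the entry name `servicenow-agent` and the path `/path/to/server.py` are placeholders, and the exact file location and schema should be checked against the client's own documentation.

```python
import json


def build_stdio_entry(server_script: str, python_command: str = "python") -> dict:
    """Describe how a client should launch the server as a subprocess."""
    return {"command": python_command, "args": [server_script]}


# Placeholder name and path: replace both with your own values.
config = {
    "mcpServers": {
        "servicenow-agent": build_stdio_entry("/path/to/server.py"),
    }
}

print(json.dumps(config, indent=2))
```

The entry carries the same two pieces of information that MCPStdioTool takes later in this article: a command and its arguments.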

The Server

"""
server.py — ServiceNow Agent as MCP Server (Stdio)
====================================================
Exposes the ServiceNow ChatAgent as an MCP server over stdio.
The client spawns this as a subprocess automatically.
You do not need to run this file manually.

Usage:
    python client.py   ← this is all you need
"""

import anyio
from dotenv import load_dotenv
from mcp.server.stdio import stdio_server
from agent_framework import ChatAgent
from tools import create_incident, update_incident, search_incident
# Assumption: the make_client() helper shown earlier is saved as
# client_helper.py; adjust the module name to match your project.
from client_helper import make_client

load_dotenv()


async def main():
    agent = ChatAgent(
        chat_client=make_client(),
        name="ServiceNowAgent",
        description="Manages ServiceNow incidents — create, update, search.",
        instructions="You are a ServiceNow assistant. Use your tools to manage incidents.",
        tools=[create_incident, update_incident, search_incident],
    )

    # One line to turn the agent into an MCP server
    server = agent.as_mcp_server()

    async with stdio_server() as (read_stream, write_stream):
        await server.run(
            read_stream,
            write_stream,
            server.create_initialization_options(),
        )


if __name__ == "__main__":
    anyio.run(main)

Three details are worth noting. First, anyio.run() is used instead of asyncio.run() because the MCP SDK is built on anyio, so the server entry point follows the SDK's convention; by default anyio.run() runs on the asyncio backend. Second, the agent is not wrapped in an async with ChatAgent(...) as agent context manager; the MCP server manages the agent’s lifecycle through the stdio transport instead. Third, server.create_initialization_options() tells the MCP SDK what capabilities the server supports, derived automatically from the agent.

The Client

"""
client.py — ServiceNow MCP Client (Stdio)
==========================================
Spawns server.py as a subprocess via MCPStdioTool and routes
questions through a local ChatAgent.

Run:
    python client.py
"""

import asyncio
from dotenv import load_dotenv
from agent_framework import ChatAgent, MCPStdioTool
# Assumption: the make_client() helper shown earlier is saved as
# client_helper.py; adjust the module name to match your project.
from client_helper import make_client

load_dotenv()


async def main():
    questions = [
        "Create a high urgency incident for a login failure affecting all users.",
        "Search for any VPN related incidents.",
        "Update incident INC0012346 to Resolved with note 'VPN config fixed'.",
    ]

    async with (
        MCPStdioTool(
            name="ServiceNowAgent",
            command="python",
            args=["server.py"],
        ) as servicenow_mcp,
        ChatAgent(
            chat_client=make_client(),
            name="Client",
            instructions="Use the ServiceNow agent tool to handle requests.",
            tools=servicenow_mcp,
        ) as agent,
    ):
        for question in questions:
            print(f"Q: {question}")
            result = await agent.run(question)
            print(f"A: {result}\n")


if __name__ == "__main__":
    asyncio.run(main())

MCPStdioTool is a tool provider, not a callable. A common mistake is trying to call it directly with something like await servicenow_mcp.call("Create an incident...") — this will fail with an AttributeError because MCPStdioTool has no call method. It must always be passed to a ChatAgent via the tools parameter. The ChatAgent is what reasons about the request and decides which MCP tool to invoke.

How It All Fits Together

You only run one command:

python client.py

MCPStdioTool handles spawning server.py automatically. Here is the full request flow:

python client.py
    │
    ├── MCPStdioTool spawns: python server.py
    │       │
    │       └── server ChatAgent starts, listens on stdin/stdout
    │
    └── client ChatAgent receives question
            │
            └── decides to call "ServiceNowAgent" MCP tool
                    │
                    └── request sent to server.py via stdio
                            │
                            └── server ChatAgent reasons and calls tools
                                    │
                                    └── result flows back → printed

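Note that two agents reason about every question: the client agent's LLM decides to delegate, and the server agent's LLM handles the request. That means at least two LLM calls per question, and usually more, because each agent typically makes a further model call to turn its tool result into a reply. This is the nature of the agent-as-tool pattern and is discussed further in Part 3.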

Expected Output

Q: Create a high urgency incident for a login failure affecting all users.
A: I've created a new high urgency incident. Incident INC0012345 has been
   raised with urgency set to High and status New.

Q: Search for any VPN related incidents.
A: I found a VPN-related incident: INC0012346 — VPN not working —
   In Progress — Medium urgency.

Q: Update incident INC0012346 to Resolved with note 'VPN config fixed'.
A: Incident INC0012346 has been updated to Resolved with the work note
   'VPN config fixed'.

Pros and Cons of the Stdio Approach

| Pros | Cons |
| --- | --- |
| Simple: one command to run | Server tied to client’s lifecycle |
| No networking or ports needed | Cannot serve multiple clients simultaneously |
| Works with VS Code Copilot and Claude Desktop out of the box | Not suitable for remote or cloud deployments |
| Client manages server lifecycle automatically | Server restarts fresh with every client run |

What’s Next

Part 2 covers the HTTP/SSE approach, in which the server and client run as completely independent processes over the network. This is the production-ready path that allows multiple clients to share one server, enables remote deployments, and gives you a persistent always-on MCP service.

➡️ Continue to Part 2: The HTTP/SSE Approach