Exposing an AI Agent as an MCP Server with MAF, Part 4: Building a Full-Stack AI Agent Server: Three Interfaces from One Tools List

The Problem with Single-Interface Agents

The previous posts in this series showed how to expose a MAF ChatAgent as an MCP server using agent.as_mcp_server(). That works well for AI clients — another agent, VS Code Copilot, or Claude Desktop can connect and send natural language queries. But in a real enterprise environment, not every consumer of your agent is an AI client.

A traditional monitoring dashboard wants to call search_incident over HTTP. A CI pipeline wants to call create_incident via a REST POST. An operations team wants to discover what agents exist and what they can do. None of these consumers speak MCP, and none of them want to reason through natural language — they want structured, predictable HTTP endpoints.

The natural response is to run separate services for separate audiences. But that means duplicating your tool definitions, maintaining multiple codebases, and keeping schemas in sync across services. There is a better way.

This post shows how to build a single server that, from one tools list and one startup command, gives your agent three simultaneous interfaces — MCP for AI clients, REST for traditional services, and a registry for service discovery.

The Architecture

The goal is a single server.py that on startup builds and runs three surfaces simultaneously:

python server.py
    │
    ├── Registry   (port 8002) — service catalogue, self-registers on boot
    ├── REST API   (port 8000) — each tool as its own HTTP endpoint
    └── MCP Server (port 8001) — agent as a single opaque tool via as_mcp_server()

All three are derived from the same tools list. The ChatAgent is created once. The REST API and registry are built by inspecting the tool function signatures and Annotated type hints directly — no schema is ever written twice.

The Tools

The same tools.py from the rest of this series powers everything. The @ai_function decorator and Annotated type hints serve triple duty — they define the REST request schemas, the registry tool entries, and the ChatAgent’s internal tool set all at once:

from typing import Annotated
from agent_framework import ai_function


@ai_function
def create_incident(
    short_description: Annotated[str, "Brief summary of the issue (5–160 chars)."],
    urgency: Annotated[str, "Urgency: 1=High, 2=Medium, 3=Low."] = "2",
) -> Annotated[str, "Confirmation with incident number and details."]:
    """Creates a new incident in ServiceNow."""
    return (
        f"Incident created.\n"
        f"  Number:      INC0012345\n"
        f"  Description: {short_description}\n"
        f"  Urgency:     {urgency}\n"
        f"  Status:      New"
    )


@ai_function
def update_incident(
    incident_number: Annotated[str, "Incident ID e.g. INC0012345."],
    state: Annotated[str, "New state: In Progress | Resolved | Closed."],
    notes: Annotated[str, "Work notes to append (max 4000 chars)."] = "",
) -> Annotated[str, "Confirmation with incident number, new state, and notes."]:
    """Updates the state and notes of an existing ServiceNow incident."""
    return (
        f"Incident updated.\n"
        f"  Number: {incident_number}\n"
        f"  State:  {state}\n"
        f"  Notes:  {notes or 'None added'}"
    )


@ai_function
def search_incident(
    query: Annotated[str, "Keyword, number, or phrase to search (1–200 chars)."],
) -> Annotated[str, "Matching incidents with number, description, state, and urgency."]:
    """Searches ServiceNow incidents by keyword or number."""
    return (
        f"Results for '{query}':\n"
        f"  1. INC0012345 — Login issue       — New         — High\n"
        f"  2. INC0012346 — VPN not working   — In Progress — Medium\n"
        f"  3. INC0012347 — Printer offline   — Resolved    — Low"
    )

The Agent Config

Rather than hardcoding values throughout the file, the agent is described in a single config dict. This makes it straightforward to later drive the whole server from a YAML file or environment variables:

AGENT_CONFIG = {
    "name":         "ServiceNowAgent",
    "display_name": "ServiceNow Incident Manager",
    "description":  "Manages ServiceNow incidents — create, update, search.",
    "version":      "1.0.0",
    "tags":         ["servicenow", "itsm"],
    "module":       "tools",
    "tools":        ["create_incident", "update_incident", "search_incident"],
    "instructions": "You are a ServiceNow assistant. Use your tools to manage incidents.",
}
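The "module" and "tools" entries are resolved by a load_tools() helper that main() calls later but that the post does not show. A minimal sketch of what such a helper might look like, assuming the tools are plain module-level attributes that can be fetched by name; the body below is an illustration, not the author's implementation:

```python
import importlib
from typing import Any


def load_tools(module_name: str, tool_names: list[str]) -> list[Any]:
    """Import `module_name` and return the named attributes in the given order."""
    module = importlib.import_module(module_name)
    # Fail early, naming every missing tool, rather than on the first lookup.
    missing = [name for name in tool_names if not hasattr(module, name)]
    if missing:
        raise AttributeError(f"{module_name} has no tool(s): {', '.join(missing)}")
    return [getattr(module, name) for name in tool_names]
```

With the config above, load_tools(AGENT_CONFIG["module"], AGENT_CONFIG["tools"]) would import tools.py and return the three decorated functions in the listed order.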

Building the Three Surfaces

Surface 1: REST API

build_rest_api() uses Python’s inspect module to read each tool function’s signature and type hints and auto-generate FastAPI routes from them. No route is written by hand. Write operations — anything with create, update, delete, or set in the name — become POST endpoints with an auto-generated Pydantic request body. Read operations become GET endpoints with query parameters. Swagger UI at /docs is included automatically by FastAPI.

import inspect
from typing import get_type_hints

from fastapi import FastAPI
from pydantic import create_model


def build_rest_api(tools: list, display_name: str) -> FastAPI:
    app = FastAPI(title=f"{display_name} API", version="1.0")

    for tool in tools:
        fn       = _unwrap(tool)
        sig      = inspect.signature(fn)
        hints    = get_type_hints(fn, include_extras=True)
        # A substring match on the function name decides POST vs GET.
        is_write = any(k in fn.__name__ for k in ("create", "update", "delete", "set"))

        # Map each parameter to (type, default); `...` marks it as required.
        fields = {}
        for pname, param in sig.parameters.items():
            base_type, _ = _unwrap_annotated(hints.get(pname, str))
            default = ... if param.default is inspect.Parameter.empty else param.default
            fields[pname] = (base_type, default)

        if is_write:
            # Write tools take a JSON body validated by a generated Pydantic model.
            Model = create_model(f"{fn.__name__}_req", **fields)
            def make_post(f, M):
                def handler(body: M):
                    return {"result": f(**body.model_dump())}
                handler.__name__ = f.__name__
                return handler
            app.add_api_route(f"/{fn.__name__}", make_post(fn, Model), methods=["POST"],
                              summary=inspect.getdoc(fn) or fn.__name__)
        else:
            # Read tools get a synthetic signature so FastAPI sees query parameters.
            def make_get(f, qfields):
                def handler(**kwargs):
                    return {"result": f(**kwargs)}
                handler.__name__ = f.__name__
                handler.__signature__ = inspect.Signature([
                    inspect.Parameter(k, inspect.Parameter.KEYWORD_ONLY, default=d, annotation=t)
                    for k, (t, d) in qfields.items()
                ])
                return handler
            app.add_api_route(f"/{fn.__name__}", make_get(fn, fields), methods=["GET"],
                              summary=inspect.getdoc(fn) or fn.__name__)

    return app
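One detail of the is_write check is worth seeing in isolation: it is a substring match, so a read-only tool whose name merely contains "set" (get_dataset, say) would also become a POST endpoint. The sketch below compares that check with a stricter, word-based variant. Both helper names and the extra tool names are hypothetical, chosen only for illustration:

```python
WRITE_VERBS = ("create", "update", "delete", "set")


def is_write_substring(name: str) -> bool:
    """The rule used in build_rest_api(): a verb appearing anywhere in the name."""
    return any(verb in name for verb in WRITE_VERBS)


def is_write_token(name: str) -> bool:
    """Stricter variant: a verb must equal a whole underscore-separated word."""
    return any(token in WRITE_VERBS for token in name.split("_"))


for name in ("create_incident", "search_incident", "get_dataset", "get_settings"):
    print(name, is_write_substring(name), is_write_token(name))
```

For the three tools in this post both rules agree, so the choice only matters once tool names grow beyond simple verb_noun pairs.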

The _unwrap() helper peels back the @ai_function decorator to expose the plain Python function underneath, since inspect needs the raw function to read its signature correctly.

Surface 2: MCP Server

build_mcp_app() is where as_mcp_server() comes in. It wraps the ChatAgent in an HTTP/SSE Starlette application using the same transport pattern covered in Part 2 of this series. The critical point is that this surface exposes the agent as a single entity — not the individual tool functions:

from agent_framework import ChatAgent
from mcp.server.sse import SseServerTransport
from starlette.applications import Starlette
from starlette.routing import Route


def build_mcp_app(agent: ChatAgent) -> Starlette:
    mcp_server = agent.as_mcp_server()
    sse        = SseServerTransport("/messages")

    # GET /sse opens the long-lived event stream and runs the MCP session on it.
    async def handle_sse(request):
        async with sse.connect_sse(request.scope, request.receive, request._send) as streams:
            await mcp_server.run(
                streams[0], streams[1],
                mcp_server.create_initialization_options(),
            )

    # POST /messages carries client-to-server messages for an open session.
    async def handle_messages(request):
        await sse.handle_post_message(request.scope, request.receive, request._send)

    return Starlette(routes=[
        Route("/sse",      handle_sse,      methods=["GET"]),
        Route("/messages", handle_messages, methods=["POST"]),
    ])

An AI client connecting to port 8001 and calling list_tools() will see exactly one entry — the agent itself:

[
  {
    "name": "ServiceNowAgent",
    "description": "Manages ServiceNow incidents — create, update, search.",
    "inputSchema": {
      "type": "object",
      "properties": {
        "query": { "type": "string" }
      }
    }
  }
]

The individual functions are invisible from this interface. They are internal implementation details of the agent, as covered in depth in Part 3.

Surface 3: Registry

The registry is a lightweight FastAPI service that acts as a service catalogue. It supports registering, listing, looking up, and deregistering agents via simple HTTP endpoints. The server self-registers on startup and deregisters cleanly on shutdown:

from datetime import datetime, timezone

from fastapi import FastAPI, HTTPException

registry_app = FastAPI(title="Registry", version="1.0")
_registry: dict = {}  # in-memory catalogue, keyed by agent name


@registry_app.get("/registry")
def list_agents():
    return {"total": len(_registry), "agents": list(_registry.values())}


@registry_app.get("/registry/{name}")
def get_agent(name: str):
    if name not in _registry:
        raise HTTPException(404, f"Agent '{name}' not found.")
    return _registry[name]


@registry_app.get("/registry/{name}/tools")
def list_tools_endpoint(name: str):
    if name not in _registry:
        raise HTTPException(404, f"Agent '{name}' not found.")
    return {"agent": name, "tools": _registry[name]["tools"]}


@registry_app.post("/registry/register")
def register(payload: dict):
    # Re-registering under the same name overwrites the previous entry.
    _registry[payload["name"]] = {
        **payload,
        "registered_at": datetime.now(timezone.utc).isoformat(),
    }
    return {"registered": payload["name"]}


@registry_app.delete("/registry/{name}")
def deregister(name: str):
    _registry.pop(name, None)
    return {"deregistered": name}

The Critical Design Decision: What Gets Registered

This is the most important subtlety in the whole approach. When the server self-registers, what tool schemas does it send to the registry?

It could register the single opaque ServiceNowAgent(query) entry — the same thing list_tools() on the MCP server returns. But that would make the registry useless for anyone trying to understand what the agent can actually do.

Instead, build_registration() registers the individual tool schemas derived from the @ai_function type hints — the full parameter names, types, descriptions, and required fields for each function:

def build_registration(cfg: dict, tools: list) -> dict:
    return {
        "name":         cfg["name"],
        "display_name": cfg["display_name"],
        "description":  cfg["description"],
        "version":      cfg["version"],
        "tags":         cfg.get("tags", []),
        "interfaces": {
            "rest": f"http://localhost:{REST_PORT}/docs",
            "mcp":  f"http://localhost:{MCP_PORT}/sse",
        },
        "tools": [_tool_schema(t) for t in tools],  # individual schemas, not the agent wrapper
    }

This means GET /registry/ServiceNowAgent/tools returns the full, human-readable schema for create_incident, update_incident, and search_incident. The registry is for discovery and documentation. The MCP server’s opaqueness is a runtime concern; the registry should always reflect what the agent can actually do at the function level.

The three interfaces intentionally present different views of the same agent:

Registry   (port 8002) → individual tools, full schemas  — for discovery
MCP Server (port 8001) → single agent entry              — for AI delegation
REST API   (port 8000) → individual endpoints            — for direct HTTP calls

Putting It All Together: main()

The main() function loads the tools once and passes them to all three builders. Three uvicorn servers start concurrently with asyncio.create_task(), a brief sleep gives them time to initialise, and then the agent self-registers. On shutdown, whether from Ctrl+C or an unhandled exception, the finally block attempts to deregister and then stops the servers; a hard kill of the process (SIGKILL, for example) skips this step:

import asyncio

import httpx
import uvicorn

# Ports as listed in the architecture overview.
REST_PORT, MCP_PORT, REGISTRY_PORT = 8000, 8001, 8002


async def main():
    cfg   = AGENT_CONFIG
    tools = load_tools(cfg["module"], cfg["tools"])

    rest_app = build_rest_api(tools, cfg["display_name"])
    agent    = ChatAgent(
        chat_client=make_client(),
        name=cfg["name"],
        description=cfg["description"],
        instructions=cfg["instructions"],
        tools=tools,
    )
    mcp_app = build_mcp_app(agent)

    servers = {
        "registry": uvicorn.Server(uvicorn.Config(registry_app, host="0.0.0.0", port=REGISTRY_PORT, log_level="warning")),
        "rest":     uvicorn.Server(uvicorn.Config(rest_app,     host="0.0.0.0", port=REST_PORT,     log_level="warning")),
        "mcp":      uvicorn.Server(uvicorn.Config(mcp_app,      host="0.0.0.0", port=MCP_PORT,      log_level="warning")),
    }

    # Run all three servers on the same event loop.
    tasks = {k: asyncio.create_task(s.serve()) for k, s in servers.items()}
    await asyncio.sleep(0.5)  # crude wait for the registry to start listening

    async with httpx.AsyncClient() as client:
        await client.post(
            f"http://localhost:{REGISTRY_PORT}/registry/register",
            json=build_registration(cfg, tools),
        )

    print(f"  [{cfg['name']}] Registry  → http://localhost:{REGISTRY_PORT}/registry")
    print(f"  [{cfg['name']}] REST API  → http://localhost:{REST_PORT}/docs")
    print(f"  [{cfg['name']}] MCP       → http://localhost:{MCP_PORT}/sse")
    print(f"  [{cfg['name']}] registered.")

    try:
        await asyncio.gather(*tasks.values())
    finally:
        # Best effort: the in-process registry may already have stopped listening.
        try:
            async with httpx.AsyncClient() as client:
                await client.delete(
                    f"http://localhost:{REGISTRY_PORT}/registry/{cfg['name']}"
                )
        except httpx.HTTPError:
            pass
        for s in servers.values():
            s.should_exit = True
        await asyncio.gather(*tasks.values(), return_exceptions=True)


if __name__ == "__main__":
    asyncio.run(main())
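The fixed half-second sleep in main() is the simplest way to wait for the servers, but it can be too short on a slow machine. One possible alternative, sketched here under the assumption that each server object exposes a boolean started attribute once it is accepting connections (uvicorn.Server does, as far as I am aware), is to poll that flag with a timeout. The helper name wait_until_started is hypothetical:

```python
import asyncio
import time
from typing import Any, Iterable


async def wait_until_started(servers: Iterable[Any], timeout: float = 5.0,
                             interval: float = 0.05) -> None:
    """Poll each server's `started` flag until all are True or `timeout` seconds pass."""
    servers = list(servers)
    deadline = time.monotonic() + timeout
    while not all(getattr(s, "started", False) for s in servers):
        if time.monotonic() >= deadline:
            raise TimeoutError("servers did not report started in time")
        await asyncio.sleep(interval)
```

In main() this would take the place of the sleep, for example await wait_until_started(servers.values()), so registration only happens once the registry is actually reachable.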

Running It

python server.py

  [ServiceNowAgent] Registry  → http://localhost:8002/registry
  [ServiceNowAgent] REST API  → http://localhost:8000/docs
  [ServiceNowAgent] MCP       → http://localhost:8001/sse
  [ServiceNowAgent] registered.

All three interfaces are now live from a single process.

Using the REST API

# Create an incident
curl -X POST http://localhost:8000/create_incident \
  -H "Content-Type: application/json" \
  -d '{"short_description": "VPN not working", "urgency": "1"}'

# Search incidents
curl "http://localhost:8000/search_incident?query=VPN"

Using the Registry

# List all registered agents
curl http://localhost:8002/registry

# Get full tool schemas for ServiceNowAgent
curl http://localhost:8002/registry/ServiceNowAgent/tools

Using the MCP Server

Any MCP-compatible client can connect to http://localhost:8001/sse. From a MAF client:

async with (
    MCPStreamableHTTPTool(
        name="ServiceNowAgent",
        url="http://localhost:8001/sse",
    ) as servicenow_mcp,
    ChatAgent(
        chat_client=make_client(),
        name="Client",
        instructions="Use the ServiceNow agent tool to handle requests.",
        tools=servicenow_mcp,
    ) as agent,
):
    result = await agent.run("Find any open VPN incidents and resolve them.")
    print(result)

What Each Interface Is For

Interface	Port	Consumer	Input	Who reasons?
Registry	8002	Any service doing discovery	—	—
REST API	8000	Dashboards, scripts, pipelines	Structured JSON	Nobody — direct execution
MCP Server	8001	AI clients, other MAF agents	Natural language	Server’s ChatAgent

The Key Trade-off: Two LLM Calls for MCP

Every MCP request through this server involves at least two LLM calls: one on the client side to decide to delegate to the agent, and at least one on the server side, where the agent reasons about which function to call. This is inherent to the as_mcp_server() design and is discussed in detail in Part 3.

The trade-off is worth it when you need the server to handle compound or ambiguous requests that a single tool call cannot fulfil in one step. For example:

"Find any open VPN incidents and if there are any, resolve them all
 with note 'Fixed by network team'."

A client calling individual REST or MCP tools directly would need to orchestrate the sequence itself: search_incident first, then update_incident once for each matching result. The agent server handles this in a single request because its internal LLM can reason across multiple tool calls in one turn before returning the final answer.
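To make the client-side alternative concrete, here is a rough sketch of the orchestration a non-AI caller would have to write for that request. The two callables stand in for whatever issues the actual search_incident and update_incident calls (HTTP requests to the REST surface, for instance). The helper names and the parsing of the text result are assumptions based on the stub output in tools.py:

```python
from typing import Callable

EM_DASH = "\u2014"  # column separator used in the stub search_incident() output


def parse_results(text: str) -> list[dict]:
    """Turn the text rows returned by search_incident into dicts."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(EM_DASH)]
        if len(parts) == 4 and "INC" in parts[0]:
            number = parts[0].split()[-1]  # drop the "1." list prefix
            rows.append({"number": number, "description": parts[1],
                         "state": parts[2], "urgency": parts[3]})
    return rows


def resolve_open(search: Callable[[str], str],
                 update: Callable[[str, str, str], str],
                 query: str, note: str) -> list[str]:
    """Search once, then issue one update per incident that is still open."""
    open_rows = [r for r in parse_results(search(query))
                 if r["state"] not in ("Resolved", "Closed")]
    return [update(r["number"], "Resolved", note) for r in open_rows]
```

Everything in resolve_open() is logic that the server-side agent works out on its own when the request arrives over MCP, which is what the extra LLM call pays for.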

If your MCP clients are capable enough to pick the right low-level function directly and your requests are always simple and unambiguous, exposing individual tools via MCP (rather than wrapping them in an agent) would avoid the extra server-side LLM call. The REST API surface in this server already provides exactly that for non-AI consumers.

When to Use This Approach

This pattern is the right choice when your agent needs to serve a mixed audience — AI clients that want to delegate via natural language, traditional services that want predictable REST endpoints, and an operations team that needs service discovery. It gives you all three without running separate processes or maintaining separate codebases.

It is particularly well-suited to enterprise deployments where AI agents sit alongside existing HTTP-based infrastructure. The REST API means your agent can be called from a Jira webhook, a ServiceNow business rule, or a CI pipeline with no AI client on the calling side. The MCP server means it can also be called by other agents in a multi-agent MAF system. The registry means both types of consumers can find it automatically.

Project Structure

The complete project is just four files:

project/
├── tools.py     ← @ai_function tool definitions
├── server.py    ← REST + MCP + Registry, all from one tools list
├── client.py    ← MCPStreamableHTTPTool + ChatAgent (optional)
└── .env         ← Azure OpenAI credentials

One tools list, one process, three interfaces, zero duplication.

This post is part of a series on building multi-agent systems with Microsoft Agent Framework.

← Part 1: The Stdio Approach | ← Part 2: The HTTP/SSE Approach | ← Part 3: MAF vs Raw MCP