What Is an MCP Server? Architecture and Citation Implications
An MCP server is a process that exposes tools, resources, and prompts to AI applications using the Model Context Protocol, a JSON-RPC-based open standard introduced by Anthropic in November 2024 and subsequently adopted by OpenAI and Google DeepMind. It lets any compatible AI client connect to any compatible data source or capability without bespoke integration code.
TL;DR
An MCP server is the server side of the Model Context Protocol (MCP), an open standard that connects AI applications to external systems. The server publishes three primitives — tools, resources, and prompts — and the client (host application) discovers and invokes them. MCP servers run locally over stdio or remotely over Streamable HTTP. Compared to vendor-specific function calling or plugins, MCP servers are model-agnostic and reusable across clients. For content publishers, exposing an MCP server has citation implications: it positions a knowledge source as agent-readable infrastructure, not just a website.
Definition
An MCP server is a process that implements the server side of the Model Context Protocol and exposes a defined set of tools, resources, and prompts to MCP-compatible clients over a JSON-RPC transport. The Model Context Protocol itself is an open standard introduced by Anthropic in November 2024 and maintained as an open-source specification, with adoption by OpenAI and Google DeepMind shortly after launch.
The official documentation positions MCP as "a USB-C port for AI applications": a standardized way for any compatible AI client to connect to any compatible data source or capability without writing a bespoke integration for each pairing. The server is the side that publishes capabilities; the client is embedded in the host application (such as Claude Desktop, an IDE, or a custom agent runtime) and is the side that discovers and invokes them.
Functionally, an MCP server is small. It does three things:
- Advertises what it can do (tool list, resource list, prompt list).
- Responds to invocation requests from the client (tool call, resource read, prompt fill).
- Negotiates transport, capabilities, and lifecycle with the client over JSON-RPC.
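On the wire, this is a handful of JSON-RPC messages. The sketch below shows the discovery and invocation steps as Python dicts; the method names (tools/list, tools/call) come from the MCP specification, while the search_docs tool and its payloads are hypothetical. Capability and version negotiation happens earlier in the session via an initialize exchange.

```python
# Abridged JSON-RPC exchange between an MCP client and server.
# Method names follow the MCP specification; the example tool
# ("search_docs") and its payloads are invented for illustration.

# 1. Advertise: the client asks what the server can do.
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "search_docs",
                "description": "Search the internal documentation index.",
                "inputSchema": {
                    "type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"],
                },
            }
        ]
    },
}

# 2. Respond: the client invokes a tool, the server returns content blocks.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "search_docs", "arguments": {"query": "refund policy"}},
}
call_response = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {
        "content": [{"type": "text", "text": "Refunds are issued within 14 days..."}]
    },
}
```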
What makes MCP servers strategically interesting is not the protocol mechanics but the decoupling: a single MCP server can serve many different clients (Claude Desktop, Cursor, custom agents, Slack bots) and a single client can talk to many MCP servers, regardless of which LLM is powering the client. This is the property that distinguishes MCP from earlier vendor-specific surfaces like OpenAI plugins.
Why it matters
MCP servers matter because they convert ad-hoc integrations into reusable infrastructure. Three forces drive their adoption:
- Integration sprawl. Connecting agents to tools and data traditionally requires a custom integration per pairing, creating fragmentation and duplicated effort that makes it difficult to scale truly connected systems. MCP collapses the M×N integration matrix into M+N: any client speaks the same protocol to any server.
- Model portability. Function-calling formats differ across vendors. Tool definitions written for OpenAI do not run unchanged on Anthropic, and vice versa. An MCP server, by contrast, is model-agnostic: switch the underlying LLM and the tools keep working.
- Citation surface. For content publishers and knowledge organisations, an MCP server is the agent-readable counterpart to a website. It is how an AI assistant retrieves your data, structured and with provenance, when its user asks about your domain. As agentic clients become a substantive share of traffic, owning a high-quality MCP server starts to look more like owning a search-result citation than like earning a navigational visit.
The protocol's adoption profile reinforces these incentives. Within months of launch, MCP was adopted by major AI providers including OpenAI and Google DeepMind, and a public registry of MCP servers emerged. For organisations evaluating where to invest in agent integrations, that adoption breadth is the practical signal that MCP is the lowest-risk standard to build against.
There is also a subtler reason: MCP servers shift the economics of agent capabilities. Before MCP, every agent vendor maintained its own integration catalogue and gatekept which tools their model could use. MCP turns capabilities into open infrastructure, which means tool builders — not model vendors — capture the long tail of agent functionality.
How it works
An MCP server is built around a small number of primitives, a transport, and a lifecycle. Together they define how a client connects to a server, learns what it can do, and invokes its capabilities.
Architecture
```mermaid
flowchart LR
    H["Host application"] --> C["MCP client"]
    C -- "JSON-RPC" --> T["Transport (stdio or HTTP)"]
    T --> S["MCP server"]
    S --> R["Resources"]
    S --> O["Tools"]
    S --> P["Prompts"]
```

The host application embeds an MCP client; the client speaks JSON-RPC to the server over a transport; the server fronts the actual capabilities (resources, tools, prompts).
Primitives
MCP servers expose three primitives. Each has a discovery method and an invocation method.
| Primitive | What it represents | Typical use |
|---|---|---|
| Tools | Callable functions with typed inputs and outputs | Search a database; send a message; create a file |
| Resources | Readable data objects identified by URIs | A file, a database row, a knowledge base entry |
| Prompts | Pre-written templates with arguments | "Summarise document X using style Y" |
Tools are the most widely used primitive. Resources and prompts are less universally adopted by clients, partly because models are trained more heavily on tool-call patterns than on resource fetching, and partly because clients vary in how they expose resources to the model. In practice, most production servers prioritise tool design, then add resources where useful, and treat prompts as a discovery aid for end users.
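In the official Python SDK, the three primitives map onto three decorators on the FastMCP class. The sketch below assumes that SDK; the knowledge-base lookups are placeholders, and the SDK derives each tool's input schema from the function's type hints and its description from the docstring.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("knowledge-base")  # server name shown to clients

@mcp.tool()
def search_articles(query: str, limit: int = 5) -> str:
    """Search the knowledge base and return matching article titles."""
    return f"Top {limit} results for {query!r} (placeholder)"

@mcp.resource("kb://articles/{article_id}")
def get_article(article_id: str) -> str:
    """Return the full text of one article, addressable by a stable URI."""
    return f"Full text of article {article_id} (placeholder)"

@mcp.prompt()
def summarise_article(article_id: str, style: str = "brief") -> str:
    """Pre-written template an end user can fill in from the client UI."""
    return f"Summarise article {article_id} in a {style} style, citing the source URI."

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport
```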
Transports
MCP encodes messages with JSON-RPC and currently defines two standard transports for client-server communication:
- stdio: communication over standard input and output. Used for local servers running as child processes. Clients should support stdio whenever possible.
- Streamable HTTP: a stream-friendly HTTP transport for remote servers, typically used when the server is hosted as an independent service rather than spawned locally.
A Transport Working Group is exploring additional transports as the ecosystem moves beyond locally launched processes toward distributed, enterprise-scale remote deployments.
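A minimal sketch of the same server run over either transport, assuming the official Python SDK's FastMCP class and its transport argument (the forecast tool is a placeholder):

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")

@mcp.tool()
def get_forecast(city: str) -> str:
    """Return a short forecast for a city (placeholder implementation)."""
    return f"Forecast for {city}: sunny, 21°C"

if __name__ == "__main__":
    # Local: spawned as a child process by the client; messages flow over stdin/stdout.
    mcp.run(transport="stdio")

    # Remote: hosted as an independent service that clients reach over HTTP.
    # mcp.run(transport="streamable-http")
```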
Lifecycle
A typical MCP session follows four phases:
- Initialisation. Client and server exchange protocol version, capabilities, and metadata.
- Discovery. Client lists tools, resources, and prompts the server advertises.
- Invocation. Client calls tools, reads resources, or fills prompts in response to model decisions.
- Shutdown. Client closes the transport; server releases held resources.
The lifecycle is symmetric and predictable, which is what allows tooling like the MCP Inspector and CLI utilities to interrogate any compliant server without server-specific knowledge.
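Seen from the client side, the four phases look roughly like the sketch below, using the official Python SDK's stdio client; the server command and the search_docs call are placeholders.

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Placeholder command: any stdio MCP server can be launched here.
    params = StdioServerParameters(command="python", args=["my_server.py"])

    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            # 1. Initialisation: exchange protocol version and capabilities.
            await session.initialize()

            # 2. Discovery: learn what the server advertises.
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

            # 3. Invocation: call a tool surfaced during discovery.
            result = await session.call_tool("search_docs", {"query": "refund policy"})
            print(result.content)

    # 4. Shutdown: leaving the context managers closes the transport.

asyncio.run(main())
```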
Comparison vs related concepts
MCP is often compared with three adjacent surfaces: vendor function calling, OpenAI plugins, and direct API integration. The differences matter because they drive build-vs-adopt decisions.
| Approach | Scope | Portability | Discovery | Best for |
|---|---|---|---|---|
| MCP server | Cross-client, cross-model | High — model-agnostic | Standardised list/get/invoke | Tools shared across many clients/models |
| Function calling (OpenAI / Anthropic) | One model family | Low — vendor-specific format | Schema embedded per request | Tightly coupled, in-app tools |
| OpenAI plugins / GPT actions | One UI surface | Lowest — ChatGPT-only | OpenAPI manifest | Surfacing a service inside ChatGPT |
| Direct REST API | Any consumer | Highest at protocol level | Out-of-band docs | Programmatic, non-LLM consumers |
Three distinctions are most consequential:
- Reusability vs immediacy. Function calling is faster to build for a single model; MCP wins when the same tool needs to work across Claude Desktop, Cursor, custom Slack bots, and automated pipelines, because the tool is built once and connected many times.
- Governance. MCP centralises tool versioning and access control at the server. A change to a tool propagates to every client. With per-client function calling, the same change must be re-deployed in every consumer.
- Standardisation surface. Plugins and function calling embed the integration inside a vendor's product. MCP defines an external standard, which is why competing vendors have aligned around it and why the protocol has its own Wikipedia entry and registry.
MCP does not replace function calling; it sits one layer above. Internally, an MCP client typically still uses the host model's function-calling primitive to decide which MCP tool to invoke. The protocol replaces integration plumbing, not model-side tool selection.
Practical application
Building and operating an MCP server in production looks roughly like this:
1. Pick a use case with multi-client value
MCP wins when more than one client needs the same capability. Internal documentation search, product catalogue queries, and workflow automation are typical first targets. If only one client will ever use the tool, vendor function calling is often simpler.
2. Choose an SDK
Official SDKs exist for Python (FastMCP), TypeScript, Rust, Go, and others. The FastMCP class in Python uses type hints and docstrings to automatically generate tool definitions from regular function signatures, which dramatically reduces boilerplate.
3. Design tools first, resources second, prompts last
Design tools as small, single-purpose, idempotent operations with clear typed inputs. Avoid overloaded "do anything" tools — they are harder for models to choose correctly. Add resources for read-only data the agent benefits from quoting verbatim. Add prompts only when end users need help discovering how to use the tools.
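A sketch of the difference, with illustrative tool names and stubbed bodies:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("crm")

# Harder for a model to use: one overloaded tool with a free-text argument
# gives no signal about which operations exist or what inputs they need.
@mcp.tool()
def do_anything(instruction: str) -> str:
    """Do whatever the instruction says against the CRM."""
    return "not implemented (placeholder)"

# Easier: small, single-purpose, idempotent tools with typed inputs.
@mcp.tool()
def get_customer(customer_id: str) -> str:
    """Return one customer record as JSON."""
    return "{} (placeholder)"

@mcp.tool()
def list_open_tickets(customer_id: str, limit: int = 10) -> str:
    """List open support tickets for a customer, newest first."""
    return "[] (placeholder)"
```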
4. Pick a transport
Use stdio for local development and personal-productivity scenarios (Claude Desktop, IDE integrations). Use Streamable HTTP for remote deployments where many users connect to a hosted server. The community has also experimented with passthrough patterns that bridge HTTP backends into MCP without rewriting the underlying service.
5. Add authentication and authorisation
For remote MCP servers, OAuth-based authorisation flows are increasingly common, with the MCP specification adding explicit authorisation guidance for HTTP transports. Decide what scopes each tool requires and surface them clearly so users can give consent.
6. Document for discovery
Publish a description, tool list, and example transcripts. Register on the public MCP registry if the server is intended for community use. Treat the server's documentation as part of the product surface — agents and humans both read it.
7. Test with multiple clients
A server that works in Claude Desktop may behave differently in Cursor or a custom Python client. Run end-to-end tests in at least two clients before shipping. Tools like the MCP CLI provide proxy and mock capabilities that simplify this testing.
Examples
- Anthropic reference servers. Anthropic maintains an open-source repository of reference MCP servers — small, audited implementations that demonstrate filesystem, fetch, memory, and other primitives. They are explicitly educational rather than production-grade.
- GitHub MCP server. The GitHub MCP server exposes repository operations as tools so agents can read issues, open pull requests, and inspect code. It can be run via container, illustrating how production-style MCP servers are packaged.
- Internal documentation search. A common enterprise pattern: an MCP server fronts a vector store over internal docs and exposes a single search_docs tool (sketched after this list). Multiple agents (support bot, IDE assistant, executive copilot) connect to the same server and inherit identical retrieval behaviour.
- Database query server. A team-shared SQL tool exposed as an MCP server lets the data team build the query interface once. Engineering, support, and executive teams all use it through different clients, with access control managed at the server level.
- Weather API tutorial server. The official "Build an MCP server" guide walks through a weather server that wraps the U.S. National Weather Service API, illustrating how external REST APIs are wrapped as MCP tools.
- Passthrough HTTP server. Community implementations show how a Java-based passthrough server bridges any HTTP backend into the MCP ecosystem without rewriting the underlying service — useful for legacy systems.
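A hedged sketch of the internal documentation search pattern from the list above, with the vector store reduced to an in-memory stand-in so the example stays self-contained:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("internal-docs")

# Stand-in for a real vector store; in production this would be a call
# to whatever retrieval backend the team already runs.
_DOCS = {
    "docs://handbook/expenses": "Expenses are reimbursed within 30 days of submission.",
    "docs://handbook/oncall": "On-call rotations are weekly and swap on Mondays.",
}

@mcp.tool()
def search_docs(query: str, limit: int = 3) -> str:
    """Search internal documentation and return matching passages with their URIs."""
    words = query.lower().split()
    matches = [
        f"{uri}\n{text}"
        for uri, text in _DOCS.items()
        if any(word in text.lower() for word in words)
    ]
    return "\n\n".join(matches[:limit]) or "No matches."

if __name__ == "__main__":
    mcp.run()
```

Every client that connects to this server gets the same retrieval behaviour, which is the point of the pattern: the query interface is built once and governed in one place.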
Common mistakes
- Treating MCP as a model framework. MCP is integration plumbing, not an agent framework. It does not orchestrate planning, memory, or multi-step reasoning. Pair it with an agent runtime; do not expect the protocol to provide one.
- Overloading tools. A single "do_anything" tool with a free-text argument is harder for models to use correctly than five small typed tools. Models choose tools by name and signature; ambiguous names increase error rates.
- Skipping resources because they're underused. Resources are valuable when an agent needs to quote data verbatim with provenance. Skipping them entirely is a missed citation surface.
- Hard-coding to one client. A server that only works in Claude Desktop has thrown away the main reason to use MCP. Test in at least one alternate client before treating the server as cross-platform.
- Ignoring transport choice. stdio is great locally but does not survive a production multi-tenant deployment. Streamable HTTP needs explicit thinking about auth, rate limits, and observability.
- Conflating MCP servers with HTTP APIs. They are related but different. An HTTP API is consumer-agnostic; an MCP server is agent-shaped — tools are scoped, named, and described for model consumption. A direct port of a REST surface to MCP usually under-performs a purpose-built one.
FAQ
Q: Who created MCP, and is it really open?
MCP was created by Anthropic and released as an open standard in November 2024. The specification, schemas, SDKs, and reference servers are in public repositories under an MIT license, and the protocol has been adopted by OpenAI and Google DeepMind, which is the practical test of openness.
Q: What language do I write an MCP server in?
Any language with an SDK or that can speak JSON-RPC over stdio or HTTP. Official SDKs include Python, TypeScript, Rust, Go, Java, and Kotlin. The Python FastMCP SDK is the most beginner-friendly path; the TypeScript SDK is common for tools that already live in a Node ecosystem.
Q: How is an MCP server different from a tool definition in OpenAI function calling?
A function-calling tool definition lives inside one model's request format and is sent on every call. An MCP server lives outside the model and is discovered by clients. The same MCP server can be used by Claude, GPT, Gemini, or any other compliant client, while function-calling definitions are model-specific.
Q: Do I need a remote MCP server, or is local enough?
For personal-productivity tools that wrap local data (a developer's filesystem, a personal note system), stdio servers running locally are usually sufficient. For multi-user, multi-client deployments — especially across a team or product — a remote server over Streamable HTTP is the right shape, and the ecosystem is actively standardising additional remote transports.
Q: Are MCP servers safe to expose externally?
They can be, with proper auth, sandboxing, and rate limits, but the burden is on the server author. The reference servers Anthropic publishes are explicitly educational and not production-ready; production deployments need their own threat modelling and safeguards.
Q: What are the citation implications of an MCP server for a content publisher?
When agents use an MCP server to fetch content, the server controls structure, provenance, and versioning of what the model sees. Publishers that expose well-structured MCP resources — with stable URIs, clean text, and explicit metadata — give agents a more reliable surface to cite than scraped HTML. This makes MCP server design a strategic citation lever, not just an engineering choice.
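As a rough sketch of what that looks like in practice, the resource below exposes one article behind a stable URI with explicit provenance; the metadata fields are illustrative rather than part of the MCP specification, and the URLs are placeholders.

```python
import json

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("publisher")

@mcp.resource("articles://what-is-an-mcp-server")
def article() -> str:
    """One article behind a stable URI, with provenance the agent can cite.

    The publisher, not the scraper, decides the structure the model sees.
    """
    return json.dumps({
        "canonical_url": "https://example.com/glossary/mcp-server",
        "title": "What Is an MCP Server?",
        "published": "2025-01-15",
        "version": "1.3",
        "body": "An MCP server is the server side of the Model Context Protocol...",
    })
```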
Q: How does MCP relate to RAG?
They solve overlapping problems but at different layers. RAG is a pattern for grounding model outputs in retrieved text. MCP is a protocol for how the model talks to retrieval and other tools. A typical agent uses an MCP server to expose a search/retrieve tool, and the model uses that tool to perform RAG. RAG is the technique; MCP is the wire.
Q: Will MCP replace direct REST APIs for AI use cases?
It is more likely to layer on top of them. Many MCP servers are thin wrappers around REST APIs, exposing them in an agent-shaped way. The REST API still exists for non-LLM consumers; the MCP server gives the same capabilities a model-friendly interface with discovery, typed inputs, and consistent error semantics.
Related Articles
MCP Server Design for Content Publishers and Docs Teams
MCP server design patterns for content publishers: how to expose articles, search, and citation manifests to AI agents via Model Context Protocol.
MCP Server Onboarding Checklist
Ship an MCP server agents can pick up immediately: tool naming, schemas, examples, auth, and sandbox requirements in a single onboarding checklist.
MCP vs Function Calling vs OpenAI Plugins: AI Agent Tool Integration Architectures Compared
MCP vs function calling vs plugins compared for AI agent tool integration: discovery scope, maintainability, and documentation patterns for 2026 stacks.