Apr 20, 2025

A Brief Introduction to the Model Context Protocol

Large Language Models (LLMs) have no built-in ability to interact with external systems. They can’t read files, access the internet, or make API calls.

$ llm "What is the top story on BBC right now?"
I'm unable to access real-time content, including current news stories from websites like BBC.

However, if you provide an LLM with tools to do these things, it can express intent to use them.

$ llm "What is the top story on BBC right now?" --system '
  Available tools: ["fetch_bbc_stories"].
  Tools can be called with this syntax: TOOLS::{tool name}()'
TOOLS::fetch_bbc_stories()

A client application can intercept the LLM’s response (“TOOLS::fetch_bbc_stories()”), execute the corresponding tool call, and present the model with the result.

$ llm --continue '["I have let myself & the team down - Norris on qualifying shunt", "King and Queen attend Easter Sunday church service", "Rosenberg: Is Putins Easter truce cause for scepticism or chance for peace?"]'
The top story on BBC right now is titled: **"I have let myself & the team down - Norris on qualifying shunt."**
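Stripped of vendor specifics, the client’s job is a simple loop: scan the model’s output for the tool-call syntax, run the matching local function, and feed the result back as a follow-up message. Here is a minimal sketch of that loop; the `TOOLS::` syntax and the `fetch_bbc_stories` stub are just the made-up conventions from the transcript above, not any real API:

```python
import json
import re

def fetch_bbc_stories() -> list[str]:
    # Stub standing in for a real HTTP call to the BBC front page.
    return ["I have let myself & the team down - Norris on qualifying shunt"]

# Registry mapping tool names the model may emit to local functions.
TOOLS = {"fetch_bbc_stories": fetch_bbc_stories}

def handle_response(response: str) -> str | None:
    """If the model asked for a tool, run it and return the result as text."""
    match = re.fullmatch(r"TOOLS::(\w+)\(\)", response.strip())
    if match is None:
        return None  # Ordinary answer; nothing to intercept.
    name = match.group(1)
    if name not in TOOLS:
        return f"Unknown tool: {name}"
    return json.dumps(TOOLS[name]())

# The client would feed this string back to the model as a new turn,
# e.g. `llm --continue '<result>'`.
print(handle_response("TOOLS::fetch_bbc_stories()"))
```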

Many LLM vendors support such workflows within their APIs [1] [2], but their implementations are subtly different, and often incompatible with each other. The Model Context Protocol (MCP) aims to standardise this interface across all LLMs.
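For a concrete sense of the incompatibility, compare how two popular APIs expect the same tool to be declared. These shapes reflect my recollection of the OpenAI and Anthropic request formats at the time of writing, so treat them as illustrative rather than authoritative:

```python
# OpenAI Chat Completions: the tool is nested under a "function" key,
# with its argument schema in "parameters".
openai_tool = {
    "type": "function",
    "function": {
        "name": "fetch_bbc_stories",
        "description": "Fetch the current BBC front-page headlines.",
        "parameters": {"type": "object", "properties": {}},
    },
}

# Anthropic Messages: the same tool is a flat object,
# with its schema under "input_schema" instead.
anthropic_tool = {
    "name": "fetch_bbc_stories",
    "description": "Fetch the current BBC front-page headlines.",
    "input_schema": {"type": "object", "properties": {}},
}
```

Both describe the same capability, but a client written against one vendor can’t hand its tool definitions to the other unchanged. This is the gap MCP sets out to close.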

At a high level, here’s how the pieces of an MCP setup fit together:

During an interaction between a user and an LLM, the model may decide that it needs some functionality offered by the MCP server. This intent is captured by the host application, which uses an MCP client to relay the request to the server over a transport. The server executes some code to serve the request and responds to the client over the same transport. The host application then reads the response and makes it available to the LLM.
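To make that concrete, here is what a minimal MCP server exposing the headline tool might look like. This is a sketch assuming the official `mcp` Python SDK and its FastMCP helper, with a hard-coded stub in place of a real fetch; the SDK surface may have changed since this was written:

```python
from mcp.server.fastmcp import FastMCP

# The server advertises its name and tools to any connecting MCP client.
mcp = FastMCP("bbc-news")

@mcp.tool()
def fetch_bbc_stories() -> list[str]:
    """Return the current BBC front-page headlines."""
    # Stub: a real server would scrape or call a news API here.
    return ["I have let myself & the team down - Norris on qualifying shunt"]

if __name__ == "__main__":
    # stdio is the simplest transport: the host launches this script as a
    # subprocess and exchanges JSON-RPC messages over stdin/stdout.
    mcp.run(transport="stdio")
```

A host application (Claude Desktop is one example) would be configured to launch this script; its MCP client then lists the server’s tools, surfaces them to the model, and shuttles any calls and results back and forth over the stdio transport.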