Yeah MCP is the worst documented technology I have ever encountered. I understand APIs for calling LLMs, I understand tool calling APIs. Yet I have read so much about MCP and have zero fucking clue except vague marketing speak. Or code that has zero explanation. What an amateur effort.
I've given up, I don't care about MCP. I'll use tool calling APIs as I currently do.
It’s just JSON-RPC between a client and one or more servers. The AI-agent interaction is not part of what the protocol covers, except for re-prompting requests made by tools. It has to be AI-agnostic.
The tool call workflow: (a) the client requests the list of tools from the known servers, then forwards those (possibly after translating them to API calls like the OpenAI tool-call API) to any AI agents it wants; (b) when the AI wants to call a tool, it returns a request that the client forwards to the MCP server for handling; and (c) the client returns the result back to the AI.
The spec is actually so simple that no SDK is even necessary; you could just write a script in anything with an HTTP client library.
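A rough sketch of that workflow over the Streamable HTTP transport with nothing but Python's requests library. Everything here is hedged: the endpoint URL is made up, a real server may return an Mcp-Session-Id header on initialize that you must echo back, and it may answer with an SSE stream instead of plain JSON:

    # Sketch of steps (a)-(c) against a hypothetical MCP endpoint.
    import requests

    MCP_URL = "http://localhost:8080/mcp"  # hypothetical endpoint
    HEADERS = {"Accept": "application/json, text/event-stream"}

    def rpc(method, params=None, req_id=None):
        msg = {"jsonrpc": "2.0", "method": method}
        if params is not None:
            msg["params"] = params
        if req_id is not None:
            msg["id"] = req_id  # requests have ids; notifications don't
        resp = requests.post(MCP_URL, json=msg, headers=HEADERS)
        return resp.json() if req_id is not None else None

    # Handshake, then (a): fetch the tool list to forward to the model.
    rpc("initialize", {"protocolVersion": "2025-03-26", "capabilities": {},
                       "clientInfo": {"name": "tiny-client", "version": "0.1"}},
        req_id=1)
    rpc("notifications/initialized")
    tools = rpc("tools/list", req_id=2)["result"]["tools"]

    # (b) and (c): when the model asks for a tool, call it and hand back the result.
    result = rpc("tools/call", {"name": tools[0]["name"], "arguments": {}}, req_id=3)
    print(result["result"]["content"])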
I've spent so much time clicking through pages and reading without understanding, and without ever finding the spec. Thanks so much!
try it and you'll figure it out
Will it be supplanted? Perhaps. But it's not going to die a natural death.
It's already gone so viral it's practically entrenched already, permanently. Everyone has invested too much time saying how much they love MCP. If we do find something cleaner it will still be called MCP, and it will be considered a 'variation' (new streaming type) on MCP rather than some competitor protocol replacing it. Maybe it will be called 'MCP 2.0' but it will be mostly the same and retain the name.
People often talk about web APIs, but we should also consider the integration of local tools. For me, that integration is mind-blowing.
When I tried the Playwright MCP integration [0][1] a few months ago, I really felt that after giving computers the ability to speak or communicate, we had now given them arms. I still get goosebumps thinking about it.
[0] https://youtu.be/3NWy_sxD3Vc
[1] https://github.com/microsoft/playwright-mcp
Pasting in a product owner's AC and watching it browse through our test env for a few minutes before spitting out a passing - and passable - spec+test was kind of mind-blowing.
And people are skipping over service discovery. Letting the AI know what steps/operations are available is good.
Same. Seeing apps reverse-engineered by LLMs with Ghidra [0] blew me away. It CTFed hard-coded access tokens and keys out of .so's in seconds.
Most of the material on MCP is either too specific or too in-depth.
WTF is it?! (Other than a dependency by Anthropic)
that's the missing piece in most of these descriptions.
You send off a description of the tools, the model decides if it wants to use one, then you run it with the args, send the result back into the context, and loop.
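In code, using the OpenAI chat completions API as one concrete example (any tool-calling API has the same shape), that loop looks roughly like this:

    import json
    from openai import OpenAI

    client = OpenAI()

    def get_weather(city: str) -> str:
        return f"Sunny in {city}"  # stub standing in for a real tool

    tools = [{"type": "function", "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {"type": "object",
                       "properties": {"city": {"type": "string"}},
                       "required": ["city"]}}}]

    messages = [{"role": "user", "content": "Weather in Oslo?"}]
    while True:
        msg = client.chat.completions.create(
            model="gpt-4o", messages=messages, tools=tools
        ).choices[0].message
        if not msg.tool_calls:       # plain-text answer: we're done
            print(msg.content)
            break
        messages.append(msg)         # keep the assistant turn in the context
        for call in msg.tool_calls:  # run each requested tool with its args...
            args = json.loads(call.function.arguments)
            messages.append({"role": "tool",  # ...and loop the result back
                             "tool_call_id": call.id,
                             "content": get_weather(**args)})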
Unless I'm missing something major, it's just marginally more convenient than hooking up tool calls yourself for, say, OpenAPI. The power is probably in the hype around it more than in its technical merits.
The reality is that the space is still really young and people are figuring things out as they go.
The number of people jumping in who have no real clue what they are doing is shocking. Relatedly, the number of people who can't see the value in a protocol specifically designed to work with LLM tool calling is equally shocking.

Can you write code that glues an OpenAPI server to an LLM-based tool-calling agent? 100%! Will that setup flood the context window of the LLM? Almost certainly. You need to write code to distill those OpenAPI responses down to something the LLM can work with, respecting the limited space for context. Great, now you've written a wrapper on that OpenAPI server that does exactly that. And you've written, in essence, a basic MCP server.
Now, if someone were to write an MCP Server that used an LLM (via the LLM Client 'sampling' feature) to consume an OpenAPI Server Spec and convert it into MCP Tools dynamically, THAT would be cool. Basically a dynamic self-coding MCP Server.
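For reference, 'sampling' is the part of the protocol where the server asks the client's LLM for a completion. On the wire it's just another JSON-RPC request; shape per the spec, values made up for this OpenAPI-distilling idea:

    # What a server-to-client sampling request would roughly look like
    # (JSON-RPC shown as a Python dict; the prompt text is hypothetical).
    sampling_request = {
        "jsonrpc": "2.0",
        "id": 7,
        "method": "sampling/createMessage",
        "params": {
            "messages": [{
                "role": "user",
                "content": {"type": "text",
                            "text": "Rewrite this OpenAPI operation as a "
                                    "terse MCP tool description: ..."},
            }],
            "systemPrompt": "You write terse tool descriptions.",
            "maxTokens": 200,
        },
    }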
Conversely, it allows many different LLMs to get context via many different applications using a standard protocol.
It addresses an m×n problem: instead of each of m applications writing custom glue for each of n tools, each side implements the protocol once and you get m+n integrations.
You write a wrapper ("MCP server") over your docs/apis/databases/sites/scripts that exposes certain commands ("tools"), and you can instruct models to query your wrapper with these commands ("calling/invoking tools") and expect responses in a certain format that they can then use.
That is it.
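And such a wrapper can be tiny. A sketch using the official Python SDK's FastMCP helper, where docs_lookup() is a hypothetical stand-in for whatever you're wrapping:

    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("docs-wrapper")

    def docs_lookup(query: str) -> str:
        return f"(stub) nothing matched {query!r}"  # your real backend goes here

    @mcp.tool()
    def search_docs(query: str) -> str:
        """Search our internal docs and return the best match."""
        return docs_lookup(query)

    if __name__ == "__main__":
        mcp.run()  # speaks MCP over stdio by default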
Why vibe-coded? Because instead of bidirectional websockets, the protocol uses unidirectional server-sent events (SSE), so you need to send requests to a separate endpoint and then listen to the SSE stream hoping for an answer. Authentication is also non-existent.
The protocol could easily be transported over websockets. Heck, since stdio is one transport, you could simply pipe that over websockets. Of course, that leaves a massive gap around authn and authz.
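As a sketch of that piping idea, here's a toy bridge using the third-party websockets package (handler signature per its recent versions; the server command is hypothetical, and there's deliberately no authn/authz here, which is exactly the gap):

    import asyncio
    import websockets

    SERVER_CMD = ["python", "my_mcp_server.py"]  # hypothetical stdio MCP server

    async def bridge(ws):
        # One server process per websocket connection.
        proc = await asyncio.create_subprocess_exec(
            *SERVER_CMD,
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE)

        async def ws_to_proc():
            async for message in ws:  # each ws message is one JSON-RPC line
                proc.stdin.write(message.encode() + b"\n")
                await proc.stdin.drain()

        async def proc_to_ws():
            while line := await proc.stdout.readline():
                await ws.send(line.decode())  # forward server output back

        await asyncio.gather(ws_to_proc(), proc_to_ws())

    async def main():
        async with websockets.serve(bridge, "localhost", 8765):
            await asyncio.Future()  # run until cancelled

    asyncio.run(main())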
The Streamable HTTP transport includes an authentication workflow using OAuth. Of course, that only addresses part of the issue.
There are many flaws that need improvement in MCP, but railing against the current transports by using a presumably denigratory term ("vibe-coded") isn't helpful.
Your "that is it" stops at talking about one single aspect of the protocol. On the server side you left out resources and prompts. On the client side you left out sampling, which I find to be a very interesting possibility.
I think MCP has many warts that need addressing. I also think it's a good start on a way to standardize connections between tools and agents.
but that doesn't necessarily have to be negative
Awful case of "not invented here" syndrome
I'm personally interested in whether WebTransport could be the basis for something better
I like this succinct explanation.
https://en.wikipedia.org/wiki/List_of_Tron_characters#Master...
But to save you the click & read: it's OpenAPI for LLMs
Before the whole "just use OpenAPI" crowd arrives, the point is that LLMs work better with curated context. An OpenAPI server not designed for that will quickly flood an LLM context window.
We are already off to a bad start: "context" has a meaning specific to LLMs, and everyone who works with LLMs knows what it means. The context is the text that is fed as input at runtime to the LLM, including the current message (the user prompt) as well as the previous messages and responses from the LLM.
So we don't need to read any further; we can ignore this article, and MCP by extension. YAGNI.
As you yourself say, the context is the text that is fed as input at runtime to an LLM. This text could just always come from the user as a prompt, but that's a pretty lousy interface to try to cram everything that you might want the model to know about, and it puts the onus entirely on the user to figure out what might be relevant context. The premise of the Model Context Protocol (MCP) is overall sound: how do we give the "Model" access to load arbitrary details into "Context" from many different sources?
This is a real problem worth solving and it has everything to do with the technical meaning of the word "context" in this context. I'm not sure why you dismiss it so abruptly.
Agent LLMs are able to retrieve additional context and MCP servers give them specific, targeted tools to do so.