Implementing memory for AI assistants and conversational AI tools remains a complex engineering challenge. Large Language Models (LLMs), such as the models behind ChatGPT, are stateless by design: they retain only the knowledge from their training data, up to a cutoff date, and do not inherently remember past interactions. Yet a seamless, context-aware user experience requires an AI chat tool to recall previous conversations, preferences, and relevant history.
To address this gap, vendors have developed their own proprietary memory solutions. OpenAI’s ChatGPT, for example, has built-in memory capabilities, and other platforms such as Anthropic’s Claude (including the Claude Desktop application) offer similar features. Each implementation is unique, tied closely to that platform’s internal architecture and APIs.
This fragmented landscape raises an important question: what if we had a standardized way to implement memory for AI assistants?
The Model Context Protocol (MCP) was originally designed to give LLM applications a standard way to connect to external tools and data sources. The same idea could inspire a standardized approach to memory in AI chat systems: rather than inventing something entirely new, perhaps we can extend or repurpose MCP to serve this function as well.
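To make the idea concrete, here is a minimal sketch of memory exposed as an MCP server, written with the official MCP Python SDK. The tool names (`save_memory`, `recall_memories`) and the in-memory list are hypothetical choices for illustration, not part of any existing standard; a production server would need persistent storage and smarter retrieval.

```python
# Sketch of a memory MCP server. The tools and storage here are
# illustrative assumptions, not an established memory standard.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("memory-server")

# Hypothetical store; a real server would persist memories to disk or a database.
memories: list[str] = []

@mcp.tool()
def save_memory(fact: str) -> str:
    """Store a fact about the user for future conversations."""
    memories.append(fact)
    return f"Stored: {fact}"

@mcp.tool()
def recall_memories(query: str) -> list[str]:
    """Return stored facts matching the query (naive substring search)."""
    return [m for m in memories if query.lower() in m.lower()]

if __name__ == "__main__":
    mcp.run()  # serves the tools over stdio by default
```

Once such a server is running, any MCP-capable client, Claude Desktop for instance, could call these tools to persist and retrieve facts across sessions; the memory logic would live behind a standard interface rather than inside each vendor’s platform.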