Dinesh R Singh

Part 8: Agentic AI and Qdrant: Building semantic memory with MCP protocol

July 21, 2025

As Agentic AI systems evolve from reactive language models into structured thinkers, a new challenge emerges: how do we give these agents memory? Not just basic logs or static files, but real, searchable memory that understands and adapts to context over time.

This is where tools like Qdrant and the Model Context Protocol (MCP) come in—a modular pairing that brings semantic search and long-term knowledge storage into agent workflows. Together, they enable agents to not only recall relevant information but to reason across past experiences, making Agentic AI systems more intelligent, adaptive, and human-like in their decision-making.

Inspired by my Medium post, this article explores how MCP, the Model Context Protocol—a kind of connective tissue between LLMs and external tools or data sources—standardizes interactions between intelligent agents and vector databases like Qdrant. By enabling seamless storage and retrieval of embeddings, agents can now “remember” useful information and leverage it in future reasoning.

Let’s walk through the full architecture and code implementation of this cutting-edge combination.

LLMs + MCP + Database = Thoughtful Agentic AI

In Agentic AI, a language model doesn’t just generate — it thinks, acts, and reflects using external tools. That’s where MCP comes in.

Think of MCP as a “USB interface” for AI — it lets agents plug into tools like Qdrant, APIs, or structured databases using a consistent protocol.

Qdrant itself is a high-performance vector database — capable of powering semantic search, knowledge retrieval, and acting as long-term memory for AI agents. However, direct integration with agents can be messy and non-standardized.
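To see why a vector database can power semantic retrieval, it helps to look at the core operation: comparing embedding vectors by cosine similarity. The toy 4-dimensional vectors below are illustrative stand-ins for real model embeddings (a model like all-MiniLM-L6-v2 produces 384 dimensions), not actual outputs:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" standing in for encoded support tickets and a user query.
ticket_delay = [0.9, 0.1, 0.0, 0.2]   # "order delayed by weather"
ticket_refund = [0.1, 0.8, 0.3, 0.0]  # "customer requests refund"
query = [0.8, 0.2, 0.1, 0.1]          # "why was my order late?"

# The query vector sits closest to the delay ticket, so semantic search
# returns that memory first even though the wording differs.
scores = {
    "delay": cosine_similarity(query, ticket_delay),
    "refund": cosine_similarity(query, ticket_refund),
}
print(max(scores, key=scores.get))  # → delay
```

Qdrant performs this kind of nearest-neighbor comparison at scale, with indexing and filtering on top.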

This is solved by wrapping Qdrant inside an MCP server, giving agents a semantic API they can call like a function.

Architecture overview

[LLM Agent]
    |
[MCP Client]
    |
[MCP Protocol]
    |
[Qdrant MCP Server]
    |-- Tool: qdrant-store
    |-- Tool: qdrant-find
    |
[Qdrant Vector DB]

Use case: Support ticket memory for AI assistants

Imagine an AI assistant answering support queries.

  • It doesn't have all answers built-in.
  • But it has semantic memory from prior support logs stored in Qdrant.
  • It uses qdrant-find to semantically retrieve similar past issues.
  • It then formulates a contextual response.
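The store/find pattern the assistant relies on can be sketched with a minimal in-memory stand-in. This is purely to illustrate the interface shape of the qdrant-store and qdrant-find tools; it scores by naive word overlap rather than real embeddings, and the function names only mirror the tool names:

```python
# Illustrative in-memory stand-in for the qdrant-store / qdrant-find tools.
# A real deployment embeds text with a model; here we score by word overlap.
memory = []

def qdrant_store(information, metadata=None):
    # Mirrors the qdrant-store tool: save a text entry plus optional metadata.
    memory.append({"content": information, "metadata": metadata or {}})

def qdrant_find(query, top_k=1):
    # Mirrors the qdrant-find tool: return the best-matching stored entries.
    q_words = set(query.lower().split())
    def score(entry):
        return len(q_words & set(entry["content"].lower().split()))
    return sorted(memory, key=score, reverse=True)[:top_k]

qdrant_store("Order #1234 was delayed due to heavy rainfall in transit zone.",
             {"order_id": 1234})
qdrant_store("Customer requested a refund for order #5678.", {"order_id": 5678})

hits = qdrant_find("why was order delayed")
print(hits[0]["metadata"])  # → {'order_id': 1234}
```

The real tools behave the same way from the agent's point of view: opaque functions that accept text and return relevant memories.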

Step-by-step implementation

Step 1: Launch Qdrant MCP Server

export COLLECTION_NAME="support-tickets"
export QDRANT_LOCAL_PATH="./qdrant_local_db"
export EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2"
uvx mcp-server-qdrant --transport sse

Key parameters:

  • COLLECTION_NAME: Name of the Qdrant collection
  • QDRANT_LOCAL_PATH: Local vector DB storage path
  • EMBEDDING_MODEL: Embedding model for vectorization

Step 2: Connect the MCP Client

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    server_params = StdioServerParameters(
        command="uvx",
        args=["mcp-server-qdrant"],
        env={
            "QDRANT_LOCAL_PATH": "./qdrant_local_db",
            "COLLECTION_NAME": "support-tickets",
            "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2"
        }
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print(tools)

asyncio.run(main())
Expected Output: Lists tools like qdrant-store, qdrant-find

Step 3: Ingest a new memory

ticket_info = "Order #1234 was delayed due to heavy rainfall in transit zone."
result = await session.call_tool("qdrant-store", arguments={
    "information": ticket_info,
    "metadata": {"order_id": 1234}
})

This stores an embedded version of the text in Qdrant.

Step 4: Retrieve relevant memory

query = "Why was order 1234 delayed?"
search_response = await session.call_tool("qdrant-find", arguments={
    "query": query
})

Example output:

[
  {
    "content": "Order #1234 was delayed due to heavy rainfall in transit zone.",
    "metadata": {"order_id": 1234}
  }
]

Step 5: Use with LLM

import openai

context = "\n".join([r["content"] for r in search_response])
prompt = f"""
You are a helpful assistant. Use this context to answer:

{context}

Question: Why was order #1234 delayed?
"""
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}]
)
print(response["choices"][0]["message"]["content"])

Final answer:

"Order #1234 was delayed due to heavy rainfall in the transit zone."

Parameter reference

Tool / Source    Parameter          Description
qdrant-store     information        Raw string to embed
qdrant-store     metadata           Optional metadata for filtering
qdrant-find      query              Natural language query
env var          EMBEDDING_MODEL    Model used to create embeddings
env var          COLLECTION_NAME    Qdrant vector collection name

Pro tip: Chain MCP servers

You can deploy multiple MCP servers for different tools and plug them into agent workflows:

  • qdrant-find for memory
  • google-search for web data
  • postgres-query for structured facts

Then, orchestrate it all using Agentic AI Teams to perform high-level, multi-tool reasoning.
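One lightweight way to sketch that orchestration is a router that picks a tool per query. The keyword rules below are a deliberately naive placeholder for an LLM-driven planner; only the tool names mirror the list above, everything else is illustrative:

```python
# Naive keyword router: a stand-in for LLM-driven tool selection
# across multiple MCP servers.
ROUTES = {
    "qdrant-find": ["remember", "previous", "ticket", "order"],
    "google-search": ["latest", "news", "web"],
    "postgres-query": ["count", "table", "records"],
}

def route(query):
    # Return the first tool whose trigger keywords appear in the query.
    words = query.lower().split()
    for tool, keywords in ROUTES.items():
        if any(k in words for k in keywords):
            return tool
    return "qdrant-find"  # default to semantic memory lookup

print(route("Why was my order delayed?"))       # → qdrant-find
print(route("Fetch the latest shipping news"))  # → google-search
```

In a real agent team, the planner LLM would make this decision from tool descriptions rather than hard-coded keywords, but the dispatch structure is the same.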

Final thoughts

By pairing Qdrant with MCP, Agentic AI gains powerful, semantic memory — a critical enabler of contextual understanding and long-term knowledge retention. This pattern abstracts the complexity of vector DBs behind a unified protocol, empowering agents to think, recall, and act without manual data plumbing.

As the AI stack modularizes further, approaches like this will form the backbone of scalable, pluggable, and intelligent multi-agent ecosystems.

