Skip to main content
Glama

MCP Server

🧩 Main modules

  1. Main MCP server - the core of the system, performing the following functions:

  • Registry of connected servers

  • Routing requests between servers

  • Monitoring server status

  • Aggregation of information about available tools

  1. Specialized servers (connect to the main one):

  • Embedding server - working with vector representations of text

  • PDF extract server - conversion and extraction from PDF to Markdown

  • Reranker server - ranking text data

  • Qdrant server - managing vector collections

  • PostgreSQL server - executing SQL queries and schema inspection

  • LLM server - generating and streaming LLM responses, list of models

  • MarkUp server - text/file markup using markup service methods

  • Transcribe server - audio loading, status and transcription result

⚙️ Available tools

Server

Methods

Embedding server

embedding_generate, embedding_batch_generate, embedding_get_models, health_check

PDF extract server

document_convert_to_markdown, document_get_supported_formats, health_check

Reranker server

rerank_documents, health_check

Qdrant server

vector_create_collection, vector_get_collection_info, vector_upsert_points, vector_search, vector_delete_points, health_check

PostgreSQL server

postgres_execute_query, postgres_get_schema, postgres_create_table, postgres_insert_data, health_check

LLM server

llm_chat_completion, llm_get_models, llm_stream_completion, health_check

MarkUp server

markup_get_methods, markup_process_text, markup_process_file, health_check

Transcribe server

transcribe_audio, transcribe_get_status, transcribe_get_result, health_check

Note:

To add a new FastMCP server, you need to import it in the main_server.py file and place it in the MCP_SERVERS array, after which the main methods of main_server.py will have access to it.

Also a necessary requirement for FastMCP servers is the presence of the health_check method to check the state.

📡 Main server methods

get_server_and_tools() # Get a list of all servers and tools
router(server_name, tool_name, params) # Routing requests
health_check_servers() # Checking the health of all servers

Setting up the environment

Create a .env file in the root of the project and put the following environment variables in it (the list corresponds to the use in the code):

# Main server
MAIN_SERVER_API_KEY=...

# Embedding server
EMBEDDING_API_KEY=...
EMBEDDING_URL=...
EMBEDDING_MODEL_NAME=...
EMBEDDING_URL_MODELS=...
EMBEDDING_HEALTH_URL=...

# PDF extract server
PDF_EXTRACTOR_URL=...
PDF_HEALTH_URL=...

# Reranker server
RERANK_URL=...
RERANK_MODEL=...
RERANK_HEALTH_URL=...

# Qdrant server
QDRANT_URL=...
QDRANT_API_KEY=...
QDRANT_HEALTH_CHECK_URL=...

# PostgreSQL server
POSTGRES_USER=...
POSTGRES_PASSWORD=...
POSTGRES_HOST=...
POSTGRES_DB=...

#LLM server
LLM_SERVICE_API_KEY=...
LLM_SERVICE_MODEL=...
LLM_SERVICE_CHAT_COMPLETIONS_URL=...
LLM_SERVICE_MODELS_URL=...
LLM_SERVICE_COMPLETIONS_URL=...
LLM_SERVICE_HEALTH_URL=...

# MarkUp server
MARKUP_API_KEY=...
MARKUP_GET_METHODS_URL=...
MARKUP_PROCESS_TEXT_URL=...
MARKUP_PROCESS_FILE_URL=...
MARKUP_HEALTH_CHECK_URL=...

# Transcribe server
TRANSCRIBE_API_KEY=...
TRANSCRIBE_UPLOAD_AUDIO=...
TRANSCRIBE_HEALTH_URL=...

Start the main server

Installation dependencies and creating a virtual environment

Before starting the server, it is recommended to create a virtual environment and install all dependencies from requirements.txt. Run the following commands in the terminal:

  1. Creating a virtual environment

python -m venv venv
  1. Activating the environment

./venv/Scripts/activate
  1. Installing dependencies

pip install -r requirements.txt

After setting up the environment, the server is started with the command

fastmcp run ./main_server.py:main_mcp_server --transport http

Running in Docker

  1. Build the image:

docker build -t mcp-main-server .
  1. Run the container, passing .env as environment variables:

docker run --rm -p 8000:8000 --env-file .env mcp-main-server

Configuring the server connection in Cursor

  1. Run the server using the command above

  2. Open the settings

  3. Add the MCP server configuration:

{
"mcpServers": {
"main-registry": {
   "url": "http://localhost:8000/mcp/"
}}}

Local MCP server: proxy_mcp_server

  • What is it: proxy MCP server that connects to the main registry (main_server) and forwards its methods, and provides a high-level pipeline for pre-preparing data for RAG.

  • Where is it: proxy_mcp_server/proxy_mcp_server.py

  • Available tools:

  • get_server_and_tools() — get a list of servers and their tools from the registry

  • router(server_name: str, tool_name: str, params: dict) — universal call router

  • preprocessing_data_for_rag(file_paths: List[str]) -> str — prepare PDF/texts and create a collection in Qdrant; returns the collection name

  • health_check_servers() — check if all services are available

Requirements:

  • main_server is running and accessible via URL (e.g. http://localhost:8000/mcp/).

  • Valid API key MAIN_SERVER_API_KEY (must match Authorization header in proxy_mcp_server.py).

  • Update url and headers.Authorization in config object inside proxy_mcp_server/proxy_mcp_server.py if necessary.

Connection in Cursor (example):

{
"mcpServers": {
"proxy-server": {
"command": "uv",
"args": [
"run",
"fastmcp",
"run",
"YOUR_PATH_TO/proxy_mcp_server/proxy_mcp_server.py:proxy_mcp_server"]
}}}

Launch from terminal:

fastmcp run ./proxy_mcp_server/proxy_mcp_server.py:proxy_mcp_server

RAG inference: interactive launch

  • What is this: console assistant for asking questions to a collection of documents in Qdrant with additional ranking and generation of LLM response.

  • Location: rag_inference/RAG workflow.py

  • Preliminary environment variables: QDRANT_URL, QDRANT_API_KEY, RERANK_URL, RERANK_MODEL, LLM_SERVICE_CHAT_COMPLETIONS_URL, LLM_SERVICE_API_KEY, LLM_SERVICE_MODEL, EMBEDDING_URL, EMBEDDING_MODEL are used (described above in the settings section).

Run (Windows PowerShell):

python ".\rag_inference\RAG workflow.py" <collection_name>

Where <collection_name> is the name of the collection in Qdrant. It is convenient to get it in advance by calling the preprocessing_data_for_rag tool from proxy_mcp_server and passing a list of files to index; the method will return the name of the created collection.

Example:

python ".\rag_inference\RAG workflow.py" collection_for_rag_1
-
security - not tested
A
license - permissive license
-
quality - not tested

Resources

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/redmadrobot-rnd/mcp-registry'

If you have feedback or need assistance with the MCP directory API, please join our Discord server