
MCP Server

🧩 Main modules

  1. Main MCP server - the core of the system, performing the following functions:

  • Registry of connected servers

  • Routing requests between servers

  • Monitoring server status

  • Aggregation of information about available tools

  2. Specialized servers (connect to the main one):

  • Embedding server - working with vector representations of text

  • PDF extract server - conversion and extraction from PDF to Markdown

  • Reranker server - ranking text data

  • Qdrant server - managing vector collections

  • PostgreSQL server - executing SQL queries and schema inspection

  • LLM server - generating and streaming LLM responses, listing available models

  • MarkUp server - marking up text and files using the markup service methods

  • Transcribe server - uploading audio, checking status, and retrieving the transcription result

⚙️ Available tools

| Server | Methods |
| --- | --- |
| Embedding server | embedding_generate, embedding_batch_generate, embedding_get_models, health_check |
| PDF extract server | document_convert_to_markdown, document_get_supported_formats, health_check |
| Reranker server | rerank_documents, health_check |
| Qdrant server | vector_create_collection, vector_get_collection_info, vector_upsert_points, vector_search, vector_delete_points, health_check |
| PostgreSQL server | postgres_execute_query, postgres_get_schema, postgres_create_table, postgres_insert_data, health_check |
| LLM server | llm_chat_completion, llm_get_models, llm_stream_completion, health_check |
| MarkUp server | markup_get_methods, markup_process_text, markup_process_file, health_check |
| Transcribe server | transcribe_audio, transcribe_get_status, transcribe_get_result, health_check |

Note:

To add a new FastMCP server, import it in main_server.py and add it to the MCP_SERVERS array. The main methods of main_server.py will then have access to it.

Every FastMCP server must also expose a health_check method so that its state can be checked.
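The health_check requirement can be verified mechanically before a server is registered. The sketch below is only an illustration and does not use FastMCP itself: `ServerStub` is a hypothetical stand-in for a FastMCP server object, `validate_servers` is an invented helper, and the `MCP_SERVERS` list here is sample data rather than the real array from main_server.py.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ServerStub:
    """Hypothetical stand-in for a FastMCP server: a name plus its tools."""
    name: str
    tools: Dict[str, Callable[..., object]] = field(default_factory=dict)


def validate_servers(servers: List[ServerStub]) -> List[str]:
    """Return the names of servers that do not expose a health_check tool."""
    return [s.name for s in servers if "health_check" not in s.tools]


# Sample data only; the real MCP_SERVERS array lives in main_server.py.
MCP_SERVERS = [
    ServerStub("embedding", {
        "embedding_generate": lambda text: [0.0],
        "health_check": lambda: "ok",
    }),
    ServerStub("my_new_server", {"my_tool": lambda: None}),  # no health_check
]

invalid = validate_servers(MCP_SERVERS)
print(invalid)  # prints ['my_new_server']
```

A check like this could run at startup so that a server without health_check is rejected early instead of failing during monitoring.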

📡 Main server methods

get_server_and_tools()                  # Get a list of all servers and tools
router(server_name, tool_name, params)  # Routing requests
health_check_servers()                  # Checking the health of all servers
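To make the roles of these three methods concrete, here is a minimal in-memory sketch. It is not the code from main_server.py: the `REGISTRY` dictionary, the server keys, and the lambda tools are hypothetical placeholders, and real tools would be calls to FastMCP servers rather than local functions.

```python
from typing import Any, Callable, Dict, List

# Hypothetical in-memory registry: server name -> {tool name -> callable}.
Registry = Dict[str, Dict[str, Callable[..., Any]]]

REGISTRY: Registry = {
    "reranker": {
        "rerank_documents": lambda query, documents: sorted(documents),
        "health_check": lambda: {"status": "ok"},
    },
    "llm": {
        "llm_get_models": lambda: ["model-a"],
        "health_check": lambda: {"status": "ok"},
    },
}


def get_server_and_tools(registry: Registry = REGISTRY) -> Dict[str, List[str]]:
    """List every registered server together with its tool names."""
    return {server: sorted(tools) for server, tools in registry.items()}


def router(server_name: str, tool_name: str, params: Dict[str, Any],
           registry: Registry = REGISTRY) -> Any:
    """Dispatch a call to one tool of one server, passing params as keyword arguments."""
    try:
        tool = registry[server_name][tool_name]
    except KeyError as exc:
        raise ValueError(f"unknown server or tool: {server_name}/{tool_name}") from exc
    return tool(**params)


def health_check_servers(registry: Registry = REGISTRY) -> Dict[str, Any]:
    """Call health_check on every server; record the error text if a call fails."""
    report: Dict[str, Any] = {}
    for server, tools in registry.items():
        try:
            report[server] = tools["health_check"]()
        except Exception as exc:  # one failing server must not break the report
            report[server] = {"status": "error", "detail": str(exc)}
    return report
```

For example, `router("reranker", "rerank_documents", {"query": "q", "documents": ["b", "a"]})` looks up the tool by server and name and calls it with the given parameters.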

Setting up the environment

Create a .env file in the project root and define the following environment variables in it (the list matches the variables used in the code):

# Main server
MAIN_SERVER_API_KEY=...

# Embedding server
EMBEDDING_API_KEY=...
EMBEDDING_URL=...
EMBEDDING_MODEL_NAME=...
EMBEDDING_URL_MODELS=...
EMBEDDING_HEALTH_URL=...

# PDF extract server
PDF_EXTRACTOR_URL=...
PDF_HEALTH_URL=...

# Reranker server
RERANK_URL=...
RERANK_MODEL=...
RERANK_HEALTH_URL=...

# Qdrant server
QDRANT_URL=...
QDRANT_API_KEY=...
QDRANT_HEALTH_CHECK_URL=...

# PostgreSQL server
POSTGRES_USER=...
POSTGRES_PASSWORD=...
POSTGRES_HOST=...
POSTGRES_DB=...

# LLM server
LLM_SERVICE_API_KEY=...
LLM_SERVICE_MODEL=...
LLM_SERVICE_CHAT_COMPLETIONS_URL=...
LLM_SERVICE_MODELS_URL=...
LLM_SERVICE_COMPLETIONS_URL=...
LLM_SERVICE_HEALTH_URL=...

# MarkUp server
MARKUP_API_KEY=...
MARKUP_GET_METHODS_URL=...
MARKUP_PROCESS_TEXT_URL=...
MARKUP_PROCESS_FILE_URL=...
MARKUP_HEALTH_CHECK_URL=...

# Transcribe server
TRANSCRIBE_API_KEY=...
TRANSCRIBE_UPLOAD_AUDIO=...
TRANSCRIBE_HEALTH_URL=...
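With this many variables, a quick check for missing values can save debugging time. The sketch below is a hypothetical helper, not part of the project: it covers only a subset of the variables above, and it assumes they have already been loaded into the process environment (how the project loads .env is not shown here).

```python
import os
from typing import Dict, Iterable, List, Mapping

# A subset of the variables listed above, grouped by server; extend as needed.
REQUIRED_VARS: Dict[str, List[str]] = {
    "main": ["MAIN_SERVER_API_KEY"],
    "qdrant": ["QDRANT_URL", "QDRANT_API_KEY", "QDRANT_HEALTH_CHECK_URL"],
    "postgres": ["POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_HOST", "POSTGRES_DB"],
}


def missing_vars(env: Mapping[str, str], names: Iterable[str]) -> List[str]:
    """Return the names that are unset or empty in env, in the given order."""
    return [name for name in names if not env.get(name)]


def check_environment(env: Mapping[str, str] = os.environ) -> Dict[str, List[str]]:
    """Map each server group to its missing variables; complete groups are omitted."""
    report: Dict[str, List[str]] = {}
    for group, names in REQUIRED_VARS.items():
        missing = missing_vars(env, names)
        if missing:
            report[group] = missing
    return report


if __name__ == "__main__":
    for group, names in check_environment().items():
        print(f"{group}: missing {', '.join(names)}")
```

An empty result from `check_environment()` means every variable in the checked groups is set and non-empty.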

Start the main server

Installing dependencies and creating a virtual environment

Before starting the server, it is recommended to create a virtual environment and install all dependencies from requirements.txt. Run the following commands in the terminal:

  1. Creating a virtual environment

python -m venv venv

  2. Activating the environment

./venv/Scripts/activate

  3. Installing dependencies

pip install -r requirements.txt

After setting up the environment, start the server with the command:

fastmcp run ./main_server.py:main_mcp_server --transport http

Running in Docker

  1. Build the image:

docker build -t mcp-main-server .
  2. Run the container, loading the environment variables from .env:

docker run --rm -p 8000:8000 --env-file .env mcp-main-server

Configuring the server connection in Cursor

  1. Run the server using the command above

  2. Open the settings

  3. Add the MCP server configuration:

{
  "mcpServers": {
    "main-registry": {
      "url": "http://localhost:8000/mcp/"
    }
  }
}

Local MCP server: proxy_mcp_server

  • What it is: a proxy MCP server that connects to the main registry (main_server), forwards its methods, and adds a high-level pipeline that prepares data for RAG.

  • Location: proxy_mcp_server/proxy_mcp_server.py

  • Available tools:

    • get_server_and_tools() - get a list of servers and their tools from the registry

    • router(server_name: str, tool_name: str, params: dict) - universal call router

    • preprocessing_data_for_rag(file_paths: List[str]) -> str - prepare PDFs/texts and create a collection in Qdrant; returns the collection name

    • health_check_servers() - check whether all services are available
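The data preparation pipeline can be pictured as a chain of router calls. The sketch below is an assumption about its general shape, not the code in proxy_mcp_server.py: the server names ("pdf_extract", "embedding", "qdrant"), the parameter keys, the fixed-size chunking, and the point layout are all hypothetical. Only the tool names come from the table of available tools.

```python
from typing import Any, Callable, Dict, List

# Signature of the router tool: (server_name, tool_name, params).
Router = Callable[[str, str, Dict[str, Any]], Any]


def chunk_text(text: str, size: int = 500) -> List[str]:
    """Split text into fixed-size character chunks (a deliberately naive strategy)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def preprocess_for_rag(file_paths: List[str], router: Router,
                       collection_name: str = "collection_for_rag_1") -> str:
    """Convert files to Markdown, embed the chunks, and store them in a new collection."""
    chunks: List[str] = []
    for path in file_paths:
        # Parameter names such as "file_path" are assumptions, not the real tool schema.
        markdown = router("pdf_extract", "document_convert_to_markdown", {"file_path": path})
        chunks.extend(chunk_text(markdown))
    if not chunks:
        raise ValueError("no text was extracted from the given files")

    vectors = router("embedding", "embedding_batch_generate", {"texts": chunks})
    router("qdrant", "vector_create_collection",
           {"collection_name": collection_name, "vector_size": len(vectors[0])})
    points = [{"id": i, "vector": v, "payload": {"text": c}}
              for i, (v, c) in enumerate(zip(vectors, chunks))]
    router("qdrant", "vector_upsert_points",
           {"collection_name": collection_name, "points": points})
    return collection_name


def fake_router(server_name: str, tool_name: str, params: Dict[str, Any]) -> Any:
    """Stand-in for the real router so the sketch runs without any services."""
    if tool_name == "document_convert_to_markdown":
        return "# Title\n" + "x" * 600
    if tool_name == "embedding_batch_generate":
        return [[float(len(t)), 0.0] for t in params["texts"]]
    return {"status": "ok"}


if __name__ == "__main__":
    print(preprocess_for_rag(["example.pdf"], fake_router))
```

Passing the router in as an argument keeps the pipeline testable: the same function works with the real router tool or with a stub such as `fake_router`.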

Requirements:

  • main_server is running and accessible via URL (e.g. http://localhost:8000/mcp/).

  • A valid MAIN_SERVER_API_KEY (it must match the Authorization header set in proxy_mcp_server.py).

  • If necessary, update url and headers.Authorization in the config object inside proxy_mcp_server/proxy_mcp_server.py.
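Instead of hard-coding the key, the config object could be built from the environment. The sketch below is a guess at what such a config might look like: the source confirms only the url and headers.Authorization fields, so the surrounding "mcpServers" layout, the "main-registry" key, and the "Bearer" scheme are assumptions to be adjusted to what main_server actually expects.

```python
import os
from typing import Any, Dict, Mapping


def build_proxy_config(env: Mapping[str, str] = os.environ,
                       url: str = "http://localhost:8000/mcp/") -> Dict[str, Any]:
    """Build a connection config for the main registry from the environment."""
    api_key = env.get("MAIN_SERVER_API_KEY")
    if not api_key:
        raise RuntimeError("MAIN_SERVER_API_KEY is not set")
    return {
        "mcpServers": {
            "main-registry": {
                "url": url,
                # The "Bearer" scheme is an assumption; match main_server's check.
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }
```

Failing fast when the key is missing gives a clearer error than an authorization failure on the first request.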

Connection in Cursor (example):

{
  "mcpServers": {
    "proxy-server": {
      "command": "uv",
      "args": [
        "run",
        "fastmcp",
        "run",
        "YOUR_PATH_TO/proxy_mcp_server/proxy_mcp_server.py:proxy_mcp_server"
      ]
    }
  }
}

Launch from terminal:

fastmcp run ./proxy_mcp_server/proxy_mcp_server.py:proxy_mcp_server

RAG inference: interactive launch

  • What it is: a console assistant that answers questions over a collection of documents in Qdrant, with reranking of the search results and generation of an LLM response.

  • Location: rag_inference/RAG workflow.py

  • Required environment variables: QDRANT_URL, QDRANT_API_KEY, RERANK_URL, RERANK_MODEL, LLM_SERVICE_CHAT_COMPLETIONS_URL, LLM_SERVICE_API_KEY, LLM_SERVICE_MODEL, EMBEDDING_URL, EMBEDDING_MODEL (described above in the environment setup section).

Run (Windows PowerShell):

python ".\rag_inference\RAG workflow.py" <collection_name>

Here <collection_name> is the name of the collection in Qdrant. A convenient way to obtain it is to call the preprocessing_data_for_rag tool from proxy_mcp_server beforehand with a list of files to index; the tool returns the name of the created collection.

Example:

python ".\rag_inference\RAG workflow.py" collection_for_rag_1
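The inference flow described in this section (vector search, reranking, LLM generation) can be sketched as one function with injected dependencies. This is an illustrative outline, not the contents of RAG workflow.py: the function names, the prompt wording, and the `top_k` and `keep` parameters are all hypothetical, and each callable stands in for a call to the corresponding service.

```python
from typing import Callable, List, Sequence

Embed = Callable[[str], List[float]]                  # question -> query vector
Search = Callable[[str, List[float], int], List[str]]  # (collection, vector, limit) -> passages
Rerank = Callable[[str, Sequence[str]], List[str]]    # (question, passages) -> best first
Generate = Callable[[str], str]                       # prompt -> LLM answer


def build_prompt(question: str, passages: Sequence[str]) -> str:
    """Number the passages and place them before the question."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {question}"


def answer_question(question: str, collection: str, embed: Embed, search: Search,
                    rerank: Rerank, generate: Generate,
                    top_k: int = 10, keep: int = 3) -> str:
    """Retrieve candidates, keep the best reranked passages, and ask the LLM."""
    query_vector = embed(question)
    candidates = search(collection, query_vector, top_k)
    if not candidates:
        return "No relevant documents found."
    best = rerank(question, candidates)[:keep]
    return generate(build_prompt(question, best))


if __name__ == "__main__":
    # Stubs so the sketch runs without Qdrant, a reranker, or an LLM service.
    reply = answer_question(
        "What is MCP?",
        "collection_for_rag_1",
        embed=lambda q: [0.0, 1.0],
        search=lambda coll, vec, k: ["passage b", "passage a"],
        rerank=lambda q, docs: sorted(docs),
        generate=lambda prompt: f"(stub answer for a prompt of {len(prompt)} characters)",
    )
    print(reply)
```

Retrieving more candidates than are finally kept (`top_k` larger than `keep`) gives the reranker room to promote passages that the vector search ranked lower.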
