MCP LLM Integration Server

This is a Model Context Protocol (MCP) server that exposes local LLM capabilities to MCP-compatible clients.

Features

  • llm_predict: Process text prompts through a local LLM

  • echo: Echo back text for testing purposes
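
As an illustration, llm_predict can be invoked in-process the same way the echo test in the Setup section calls call_tool. The prompt and max_tokens argument names here are assumptions based on the inference function described later, so check the tool schema in main.py:

import asyncio
from main import call_tool  # exported by main.py, as in the test snippet below

async def demo():
    # Hypothetical invocation; argument names assumed from perform_llm_inference.
    result = await call_tool('llm_predict', {'prompt': 'Say hello', 'max_tokens': 50})
    print(result[0].text)

asyncio.run(demo())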

Setup

  1. Install dependencies:

    source .venv/bin/activate
    uv pip install mcp
  2. Test the server:

    python -c " import asyncio from main import server, list_tools, call_tool async def test(): tools = await list_tools() print(f'Available tools: {[t.name for t in tools]}') result = await call_tool('echo', {'text': 'Hello!'}) print(f'Result: {result[0].text}') asyncio.run(test()) "

Integration with LLM Clients

For Claude Desktop

Add this to your Claude Desktop configuration (~/.config/claude-desktop/claude_desktop_config.json):

{ "mcpServers": { "llm-integration": { "command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python", "args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"] } } }

For Continue.dev

Add this to your Continue configuration (~/.continue/config.json):

{ "mcpServers": [ { "name": "llm-integration", "command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python", "args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"] } ] }

For Cline

Add this to your Cline MCP settings:

{ "llm-integration": { "command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python", "args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"] } }

Customizing the LLM Integration

To integrate your own local LLM, modify the perform_llm_inference function in main.py:

async def perform_llm_inference(prompt: str, max_tokens: int = 100) -> str:
    # Example: Using transformers
    # from transformers import pipeline
    # generator = pipeline('text-generation', model='your-model')
    # result = generator(prompt, max_length=max_tokens)
    # return result[0]['generated_text']

    # Example: Using llama.cpp python bindings
    # from llama_cpp import Llama
    # llm = Llama(model_path="path/to/your/model.gguf")
    # output = llm(prompt, max_tokens=max_tokens)
    # return output['choices'][0]['text']

    # Current placeholder implementation
    return f"Processed prompt: '{prompt}' (max_tokens: {max_tokens})"
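
For a more complete picture, here is a minimal sketch of what a transformers-backed replacement could look like. The model name ('gpt2'), the cached loader, and the use of asyncio.to_thread are assumptions for illustration, not part of this repository:

# Sketch: a drop-in transformers-backed perform_llm_inference.
# Assumes `pip install transformers torch`; 'gpt2' is only a placeholder model.
import asyncio
from functools import lru_cache

from transformers import pipeline

@lru_cache(maxsize=1)
def _get_generator():
    # Load the model once and reuse it across calls.
    return pipeline('text-generation', model='gpt2')

async def perform_llm_inference(prompt: str, max_tokens: int = 100) -> str:
    generator = _get_generator()
    # Run the blocking generation call off the event loop.
    result = await asyncio.to_thread(
        generator, prompt, max_new_tokens=max_tokens, num_return_sequences=1
    )
    return result[0]['generated_text']

Loading the pipeline once and running generation in a worker thread keeps the blocking model call from stalling the server's event loop.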

Testing

Run the server directly to test JSON-RPC communication:

source .venv/bin/activate
python main.py

Then send JSON-RPC requests via stdin:

{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "test-client", "version": "1.0.0"}}}