MCP LLM Integration Server

This is a Model Context Protocol (MCP) server that exposes local LLM capabilities to MCP-compatible clients.

Features

  • llm_predict: Process text prompts through a local LLM

  • echo: Echo back text for testing purposes
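
As an illustration, llm_predict can be invoked in-process the same way the echo test in the Setup section calls call_tool. The prompt and max_tokens argument names here are assumptions based on the inference function described later, so check the tool schema in main.py:

import asyncio
from main import call_tool  # exported by main.py, as in the test snippet below

async def demo():
    # Hypothetical invocation; argument names assumed from perform_llm_inference.
    result = await call_tool('llm_predict', {'prompt': 'Say hello', 'max_tokens': 50})
    print(result[0].text)

asyncio.run(demo())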

Setup

  1. Install dependencies:

    source .venv/bin/activate
    uv pip install mcp
  2. Test the server:

    python -c " import asyncio from main import server, list_tools, call_tool async def test(): tools = await list_tools() print(f'Available tools: {[t.name for t in tools]}') result = await call_tool('echo', {'text': 'Hello!'}) print(f'Result: {result[0].text}') asyncio.run(test()) "

Integration with LLM Clients

For Claude Desktop

Add this to your Claude Desktop configuration (~/.config/claude-desktop/claude_desktop_config.json):

{ "mcpServers": { "llm-integration": { "command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python", "args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"] } } }

For Continue.dev

Add this to your Continue configuration (~/.continue/config.json):

{ "mcpServers": [ { "name": "llm-integration", "command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python", "args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"] } ] }

For Cline

Add this to your Cline MCP settings:

{ "llm-integration": { "command": "/home/tandoori/Desktop/dev/mcp-server/.venv/bin/python", "args": ["/home/tandoori/Desktop/dev/mcp-server/main.py"] } }

Customizing the LLM Integration

To integrate your own local LLM, modify the perform_llm_inference function in main.py:

async def perform_llm_inference(prompt: str, max_tokens: int = 100) -> str:
    # Example: Using transformers
    # from transformers import pipeline
    # generator = pipeline('text-generation', model='your-model')
    # result = generator(prompt, max_length=max_tokens)
    # return result[0]['generated_text']

    # Example: Using llama.cpp python bindings
    # from llama_cpp import Llama
    # llm = Llama(model_path="path/to/your/model.gguf")
    # output = llm(prompt, max_tokens=max_tokens)
    # return output['choices'][0]['text']

    # Current placeholder implementation
    return f"Processed prompt: '{prompt}' (max_tokens: {max_tokens})"
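
For a more complete picture, here is a minimal sketch of what a transformers-backed replacement could look like. The model name ('gpt2'), the cached loader, and the use of asyncio.to_thread are assumptions for illustration, not part of this repository:

# Sketch: a drop-in transformers-backed perform_llm_inference.
# Assumes `pip install transformers torch`; 'gpt2' is only a placeholder model.
import asyncio
from functools import lru_cache

from transformers import pipeline

@lru_cache(maxsize=1)
def _get_generator():
    # Load the model once and reuse it across calls.
    return pipeline('text-generation', model='gpt2')

async def perform_llm_inference(prompt: str, max_tokens: int = 100) -> str:
    generator = _get_generator()
    # Run the blocking generation call off the event loop.
    result = await asyncio.to_thread(
        generator, prompt, max_new_tokens=max_tokens, num_return_sequences=1
    )
    return result[0]['generated_text']

Loading the pipeline once and running generation in a worker thread keeps the blocking model call from stalling the server's event loop.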

Testing

Run the server directly to test JSON-RPC communication:

source .venv/bin/activate
python main.py

Then send JSON-RPC requests via stdin:

{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {"protocolVersion": "2024-11-05", "capabilities": {}, "clientInfo": {"name": "test-client", "version": "1.0.0"}}}