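MCP Prompt Tester

A simple MCP server that lets agents test LLM prompts against different providers.

Features

Test prompts with OpenAI and Anthropic models
Configure system prompts, user prompts, and other parameters
Get formatted responses or error messages
Easy environment setup with .env file support

Installation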


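API Key Setup

The server requires API keys for the providers you want to use. You can set them in either of two ways:

Option 1: Environment Variables

Set the following environment variables:

OPENAI_API_KEY - Your OpenAI API key
ANTHROPIC_API_KEY - Your Anthropic API key

Option 2: .env File (Recommended)

Create a file named .env in your project directory or home directory.
Add your API keys in the following format: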

OPENAI_API_KEY=your-openai-api-key-here
ANTHROPIC_API_KEY=your-anthropic-api-key-here

The server automatically detects and loads these keys.

For convenience, a sample template is included as .env.example.
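
The lookup described above (environment variables, or a .env file in the project or home directory) can be illustrated with a small standard-library sketch. This is not the server's actual loading code: the helper names `parse_env_file` and `load_api_keys` are hypothetical, and the precedence shown (real environment variables override .env files, and earlier directories override later ones) is an assumption.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Mapping

WANTED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_api_keys(
    search_dirs: list[Path],
    environ: Mapping[str, str] | None = None,
) -> dict[str, str]:
    """Collect provider keys from .env files, then let the environment override them."""
    env = os.environ if environ is None else environ
    found: dict[str, str] = {}
    for directory in search_dirs:
        env_path = directory / ".env"
        if env_path.is_file():
            for key, value in parse_env_file(env_path).items():
                if key in WANTED_KEYS:
                    # Assumed precedence: the first directory that defines a key wins.
                    found.setdefault(key, value)
    for key in WANTED_KEYS:
        if env.get(key):
            # Assumed precedence: a real environment variable beats any .env file.
            found[key] = env[key]
    return found
```

Calling `load_api_keys([Path.cwd(), Path.home()])` returns only the keys that were found, so a caller can report which providers are usable before starting a test run.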

Usage

Start the server using the stdio (default) or SSE transport:

# Using stdio transport (default)
prompt-tester

# Using SSE transport on custom port
prompt-tester --transport sse --port 8000
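
To start the server from a Python script instead of a shell, the same commands can be assembled with the standard library. This is a minimal sketch: `build_server_command` and `start_sse_server` are hypothetical helper names, and it uses only the `--transport sse` and `--port` flags shown above. Whether `--transport sse` works without an explicit `--port` is an assumption.

```python
import subprocess
from typing import List, Optional


def build_server_command(transport: str = "stdio", port: Optional[int] = None) -> List[str]:
    """Build the prompt-tester command line for the requested transport."""
    if transport == "stdio":
        # stdio is the documented default, so no flags are passed.
        return ["prompt-tester"]
    if transport == "sse":
        command = ["prompt-tester", "--transport", "sse"]
        if port is not None:
            command += ["--port", str(port)]
        return command
    raise ValueError(f"unsupported transport: {transport!r}")


def start_sse_server(port: int = 8000) -> subprocess.Popen:
    """Start the server in the background; the caller should terminate() it when done."""
    return subprocess.Popen(build_server_command("sse", port))
```

With the stdio transport an MCP client normally spawns the process itself, so a helper like `start_sse_server` is only useful for the SSE case.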

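Available Tools

The server exposes the following tools to MCP-enabled agents:

1. list_providers

Retrieves the available LLM providers and their default models.

Parameters:

None required

Example response: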

{
  "providers": {
    "openai": [
      {
        "type": "gpt-4",
        "name": "gpt-4",
        "input_cost": 0.03,
        "output_cost": 0.06,
        "description": "Most capable GPT-4 model"
      },
      // ... other models ...
    ],
    "anthropic": [
      // ... models ...
    ]
  }
}
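
An agent can use this structure to choose a model programmatically. The sketch below picks the entry with the lowest combined `input_cost` and `output_cost`. It assumes the response has already been parsed into a Python dict of the shape shown above. The cost units are not stated here, so the values are only compared, not interpreted. The second model entry is hypothetical and exists only to make the comparison meaningful.

```python
from typing import Any, Dict, List, Tuple


def cheapest_model(providers: Dict[str, List[Dict[str, Any]]]) -> Tuple[str, str]:
    """Return (provider, model name) with the lowest input_cost + output_cost."""
    candidates = [
        (model["input_cost"] + model["output_cost"], provider, model["name"])
        for provider, models in providers.items()
        for model in models
    ]
    if not candidates:
        raise ValueError("no models listed")
    _, provider, name = min(candidates)
    return provider, name


sample_providers = {
    "openai": [
        {
            "type": "gpt-4",
            "name": "gpt-4",
            "input_cost": 0.03,
            "output_cost": 0.06,
            "description": "Most capable GPT-4 model",
        },
    ],
    "anthropic": [
        # Hypothetical entry for illustration only; not taken from the server.
        {
            "type": "example-small-model",
            "name": "example-small-model",
            "input_cost": 0.001,
            "output_cost": 0.002,
            "description": "Placeholder model",
        },
    ],
}

print(cheapest_model(sample_providers))
```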

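2. test_comparison

Compares multiple prompts side by side, so you can test different providers, models, and parameters simultaneously.

Parameters:

comparisons (array): A list of 1-4 comparison configurations, each containing:
- provider (string): The LLM provider to use ("openai" or "anthropic")
- model (string): The model name
- system_prompt (string): The system prompt (instructions for the model)
- user_prompt (string): The user's message/prompt
- temperature (number, optional): Controls randomness
- max_tokens (integer, optional): Maximum number of tokens to generate
- top_p (number, optional): Controls diversity via nucleus sampling

Example usage: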

{
  "comparisons": [
    {
      "provider": "openai",
      "model": "gpt-4",
      "system_prompt": "You are a helpful assistant.",
      "user_prompt": "Explain quantum computing in simple terms.",
      "temperature": 0.7
    },
    {
      "provider": "anthropic",
      "model": "claude-3-opus-20240229",
      "system_prompt": "You are a helpful assistant.",
      "user_prompt": "Explain quantum computing in simple terms.",
      "temperature": 0.7
    }
  ]
}
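
Building this payload in code makes it easy to enforce the documented constraints (1-4 configurations, provider limited to "openai" or "anthropic") before calling the tool. The following is a sketch of client-side validation only: `make_comparison` and `make_request` are hypothetical helpers, and the server may apply further checks of its own.

```python
from typing import Any, Dict, List, Optional

SUPPORTED_PROVIDERS = {"openai", "anthropic"}


def make_comparison(
    provider: str,
    model: str,
    system_prompt: str,
    user_prompt: str,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    top_p: Optional[float] = None,
) -> Dict[str, Any]:
    """Build one comparison configuration, omitting optional fields that are unset."""
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    config: Dict[str, Any] = {
        "provider": provider,
        "model": model,
        "system_prompt": system_prompt,
        "user_prompt": user_prompt,
    }
    optional = {"temperature": temperature, "max_tokens": max_tokens, "top_p": top_p}
    config.update({key: value for key, value in optional.items() if value is not None})
    return config


def make_request(comparisons: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap 1-4 configurations in the argument object expected by test_comparison."""
    if not 1 <= len(comparisons) <= 4:
        raise ValueError("test_comparison accepts between 1 and 4 configurations")
    return {"comparisons": comparisons}
```

The result of `make_request([...])` can be passed directly as the arguments of a `test_comparison` tool call.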

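3. test_multiturn_conversation

Manages multi-turn conversations with LLM providers, so you can create and maintain stateful conversations.

Modes:

start: Starts a new conversation
continue: Continues an existing conversation
get: Retrieves the conversation history
list: Lists all active conversations
close: Closes a conversation

Parameters:

mode (string): Operation mode ("start", "continue", "get", "list", or "close")
conversation_id (string): Unique ID of the conversation (required for continue, get, and close modes)
provider (string): The LLM provider (required for start mode)
model (string): The model name (required for start mode)
system_prompt (string): The system prompt (required for start mode)
user_prompt (string): The user message (used in start and continue modes)
temperature (number, optional): Temperature parameter for the model
max_tokens (integer, optional): Maximum number of tokens to generate
top_p (number, optional): Top-p sampling parameter

Example usage (starting a conversation):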

{
  "mode": "start",
  "provider": "openai",
  "model": "gpt-4",
  "system_prompt": "You are a helpful assistant specializing in physics.",
  "user_prompt": "Can you explain what dark matter is?"
}

Example usage (continuing a conversation):

{
  "mode": "continue",
  "conversation_id": "conv_12345",
  "user_prompt": "How does that relate to dark energy?"
}
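
Because the required parameters depend on the mode, a client may want to check a request before sending it. The sketch below encodes only the requirements listed above. `user_prompt` is described as "used" rather than "required", so it is deliberately left out of the check. `REQUIRED_BY_MODE` and `missing_parameters` are hypothetical names.

```python
from typing import Any, Dict, List

# Required parameters per mode, as listed in the parameter descriptions.
REQUIRED_BY_MODE = {
    "start": ("provider", "model", "system_prompt"),
    "continue": ("conversation_id",),
    "get": ("conversation_id",),
    "list": (),
    "close": ("conversation_id",),
}


def missing_parameters(arguments: Dict[str, Any]) -> List[str]:
    """Return the required parameters that are absent or empty for the given mode."""
    mode = arguments.get("mode")
    if mode not in REQUIRED_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    return [name for name in REQUIRED_BY_MODE[mode] if not arguments.get(name)]
```

For example, a continue request without an ID, such as `missing_parameters({"mode": "continue", "user_prompt": "hi"})`, reports `conversation_id` as missing.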

Example Usage for Agents

Using the MCP client, an agent can call the tools as follows:

import asyncio
import json
from mcp.client.session import ClientSession
from mcp.client.stdio import StdioServerParameters, stdio_client

async def main():
    async with stdio_client(
        StdioServerParameters(command="prompt-tester")
    ) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            
            # 1. List available providers and models
            providers_result = await session.call_tool("list_providers", {})
            print("Available providers and models:", providers_result)
            
            # 2. Run a basic test with a single model and prompt
            comparison_result = await session.call_tool("test_comparison", {
                "comparisons": [
                    {
                        "provider": "openai",
                        "model": "gpt-4",
                        "system_prompt": "You are a helpful assistant.",
                        "user_prompt": "Explain quantum computing in simple terms.",
                        "temperature": 0.7,
                        "max_tokens": 500
                    }
                ]
            })
            print("Single model test result:", comparison_result)
            
            # 3. Compare multiple prompts/models side by side
            comparison_result = await session.call_tool("test_comparison", {
                "comparisons": [
                    {
                        "provider": "openai",
                        "model": "gpt-4",
                        "system_prompt": "You are a helpful assistant.",
                        "user_prompt": "Explain quantum computing in simple terms.",
                        "temperature": 0.7
                    },
                    {
                        "provider": "anthropic",
                        "model": "claude-3-opus-20240229",
                        "system_prompt": "You are a helpful assistant.",
                        "user_prompt": "Explain quantum computing in simple terms.",
                        "temperature": 0.7
                    }
                ]
            })
            print("Comparison result:", comparison_result)
            
            # 4. Start a multi-turn conversation
            conversation_start = await session.call_tool("test_multiturn_conversation", {
                "mode": "start",
                "provider": "openai",
                "model": "gpt-4",
                "system_prompt": "You are a helpful assistant specializing in physics.",
                "user_prompt": "Can you explain what dark matter is?"
            })
            print("Conversation started:", conversation_start)
            
            # Get the conversation ID from the first text content item of the response
            response_data = json.loads(conversation_start.content[0].text)
            conversation_id = response_data.get("conversation_id")
            
            # Continue the conversation
            if conversation_id:
                conversation_continue = await session.call_tool("test_multiturn_conversation", {
                    "mode": "continue",
                    "conversation_id": conversation_id,
                    "user_prompt": "How does that relate to dark energy?"
                })
                print("Conversation continued:", conversation_continue)
                
                # Get the conversation history
                conversation_history = await session.call_tool("test_multiturn_conversation", {
                    "mode": "get",
                    "conversation_id": conversation_id
                })
                print("Conversation history:", conversation_history)

asyncio.run(main())
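
Tool results in the MCP Python SDK carry a list of content items, so the JSON payload has to be read from a text item. Below is a minimal sketch of a helper for that step. It assumes this server returns its response as JSON in the first text item, which the example above implies but which is not documented here. `tool_result_json` is a hypothetical name, and the `SimpleNamespace` object is a stand-in for a real result so that the snippet runs without a server.

```python
import json
from types import SimpleNamespace
from typing import Any


def tool_result_json(result: Any) -> Any:
    """Parse the first text content item of a tool result as JSON."""
    if getattr(result, "isError", False):
        raise RuntimeError("tool call reported an error")
    for item in getattr(result, "content", []):
        text = getattr(item, "text", None)
        if text is not None:
            return json.loads(text)
    raise ValueError("tool result contains no text content")


# Stand-in for a real tool result, used only to demonstrate the helper.
fake_result = SimpleNamespace(
    isError=False,
    content=[SimpleNamespace(type="text", text='{"conversation_id": "conv_12345"}')],
)
print(tool_result_json(fake_result)["conversation_id"])  # prints conv_12345
```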

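MCP Agent Integration

For agents that use MCP, integration is straightforward. When your agent needs to test LLM prompts:

Discovery: The agent can use list_providers to discover available models and their capabilities.
Simple testing: For quick tests, use the test_comparison tool with a single configuration.
Comparison: When the agent needs to evaluate different prompts or models, it can use test_comparison with multiple configurations.
Stateful interactions: For multi-turn conversations, the agent can manage a conversation with the test_multiturn_conversation tool.

This allows agents to:

Test prompt variants to find the most effective phrasing
Compare different models for specific tasks
Maintain context across multi-turn conversations
Optimize parameters such as temperature and max_tokens
Track token usage and costs during development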
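
As one illustration of parameter optimization, an agent could sweep temperature values for a fixed prompt. Since test_comparison accepts at most four configurations per call, a longer sweep has to be split into batches. The helper below is a sketch with a hypothetical name (`temperature_sweep`). It only builds the argument objects and does not call the server.

```python
from typing import Any, Dict, List


def temperature_sweep(
    base: Dict[str, Any],
    temperatures: List[float],
    batch_size: int = 4,
) -> List[Dict[str, Any]]:
    """Build test_comparison argument objects, each holding at most batch_size configs."""
    if not 1 <= batch_size <= 4:
        raise ValueError("batch_size must be between 1 and 4")
    # Copy the base configuration once per temperature value.
    configs = [{**base, "temperature": value} for value in temperatures]
    return [
        {"comparisons": configs[start:start + batch_size]}
        for start in range(0, len(configs), batch_size)
    ]


base_config = {
    "provider": "openai",
    "model": "gpt-4",
    "system_prompt": "You are a helpful assistant.",
    "user_prompt": "Explain quantum computing in simple terms.",
}
requests = temperature_sweep(base_config, [0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
print([len(request["comparisons"]) for request in requests])  # prints [4, 2]
```

Each element of `requests` can then be sent as the arguments of a separate test_comparison call.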

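Configuration

You can set API keys and optional tracing configuration using environment variables.

Required API Keys

OPENAI_API_KEY - Your OpenAI API key
ANTHROPIC_API_KEY - Your Anthropic API key

Optional Langfuse Tracing

The server supports Langfuse for tracing and observability of LLM calls. The following settings are optional:

LANGFUSE_SECRET_KEY - Your Langfuse secret key
LANGFUSE_PUBLIC_KEY - Your Langfuse public key
LANGFUSE_HOST - URL of your Langfuse instance

Leave these settings empty if you do not want to use Langfuse tracing.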
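
A quick way to see whether tracing would be configured in the current environment is to inspect these variables directly. The sketch below assumes that tracing counts as configured only when both keys are non-empty and that LANGFUSE_HOST may be left blank. That rule is an assumption for illustration, not the server's documented behavior, and `langfuse_settings` is a hypothetical helper.

```python
import os
from typing import Dict, Mapping, Optional

LANGFUSE_VARS = ("LANGFUSE_SECRET_KEY", "LANGFUSE_PUBLIC_KEY", "LANGFUSE_HOST")


def langfuse_settings(environ: Optional[Mapping[str, str]] = None) -> Optional[Dict[str, str]]:
    """Return the Langfuse settings if both keys are set, otherwise None."""
    env = os.environ if environ is None else environ
    values = {name: env.get(name, "").strip() for name in LANGFUSE_VARS}
    # Assumed rule: both keys are needed; the host is optional.
    if not values["LANGFUSE_SECRET_KEY"] or not values["LANGFUSE_PUBLIC_KEY"]:
        return None
    return values
```

Calling `langfuse_settings()` with no arguments reads the real process environment. Passing a dict makes the check easy to exercise in isolation.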
