# Sequential Questioning MCP Server Architecture

## System Overview

The Sequential Questioning MCP Server is designed to provide contextually relevant questions based on conversation history. The system uses a combination of LLMs (Large Language Models) and vector embeddings to generate intelligent follow-up questions that maintain context across a conversation.

## Core Components

### 1. API Layer

- **FastAPI Application**: The main entry point for the application, handling HTTP requests and responses.
- **MCP Endpoints**: Custom endpoints that follow the MCP (Model Context Protocol) pattern for structured question generation.
- **Internal Endpoints**: Endpoints for monitoring and administration.

### 2. Service Layer

- **Question Generation Service**: Core business logic for generating initial and follow-up questions.
- **Vector Database Service**: Integration with Qdrant for storing and retrieving vector embeddings.

### 3. Repository Layer

- **User Session Repository**: Manages user session data and persistence.
- **Conversation Repository**: Handles conversation storage and retrieval.
- **Message Repository**: Manages individual message data.

### 4. Model Layer

- **Domain Models**: SQLAlchemy models representing core business entities.
- **Data Transfer Objects**: Pydantic schemas for request/response validation.

### 5. Core Infrastructure

- **Database**: SQLite for development, with SQLAlchemy as the ORM.
- **Vector Database**: Qdrant for storing and searching vector embeddings.
- **Monitoring**: Custom metrics collection and reporting.

## Data Flow

1. **Question Generation Flow**:
   - Client sends a request to the MCP endpoint
   - Request is validated and processed
   - Question generation service is invoked
   - For follow-up questions, previous context is retrieved and enriched via the vector database
   - LLM generates a question based on the context
   - Response is formatted and returned to the client

2. **Data Persistence Flow**:
   - User sessions, conversations, and messages are stored in SQLite
   - Vector embeddings are stored in Qdrant
   - Repositories handle the data access logic

## Component Interaction Diagram

```
Client → FastAPI → MCP Endpoint → Question Generation Service → LLM
                                           ↑   ↓
           SQLite ← Repositories ← Services → Vector DB Service → Qdrant
```

## Monitoring and Observability

The system includes a comprehensive monitoring solution:

- **Request tracking**: Count of requests per endpoint
- **Error tracking**: Error rates and types
- **Performance metrics**: Response times and percentiles
- **API metrics endpoint**: Exposes collected metrics via REST API

## Deployment Architecture

The application is containerized using Docker:

- **Application Container**: Runs the FastAPI application
- **Vector Database Container**: Runs Qdrant
- **Docker Compose**: Orchestrates the containers and networking

## Security Considerations

- **Input Validation**: All input is validated using Pydantic schemas
- **Error Handling**: Comprehensive error handling prevents information leakage
- **Dependency Injection**: Uses FastAPI's dependency injection for clean, testable code

## Extensibility Points

The architecture is designed for extensibility:

- **Custom Question Generators**: Additional question generation strategies can be implemented (a sketch follows below)
- **Alternative Vector Databases**: The vector database service can be adapted for different providers
- **Pluggable LLMs**: Different language models can be used for question generation
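As a concrete illustration of the first extensibility point, the sketch below shows one way pluggable question-generation strategies could be expressed. This is a minimal sketch, not the project's actual interface: the `QuestionGenerator` protocol, the class names, and the `llm_client.complete` call are all hypothetical.

```python
from typing import Protocol


class QuestionGenerator(Protocol):
    """Hypothetical strategy interface for question generation."""

    async def generate(self, context: list[str]) -> str:
        """Return the next question given prior conversation turns."""
        ...


class LLMQuestionGenerator:
    """Generates follow-up questions by prompting an injected LLM client."""

    def __init__(self, llm_client) -> None:
        self._llm = llm_client

    async def generate(self, context: list[str]) -> str:
        prompt = "Given the conversation so far, ask one follow-up question:\n"
        prompt += "\n".join(context)
        return await self._llm.complete(prompt)  # hypothetical client call


class TemplateQuestionGenerator:
    """Fallback generator that needs no LLM, useful for tests."""

    async def generate(self, context: list[str]) -> str:
        topic = context[-1] if context else "your goal"
        return f"Can you tell me more about {topic}?"
```

Because the service layer would depend only on the protocol, a new strategy can be swapped in without touching the endpoint or repository code.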
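The same kind of sketch can make the earlier sections more tangible. Below is one plausible shape for the MCP question-generation endpoint, combining the Pydantic request validation and FastAPI dependency injection described under Security Considerations. The route path, schema fields, and service stub are assumptions, not the project's real code.

```python
from fastapi import Depends, FastAPI
from pydantic import BaseModel, Field

app = FastAPI()


class QuestionRequest(BaseModel):
    """Validated request body; bad input is rejected before the handler runs."""

    session_id: str
    message: str = Field(min_length=1, max_length=4000)


class QuestionResponse(BaseModel):
    question: str


class QuestionGenerationService:
    """Stub standing in for the real service layer."""

    async def next_question(self, session_id: str, message: str) -> str:
        return f"Could you say more about: {message}?"


def get_question_service() -> QuestionGenerationService:
    # The real app would build this with its repositories and the vector
    # database service; a bare instance keeps the sketch runnable.
    return QuestionGenerationService()


@app.post("/mcp/question", response_model=QuestionResponse)
async def generate_question(
    payload: QuestionRequest,
    service: QuestionGenerationService = Depends(get_question_service),
) -> QuestionResponse:
    question = await service.next_question(payload.session_id, payload.message)
    return QuestionResponse(question=question)
```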
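The Vector Database Service could wrap the `qdrant-client` library along these lines. The collection name, embedding dimension, and method names are assumptions for illustration; only the Qdrant client calls themselves (`recreate_collection`, `upsert`, `search`) are real library API.

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

COLLECTION = "conversation_context"  # assumed collection name


class VectorDatabaseService:
    """Thin wrapper around Qdrant for storing and searching embeddings."""

    def __init__(self, url: str = "http://localhost:6333", dim: int = 1536) -> None:
        # dim=1536 assumes an OpenAI-style embedding model; adjust as needed.
        self.client = QdrantClient(url=url)
        self.client.recreate_collection(
            collection_name=COLLECTION,
            vectors_config=VectorParams(size=dim, distance=Distance.COSINE),
        )

    def store(self, point_id: int, embedding: list[float], text: str) -> None:
        self.client.upsert(
            collection_name=COLLECTION,
            points=[PointStruct(id=point_id, vector=embedding, payload={"text": text})],
        )

    def most_similar(self, embedding: list[float], limit: int = 5) -> list[str]:
        hits = self.client.search(
            collection_name=COLLECTION,
            query_vector=embedding,
            limit=limit,
        )
        return [hit.payload["text"] for hit in hits]
```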
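Finally, the repository and model layers could pair SQLAlchemy 2.0-style domain models with thin repository classes, so services never touch the ORM directly. Table and column names here are illustrative, not taken from the project.

```python
from sqlalchemy import String, create_engine, select
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column


class Base(DeclarativeBase):
    pass


class Message(Base):
    """Domain model for a single conversation message (illustrative)."""

    __tablename__ = "messages"

    id: Mapped[int] = mapped_column(primary_key=True)
    conversation_id: Mapped[str] = mapped_column(String(36), index=True)
    content: Mapped[str]


class MessageRepository:
    """Encapsulates data access behind a narrow interface."""

    def __init__(self, session: Session) -> None:
        self._session = session

    def add(self, conversation_id: str, content: str) -> Message:
        message = Message(conversation_id=conversation_id, content=content)
        self._session.add(message)
        self._session.commit()
        return message

    def for_conversation(self, conversation_id: str) -> list[Message]:
        stmt = select(Message).where(Message.conversation_id == conversation_id)
        return list(self._session.scalars(stmt))


# SQLite for development, as noted under Core Infrastructure
engine = create_engine("sqlite:///dev.db")
Base.metadata.create_all(engine)
```

A `Session` would typically be provided per request through the same FastAPI dependency-injection mechanism shown in the endpoint sketch above.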
