# 2025-autumn-mcp

## Project Background

This project is about learning how to turn basic data science skills into real, usable services. Rather than running code in isolation, you'll package text analysis tools into a Model Context Protocol (MCP) server that can be consumed by any MCP-aware client, including modern AI assistants. Along the way you'll learn how to design structured inputs and outputs (schemas), containerize and run services with Docker, and expose your work in a way that others (whether researchers, policymakers, or fellow students) could immediately integrate into their own workflows. The goal is not to build the most advanced NLP system, but to see how small, well-defined analytics can be made reusable, composable, and shareable across disciplines.

## Goals

This sprint focuses on learning the Model Context Protocol by building a text-analysis MCP server.

**What You'll Build**:
- An MCP server with baseline text-analysis tools (group work)
- Your own custom MCP tool relevant to your field (individual work on a feature branch)

Using Python, Pydantic schemas, and FastMCP, you'll gain experience with natural language processing techniques (TF-IDF, sentiment analysis, readability metrics), structured data exchange, and service-oriented design.

**Deliverables**:
- Working baseline MCP server with `corpus_answer` and `text_profile` tools
- Your custom tool on a feature branch with tests
- Demo showing your tool in action
- Documentation explaining your tool's domain application

## Project Structure

```
mcp/
├── data/
│   └── corpus/                  # Your text corpus (.txt files)
│       ├── climate_policy.txt
│       ├── urban_planning.txt
│       ├── ai_ethics.txt
│       └── public_health.txt
├── notebooks/
│   └── MCP_Introduction.ipynb   # Interactive tutorial
├── src/
│   ├── utils/                   # Utility code from week 1
│   └── mcp_server/              # MCP server implementation
│       ├── __init__.py
│       ├── server.py            # Main FastMCP server
│       ├── schemas.py           # Pydantic data models
│       ├── config/
│       │   ├── __init__.py
│       │   └── settings.py      # Configuration settings
│       └── tools/
│           ├── __init__.py
│           ├── corpus_answer.py # Document search tool
│           └── text_profile.py  # Text analysis tool
├── tests/
│   └── mcp_server/              # Tests for MCP tools
│       ├── test_corpus_answer.py
│       └── test_text_profile.py
├── pyproject.toml               # Python dependencies
└── README.md
```

## Introduction & Setup

**Getting Started:**
- Review the demonstration notebook: `notebooks/MCP_Introduction.ipynb`
- Read about MCP:
  - [Intro](https://modelcontextprotocol.io/docs/getting-started/intro)
  - [Concepts](https://modelcontextprotocol.io/docs/learn/architecture#concepts-of-mcp)
  - [Primitives](https://modelcontextprotocol.io/docs/learn/architecture#primitives)
- Skim this page on [Pydantic](https://realpython.com/python-pydantic/)
- Complete the Quick Start below to set up your environment

## Phase 1: Group Work - Baseline MCP Server

### Part 1: Schemas & Text Analysis Foundations

**Objectives (Complete together as a group):**
- Understand Pydantic schemas and data validation
- Learn TF-IDF basics for document search
- Set up a shared corpus
- Understand MCP tool design patterns

**Tasks:**

1. **Complete the notebook** `notebooks/MCP_Introduction.ipynb`
   - Build your first MCP tool
   - Work with TF-IDF for document search
   - Define Pydantic schemas
   - Register tools with FastMCP (a minimal end-to-end sketch appears after this section)

2. **Create a shared corpus**
   - Add 3-5 `.txt` files to `data/corpus/`
   - Sample documents provided: climate policy, urban planning, AI ethics, public health
   - Choose documents that demonstrate the tools' capabilities

3. **Review the provided code structure** in `src/mcp_server/`
   - `schemas.py` - Pydantic models for tool inputs/outputs
   - `tools/corpus_answer.py` - Document search skeleton
   - `tools/text_profile.py` - Text analytics skeleton
   - `server.py` - Main MCP server application

**Deliverable:** Completed notebook and shared corpus
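If Pydantic and FastMCP are new to you, the following minimal sketch shows the whole pattern end to end: a validated input schema, a typed output schema, and a tool registered on a FastMCP server. The server name and the `echo` tool are illustrative placeholders, not part of the project skeleton.

```python
from fastmcp import FastMCP
from pydantic import BaseModel, Field

mcp = FastMCP("demo-server")  # hypothetical server name


class EchoInput(BaseModel):
    """Input schema: validated before the tool body runs."""

    text: str = Field(min_length=1, description="Text to echo")
    repeat: int = Field(default=1, ge=1, le=5, description="Repetitions")


class EchoOutput(BaseModel):
    """Output schema: a typed, structured result."""

    result: str
    length: int


@mcp.tool
def echo(input: EchoInput) -> EchoOutput:
    """Echo the input text, repeated `repeat` times."""
    combined = " ".join([input.text] * input.repeat)
    return EchoOutput(result=combined, length=len(combined))


if __name__ == "__main__":
    mcp.run()  # defaults to STDIO transport
```

Because the input is a Pydantic model, a call with `repeat=10` is rejected with a `ValidationError` before the tool body ever runs. That validation at the boundary is the main reason MCP tools declare schemas.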
### Part 2: Baseline Tool Implementation

**Objectives (Implement together as a group):**
- Implement the `corpus_answer` tool with TF-IDF search
- Implement the `text_profile` tool with text analytics
- Test the baseline implementation

**Tasks:**

1. **Implement `corpus_answer` tool** (`src/mcp_server/tools/corpus_answer.py`)

   Complete the TODOs in:
   - `_load_corpus()` - Load `.txt` files from the corpus directory
   - `_ensure_index()` - Build a TF-IDF index from the documents
   - `_synthesize_answer()` - Create concise answer snippets
   - `corpus_answer()` - Main search and ranking logic

   Key steps:
   - Load all `.txt` files from `data/corpus/`
   - Build a TF-IDF vectorizer with appropriate parameters
   - Transform the query and compute cosine similarity
   - Return the top 3-5 results with snippets and scores

   (See the first sketch after this list.)

2. **Implement `text_profile` tool** (`src/mcp_server/tools/text_profile.py`)

   Complete the TODOs in:
   - `_read_doc()` - Read a document by ID from the corpus
   - `_tokenize()` - Extract words from text
   - `_flesch_reading_ease()` - Calculate the readability score
   - `_top_terms()` - Extract keywords using TF-IDF
   - `text_profile()` - Compute all text features

   Features to calculate:
   - Character and word counts
   - Type-token ratio (lexical diversity)
   - Flesch Reading Ease score
   - VADER sentiment analysis
   - Top n-grams and keywords

   (See the second sketch after this list.)

3. **Test your tools**

   ```bash
   # Run tests
   make test

   # Test specific tool
   uv run pytest tests/mcp_server/test_corpus_answer.py -v
   ```

4. **Debug and refine**
   - Use logging to debug
   - Test with different queries and documents
   - Ensure all tests pass

**Deliverable:** Working baseline server with `corpus_answer` and `text_profile` tools
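To make the TF-IDF steps in task 1 concrete, here is a minimal, self-contained sketch of the search-and-rank logic, assuming scikit-learn is available. The function name `search_corpus` and its return shape are illustrative; the real skeleton splits this work across `_load_corpus()`, `_ensure_index()`, and `corpus_answer()`.

```python
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def search_corpus(query: str, corpus_dir: str = "data/corpus", top_k: int = 3) -> list[dict]:
    """Rank corpus documents against a query with TF-IDF + cosine similarity."""
    paths = sorted(Path(corpus_dir).glob("*.txt"))
    docs = [p.read_text(encoding="utf-8") for p in paths]

    # Fit the vectorizer on the corpus, then project the query into the same space.
    vectorizer = TfidfVectorizer(stop_words="english", ngram_range=(1, 2))
    doc_matrix = vectorizer.fit_transform(docs)
    query_vec = vectorizer.transform([query])

    # Cosine similarity of the query against every document, highest first.
    scores = cosine_similarity(query_vec, doc_matrix).ravel()
    ranked = scores.argsort()[::-1][:top_k]
    return [
        {"doc_id": paths[i].stem, "score": float(scores[i]), "snippet": docs[i][:200]}
        for i in ranked
    ]
```

Note that the query must be transformed with the *same* fitted vectorizer as the corpus; fitting a second vectorizer on the query alone would produce an incompatible vocabulary.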
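A second sketch covers the `text_profile` metrics from task 2, assuming the `vaderSentiment` package and non-empty input text with at least one sentence. The `profile_text` helper and its output dict are illustrative; the skeleton computes these in separate helpers such as `_flesch_reading_ease()`.

```python
import re

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer


def count_syllables(word: str) -> int:
    """Crude syllable estimate: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def profile_text(text: str) -> dict:
    """Compute basic text-profile features (assumes non-empty text)."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Lexical diversity: unique words over total words.
    ttr = len({w.lower() for w in words}) / len(words)

    # Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)
    syllables = sum(count_syllables(w) for w in words)
    flesch = (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )

    # VADER's compound score runs from -1 (most negative) to +1 (most positive).
    sentiment = SentimentIntensityAnalyzer().polarity_scores(text)

    return {
        "char_count": len(text),
        "word_count": len(words),
        "type_token_ratio": round(ttr, 3),
        "flesch_reading_ease": round(flesch, 1),
        "sentiment_compound": sentiment["compound"],
    }
```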
---

## Phase 2: Individual Work - Custom Tool Development

### Creating Your Own MCP Tool

Now that you understand MCP fundamentals, **each student** will create their own custom tool on a feature branch.

**Objectives (Individual work):**
- Apply MCP concepts to your own field or interests
- Design and implement a non-trivial tool
- Write tests for your tool
- Demonstrate a domain-specific application

**Tasks:**

1. **Create your feature branch**

   ```bash
   git checkout -b student/my-custom-tool
   ```

2. **Design your tool**

   Choose a tool relevant to your field or interests. Examples:
   - **Policy analysis**: Extract policy recommendations from documents
   - **Data science**: Statistical analysis or data transformation tool
   - **Research**: Literature review summarization or citation extraction
   - **Education**: Readability adaptation or concept explanation
   - **Healthcare**: Medical terminology extraction or symptom checking
   - **Environmental**: Climate data analysis or carbon footprint calculation

   Your tool should:
   - Be **non-trivial** (more complex than a simple calculation)
   - Have a clear use case in your domain
   - Use Pydantic schemas for inputs/outputs
   - Return structured, useful data

3. **Implement your tool**

   Create `src/mcp_server/tools/my_tool_name.py` (a filled-in hypothetical example appears after this section):

   ```python
   from pydantic import BaseModel, Field

   class MyToolInput(BaseModel):
       """Input schema for my tool."""
       # Define your inputs

   class MyToolOutput(BaseModel):
       """Output schema for my tool."""
       # Define your outputs

   def my_tool(input: MyToolInput) -> MyToolOutput:
       """Your tool implementation."""
       # Your logic here
   ```

4. **Register your tool** in `src/mcp_server/server.py`:

   ```python
   from mcp_server.tools.my_tool_name import my_tool, MyToolInput, MyToolOutput

   @mcp.tool
   def my_tool_tool(input: MyToolInput) -> MyToolOutput:
       """My custom tool description."""
       return my_tool(input)
   ```

5. **Write tests** in `tests/mcp_server/test_my_tool.py`:

   ```python
   def test_my_tool():
       result = my_tool(MyToolInput(...))
       assert result.some_field == expected_value
   ```

6. **Test and document**
   - Run `make test` to verify tests pass
   - Run `uv run python tests/manual_server_test.py` to test end-to-end
   - Document your tool's purpose and usage in comments

**Deliverable:** Working custom tool with tests on your feature branch
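For a sense of what a filled-in tool can look like, here is a hypothetical acronym-extraction tool for policy documents, sketched end to end. Every name in it (`AcronymInput`, `extract_acronyms`, and so on) is invented for illustration, and a real submission should be more substantial than this.

```python
import re
from collections import Counter

from pydantic import BaseModel, Field


class AcronymInput(BaseModel):
    """Input: raw document text to scan."""

    text: str = Field(min_length=1)
    max_results: int = Field(default=10, ge=1, le=50)


class AcronymHit(BaseModel):
    """One extracted acronym with its expansion and frequency."""

    acronym: str
    expansion: str
    count: int


class AcronymOutput(BaseModel):
    """Output: acronyms found, most frequent first."""

    hits: list[AcronymHit]


def extract_acronyms(input: AcronymInput) -> AcronymOutput:
    """Find patterns like 'Model Context Protocol (MCP)' and count later uses."""
    pattern = re.compile(r"((?:[A-Z][a-z]+ ){1,5})\(([A-Z]{2,})\)")
    expansions = {m.group(2): m.group(1).strip() for m in pattern.finditer(input.text)}

    # Count every standalone all-caps token so repeated mentions are tallied.
    counts = Counter(re.findall(r"\b[A-Z]{2,}\b", input.text))

    hits = [
        AcronymHit(acronym=a, expansion=e, count=counts.get(a, 0))
        for a, e in expansions.items()
    ]
    hits.sort(key=lambda h: h.count, reverse=True)
    return AcronymOutput(hits=hits[: input.max_results])
```

The structured output is the point: a client receives typed `AcronymHit` records it can filter or tabulate, rather than a blob of prose.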
---

## Demo & Presentation

**Objectives:**
- Demonstrate your custom tool in action
- Show how it applies MCP concepts to your domain
- Present test results
- Reflect on real-world applications

**Tasks:**

1. **Test your server**

   **Option A: Quick test** (validate tools work)

   ```bash
   make run-interactive
   uv run pytest tests/manual_server_test.py -v
   ```

   **Option B: MCP Inspector** (full protocol test)

   ```bash
   # Terminal 1: Start server
   make run-interactive
   uv run python -m mcp_server.server

   # Terminal 2: Run Inspector on HOST (not in container)
   npx @modelcontextprotocol/inspector
   # Choose: STDIO transport, command: ./run_mcp_server.sh
   ```

   See `notebooks/MCP_Introduction.ipynb` for complete Inspector setup instructions.

2. **Prepare your demo presentation**

   Your demo should show:
   - **All three tools**: Baseline tools (`corpus_answer`, `text_profile`) plus your custom tool
   - **Your custom tool in depth**:
     - What problem it solves in your domain
     - Example inputs and outputs
     - How the Pydantic schemas are designed
     - Test results proving it works
   - **Real-world application**: How someone in your field would actually use this tool

3. **Write documentation** for your custom tool

   In your tool file or a separate doc, explain:
   - What problem your tool solves
   - How to use it (with examples)
   - Design decisions (why this schema? why this approach?)
   - Potential applications in your field
   - Limitations and future improvements

4. **Reflection questions** (for your documentation)
   - How does your tool address a real need in your domain?
   - What challenges did you face in implementing it?
   - How could it be extended or improved?
   - How might it integrate with other tools or systems?

**Final Deliverable:**
- Feature branch with your custom tool
- Passing test suite
- Documentation explaining your tool and its domain application

## Quick Start

**Note**: The corpus files are included in the repository at `data/corpus/`. You can modify or add to them for your project.

### Option A: Using VS Code/Cursor (Recommended)

If you're using VS Code or Cursor, you can use the devcontainer:

```bash
# Prepare the devcontainer
make devcontainer

# Then in VS Code/Cursor:
# - Command Palette (Cmd/Ctrl+Shift+P)
# - Select "Dev Containers: Reopen in Container"
```

### Option B: Using Make Commands

```bash
# Build the Docker image
make build-only

# Test that everything works
make test
```

## Technical Expectations

### Prerequisites

We use Docker, Make, uv, and Node.js as part of our curriculum. If you are unfamiliar with them, it is strongly recommended that you read over the following:

**Required on your HOST machine:**
- **Docker**: [An introduction to Docker](https://docker-curriculum.com/)
- **Make**: Usually pre-installed on macOS/Linux. Windows users: install via [Chocolatey](https://chocolatey.org/) or use WSL
- **Node.js**: Required for the MCP Inspector testing tool
  - Install from [nodejs.org](https://nodejs.org/) (LTS version)
  - Or use a package manager: `brew install node` (macOS), `apt install nodejs npm` (Ubuntu)
  - Verify: `node --version` should show v18.x or higher

**Inside the Docker container:**
- **uv**: [An introduction to uv](https://realpython.com/python-uv/) - for Python package management

### Container-Based Development

**All code must be run inside the Docker container.** This ensures consistent environments across different machines and eliminates "works on my machine" issues.

### Environment Management with uv

We use [uv](https://docs.astral.sh/uv/) for Python environment and package management _inside the container_. uv handles:
- Virtual environment creation and management (replaces venv/pyenv)
- Package installation and dependency resolution (replaces pip)
- Project dependency management via `pyproject.toml`

**Important**: When running Python code, prefix commands with `uv run` to maintain the proper environment (for example, `uv run pytest` or `uv run python -m mcp_server.server`).

## Usage & Testing

### Running Tests

```bash
# Run all pytest tests
make test

# Run specific test file
make run-interactive
uv run pytest tests/mcp_server/test_corpus_answer.py -v
```

### Docker & Make

We use `docker` and `make` to run our code. Common `make` commands:

* `make build-only`: Build the Docker image only
* `make run-interactive`: Start an interactive bash session in the container
* `make test`: Run all tests with pytest
* `make devcontainer`: Build and prepare the devcontainer for VS Code/Cursor
* `make clean`: Clean up Docker images and containers

The file `Makefile` contains details about the specific commands that are run when calling each `make` target.

## Additional Resources

### MCP and FastMCP
- [Model Context Protocol Documentation](https://modelcontextprotocol.io/)
- [FastMCP Getting Started](https://gofastmcp.com/)
- [FastMCP GitHub](https://github.com/jlowin/fastmcp)
- [Anthropic MCP Examples](https://github.com/anthropics/anthropic-quickstarts)

### Text Analysis Libraries
- [Scikit-learn TF-IDF Documentation](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html)
- [VADER Sentiment Analysis](https://github.com/cjhutto/vaderSentiment)
- [Pydantic Documentation](https://docs.pydantic.dev/)

### Reference Implementation
- Review `notebooks/MCP_Introduction.ipynb` for interactive examples

## Style

We use [`ruff`](https://docs.astral.sh/ruff/) to enforce style standards and grade code quality. This is an automated code checker that looks for specific issues in the code that need to be fixed to make it readable and consistent with common standards.

`ruff` is run before each commit via [`pre-commit`](https://pre-commit.com/). If it fails, the commit will be blocked and the user will be shown what needs to be changed.

To run `pre-commit` inside the container:

```bash
pre-commit install
pre-commit run --all-files
```

You can also run `ruff` directly:

```bash
ruff check
ruff format
```
