
2025-autumn-mcp

Project Background

This project is about learning how to turn basic data science skills into real, usable services. Rather than running code in isolation, you’ll package text analysis tools into a Model Context Protocol (MCP) server that can be consumed by any MCP-aware client, including modern AI assistants. Along the way you’ll learn how to design structured inputs and outputs (schemas), containerize and run services with Docker, and expose your work so that others, whether researchers, policymakers, or fellow students, can integrate it directly into their own workflows. The goal is not to build the most advanced NLP system, but to see how small, well-defined analytics can be made reusable, composable, and shareable across disciplines.

Goals

This sprint focuses on learning the Model Context Protocol by building a text-analysis MCP server.

What You'll Build:

  • An MCP server with baseline text-analysis tools (group work)

  • Your own custom MCP tool relevant to your field (individual work on a feature branch)

Using Python, Pydantic schemas, and FastMCP, you'll gain experience with natural language processing techniques (TF-IDF, sentiment analysis, readability metrics), structured data exchange, and service-oriented design.
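To preview the pattern before you dive in, here is a minimal sketch of a FastMCP tool with Pydantic input/output schemas. It is illustrative only: the server name, field names, and the from fastmcp import FastMCP import are assumptions about the setup, not the project's actual server code (see src/mcp_server/server.py for the real thing).

from fastmcp import FastMCP          # assumed import; check the project's server.py
from pydantic import BaseModel, Field

mcp = FastMCP("demo-text-server")    # hypothetical server name


class WordCountInput(BaseModel):
    """Input schema: the text to analyze."""
    text: str = Field(..., min_length=1, description="Raw text to count words in")


class WordCountOutput(BaseModel):
    """Output schema: structured result a client can rely on."""
    word_count: int


@mcp.tool
def word_count(input: WordCountInput) -> WordCountOutput:
    """Count whitespace-separated words in the input text."""
    return WordCountOutput(word_count=len(input.text.split()))


if __name__ == "__main__":
    mcp.run()   # serve the tool (STDIO is the default transport)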

Deliverables:

  • Working baseline MCP server with corpus_answer and text_profile tools

  • Your custom tool on a feature branch with tests

  • Demo showing your tool in action

  • Documentation explaining your tool's domain application

Project Structure

mcp/
├── data/
│   └── corpus/                      # Your text corpus (.txt files)
│       ├── climate_policy.txt
│       ├── urban_planning.txt
│       ├── ai_ethics.txt
│       └── public_health.txt
├── notebooks/
│   └── MCP_Introduction.ipynb       # Interactive tutorial
├── src/
│   ├── utils/                       # Utility code from week 1
│   └── mcp_server/                  # MCP server implementation
│       ├── __init__.py
│       ├── server.py                # Main FastMCP server
│       ├── schemas.py               # Pydantic data models
│       ├── config/
│       │   ├── __init__.py
│       │   └── settings.py          # Configuration settings
│       └── tools/
│           ├── __init__.py
│           ├── corpus_answer.py     # Document search tool
│           └── text_profile.py      # Text analysis tool
├── tests/
│   └── mcp_server/                  # Tests for MCP tools
│       ├── test_corpus_answer.py
│       └── test_text_profile.py
├── pyproject.toml                   # Python dependencies
└── README.md

Introduction & Setup

Getting Started:

  • Review the demonstration notebook: notebooks/MCP_Introduction.ipynb

  • Read about MCP

  • Skim this page on Pydantic

  • Complete the Quick Start below to set up your environment

Phase 1: Group Work - Baseline MCP Server

Part 1: Schemas & Text Analysis Foundations

Objectives (Complete together as a group):

  • Understand Pydantic schemas and data validation (a short sketch follows this list)

  • Learn TF-IDF basics for document search

  • Set up a shared corpus

  • Understand MCP tool design patterns
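As a quick illustration of what validation buys you, here is a minimal sketch with hypothetical field names (not the project's schemas): Pydantic rejects malformed input before your tool logic ever runs.

from pydantic import BaseModel, Field, ValidationError


class QueryInput(BaseModel):
    """Hypothetical input schema for a search-style tool."""
    query: str = Field(..., min_length=1)
    top_k: int = Field(3, ge=1, le=10)   # bounded integer with a default


QueryInput(query="climate adaptation", top_k=5)   # OK: passes validation

try:
    QueryInput(query="", top_k=50)                # empty query, top_k out of range
except ValidationError as err:
    print(err)   # lists every failed constraint, field by field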

Tasks:

  1. Complete the notebook notebooks/MCP_Introduction.ipynb

    • Build your first MCP tool

    • Work with TF-IDF for document search

    • Define Pydantic schemas

    • Register tools with FastMCP

  2. Create a shared corpus

    • Add 3-5 .txt files to data/corpus/

    • Sample documents provided: climate policy, urban planning, AI ethics, public health

    • Choose documents that demonstrate the tools' capabilities

  3. Review the provided code structure in src/mcp_server/

    • schemas.py - Pydantic models for tool inputs/outputs

    • tools/corpus_answer.py - Document search skeleton

    • tools/text_profile.py - Text analytics skeleton

    • server.py - Main MCP server application

Deliverable: Completed notebook and shared corpus

Part 2: Baseline Tool Implementation

Objectives (Implement together as a group):

  • Implement the corpus_answer tool with TF-IDF search

  • Implement the text_profile tool with text analytics

  • Test the baseline implementation

Tasks:

  1. Implement corpus_answer (src/mcp_server/tools/corpus_answer.py)

    Complete the TODOs in:

    • _load_corpus() - Load .txt files from the corpus directory

    • _ensure_index() - Build TF-IDF index from documents

    • _synthesize_answer() - Create concise answer snippets

    • corpus_answer() - Main search and ranking logic

    Key steps (a rough sketch of this flow appears after this task list):

    • Load all .txt files from data/corpus/

    • Build TF-IDF vectorizer with appropriate parameters

    • Transform query and compute cosine similarity

    • Return top 3-5 results with snippets and scores

  2. Implement text_profile (src/mcp_server/tools/text_profile.py)

    Complete the TODOs in:

    • _read_doc() - Read document by ID from corpus

    • _tokenize() - Extract words from text

    • _flesch_reading_ease() - Calculate readability score

    • _top_terms() - Extract keywords using TF-IDF

    • text_profile() - Compute all text features

    Features to calculate (several are illustrated in the sketch after this task list):

    • Character and word counts

    • Type-token ratio (lexical diversity)

    • Flesch Reading Ease score

    • VADER sentiment analysis

    • Top n-grams and keywords

  3. Test your tools

    # Run tests
    make test

    # Test specific tool
    uv run pytest tests/mcp_server/test_corpus_answer.py -v
  4. Debug and refine

    • Use logging to debug

    • Test with different queries and documents

    • Ensure all tests pass
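The sketch below roughly mirrors the two flows above: TF-IDF ranking for corpus_answer and a few text_profile metrics (type-token ratio, Flesch Reading Ease, VADER sentiment). Function names, parameters, and return shapes are illustrative assumptions rather than the project's skeleton; it also assumes scikit-learn and the vaderSentiment package are available in the container.

from pathlib import Path
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer


def rank_documents(query: str, corpus_dir: str = "data/corpus", top_k: int = 3):
    """Load .txt files, build a TF-IDF index, and rank documents against a query."""
    paths = sorted(Path(corpus_dir).glob("*.txt"))
    docs = [p.read_text(encoding="utf-8") for p in paths]

    vectorizer = TfidfVectorizer(stop_words="english")
    doc_matrix = vectorizer.fit_transform(docs)            # one row per document
    query_vec = vectorizer.transform([query])              # same vocabulary as the index

    scores = cosine_similarity(query_vec, doc_matrix)[0]   # shape (1, n_docs) -> flat array
    ranked = sorted(zip(paths, scores), key=lambda pair: pair[1], reverse=True)
    return [(path.stem, float(score)) for path, score in ranked[:top_k]]


def _count_syllables(word: str) -> int:
    """Crude syllable estimate: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def text_metrics(text: str) -> dict:
    """Type-token ratio, Flesch Reading Ease, and VADER sentiment (assumes non-empty text)."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    syllables = sum(_count_syllables(w) for w in words)

    type_token_ratio = len({w.lower() for w in words}) / len(words)
    # Flesch Reading Ease: 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    flesch = 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))
    sentiment = SentimentIntensityAnalyzer().polarity_scores(text)  # neg / neu / pos / compound

    return {
        "type_token_ratio": type_token_ratio,
        "flesch_reading_ease": flesch,
        "sentiment": sentiment,
    }

In the actual tools, these pieces map onto helpers like _load_corpus, _ensure_index, _flesch_reading_ease, and _top_terms, and results are returned through the Pydantic output schemas defined in schemas.py.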

Deliverable: Working baseline server with corpus_answer and text_profile tools


Phase 2: Individual Work - Custom Tool Development

Creating Your Own MCP Tool

Now that you understand MCP fundamentals, each student will create their own custom tool on a feature branch.

Objectives (Individual work):

  • Apply MCP concepts to your own field or interests

  • Design and implement a non-trivial tool

  • Write tests for your tool

  • Demonstrate domain-specific application

Tasks:

  1. Create your feature branch

    git checkout -b student/my-custom-tool
  2. Design your tool

    Choose a tool relevant to your field or interests. Examples:

    • Policy analysis: Extract policy recommendations from documents

    • Data science: Statistical analysis or data transformation tool

    • Research: Literature review summarization or citation extraction

    • Education: Readability adaptation or concept explanation

    • Healthcare: Medical terminology extraction or symptom checking

    • Environmental: Climate data analysis or carbon footprint calculation

    Your tool should:

    • Be non-trivial (more complex than a simple calculation)

    • Have a clear use case in your domain

    • Use Pydantic schemas for inputs/outputs

    • Return structured, useful data

  3. Implement your tool

    Create src/mcp_server/tools/my_tool_name.py:

    from pydantic import BaseModel, Field


    class MyToolInput(BaseModel):
        """Input schema for my tool."""
        # Define your inputs


    class MyToolOutput(BaseModel):
        """Output schema for my tool."""
        # Define your outputs


    def my_tool(input: MyToolInput) -> MyToolOutput:
        """Your tool implementation."""
        # Your logic here
  4. Register your tool in src/mcp_server/server.py:

    from mcp_server.tools.my_tool_name import my_tool, MyToolInput, MyToolOutput


    @mcp.tool
    def my_tool_tool(input: MyToolInput) -> MyToolOutput:
        """My custom tool description."""
        return my_tool(input)
  5. Write tests in tests/mcp_server/test_my_tool.py:

    def test_my_tool():
        result = my_tool(MyToolInput(...))
        assert result.some_field == expected_value
  6. Test and document

    • Run make test to verify tests pass

    • Run uv run python tests/manual_server_test.py to test end-to-end

    • Document your tool's purpose and usage in comments

Deliverable: Working custom tool with tests on your feature branch


Demo & Presentation

Objectives:

  • Demonstrate your custom tool in action

  • Show how it applies MCP concepts to your domain

  • Present test results

  • Reflect on real-world applications

Tasks:

  1. Test your server

    Option A: Quick test (validate tools work)

    make run-interactive
    uv run pytest tests/manual_server_test.py -v

    Option B: MCP Inspector (full protocol test)

    # Terminal 1: Start server
    make run-interactive
    uv run python -m mcp_server.server

    # Terminal 2: Run Inspector on HOST (not in container)
    npx @modelcontextprotocol/inspector
    # Choose: STDIO transport, command: ./run_mcp_server.sh

    See notebooks/MCP_Introduction.ipynb for complete Inspector setup instructions.

  2. Prepare your demo presentation

    Your demo should show:

    • All three tools: Baseline tools (corpus_answer, text_profile) + your custom tool

    • Your custom tool in depth:

      • What problem it solves in your domain

      • Example inputs and outputs

      • How the Pydantic schemas are designed

      • Test results proving it works

    • Real-world application: How someone in your field would actually use this tool

  3. Write documentation for your custom tool. In your tool file or a separate doc, explain:

    • What problem your tool solves

    • How to use it (with examples)

    • Design decisions (why this schema? why this approach?)

    • Potential applications in your field

    • Limitations and future improvements

  4. Reflection questions (for your documentation)

    • How does your tool address a real need in your domain?

    • What challenges did you face in implementing it?

    • How could it be extended or improved?

    • How might it integrate with other tools or systems?

Final Deliverable:

  • Feature branch with your custom tool

  • Passing test suite

  • Documentation explaining your tool and its domain application

Quick Start

Note: The corpus files are included in the repository at data/corpus/. You can modify or add to them for your project.

Option A: Using VS Code/Cursor (Recommended)

If you're using VS Code or Cursor, you can use the devcontainer:

# Prepare the devcontainer
make devcontainer

# Then in VS Code/Cursor:
# - Command Palette (Cmd/Ctrl+Shift+P)
# - Select "Dev Containers: Reopen in Container"

Option B: Using Make Commands

# Build the Docker image
make build-only

# Test that everything works
make test

Technical Expectations

Prerequisites

We use Docker, Make, uv, and Node.js as part of our curriculum. If you are unfamiliar with them, it is strongly recommended you read over the following:

Required on your HOST machine:

  • Docker: An introduction to Docker

  • Make: Usually pre-installed on macOS/Linux. Windows users: install via Chocolatey or use WSL

  • Node.js: Required for MCP Inspector testing tool

    • Install from nodejs.org (LTS version)

    • Or use package manager: brew install node (macOS), apt install nodejs npm (Ubuntu)

    • Verify: node --version should show v18.x or higher

Inside the Docker container:

Container-Based Development

All code must be run inside the Docker container. This ensures consistent environments across different machines and eliminates "works on my machine" issues.

Environment Management with uv

We use uv for Python environment and package management inside the container. uv handles:

  • Virtual environment creation and management (replaces venv/pyenv)

  • Package installation and dependency resolution (replaces pip)

  • Project dependency management via pyproject.toml

Important: When running Python code, prefix commands with uv run to maintain the proper environment:
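For example (these mirror commands used elsewhere in this document):

# Inside the container (after `make run-interactive`):
uv run pytest tests/mcp_server/ -v        # run the MCP tool tests
uv run python -m mcp_server.server        # start the MCP server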

Usage & Testing

Running Tests

# Run all pytest tests
make test

# Run specific test file
make run-interactive
uv run pytest tests/mcp_server/test_corpus_answer.py -v

Docker & Make

We use docker and make to run our code. Common make commands:

  • make build-only: Build the Docker image only

  • make run-interactive: Start an interactive bash session in the container

  • make test: Run all tests with pytest

  • make devcontainer: Build and prepare devcontainer for VS Code/Cursor

  • make clean: Clean up Docker images and containers

The file Makefile contains details about the specific commands that are run when calling each make target.

Additional Resources

MCP and FastMCP

Text Analysis Libraries

Reference Implementation

  • Review notebooks/MCP_Introduction.ipynb for interactive examples

Style

We use ruff to enforce style standards and grade code quality. ruff is an automated checker (linter and formatter) that flags issues which must be fixed to keep the code readable and consistent with common standards. It runs before each commit via pre-commit; if a check fails, the commit is blocked and you will be shown what needs to change.

To run pre-commit inside the container:

pre-commit install
pre-commit run --all-files

You can also run ruff directly:

ruff check
ruff format