Skip to main content
Glama

DocNav MCP Server

License Python 3.10+

DocNav is a Model Context Protocol (MCP) server which empowers LLM Agents to read, analyze, and manage lengthy documents intelligently, mimicking human-like comprehension and navigation capabilities.

Features

  • Document Navigation: Navigate through document sections, headings, and content structure

  • Content Extraction: Extract and summarize specific document sections

  • Search & Query: Find specific content within documents using intelligent search

  • Multi-format Support: Currently supports Markdown (.md) files, with planned support for PDF and other formats

  • MCP Integration: Seamless integration with MCP-compatible LLMs and applications

Related MCP server: DevDocs MCP

Architecture

DocNav follows a modular, extensible architecture:

  • Core MCP Server: Main server implementation using the MCP protocol

  • Document Processors: Pluggable processors for different file types

  • Navigation Engine: Handles document structure analysis and navigation

  • Content Extractors: Extract and format content from documents

  • Search Engine: Provides search and query capabilities across documents

Installation

Prerequisites

  • Python 3.10+

  • uv package manager

Setup

  1. Clone the repository:

git clone https://github.com/shenyimings/DocNav-MCP.git cd DocNav-MCP
  1. Install dependencies:

uv sync

Usage

Starting the MCP Server

uv run server.py

Connect to the MCP server

{ "mcpServers": { "docnav": { "command": "{{PATH_TO_UV}}", // Run `which uv` and place the output here "args": [ "--directory", "{{PATH_TO_SRC}}", "run", "server.py" ] } } }

Available Tools

  • load_document: Load a document for navigation and analysis

    • Args: file_path (path to document file)

    • Returns: Success message with auto-generated document ID

  • get_outline: Get document outline/table of contents

    • Args: doc_id (document identifier), max_depth (max heading depth, default 3)

    • Returns: Formatted document outline

    • Tip: Use first after loading a document to understand structure

  • read_section: Read content of a specific document section

    • Args: doc_id (document identifier), section_id (e.g., 'h1_0', 'h2_1')

    • Returns: Section content with subsections

  • search_document: Search for specific content within a document

    • Args: doc_id (document identifier), query (search term or phrase)

    • Returns: Formatted search results with context

  • navigate_section: Get navigation context for a section

    • Args: doc_id (document identifier), section_id (section to navigate to)

    • Returns: Navigation context with parent, siblings, children

  • list_documents: List all currently loaded documents

    • Returns: List of loaded documents with metadata

  • get_document_stats: Get statistics about a loaded document

    • Args: doc_id (document identifier)

    • Returns: Document statistics and structure info

  • remove_document: Remove a document from the navigator

    • Args: doc_id (document identifier)

    • Returns: Success or error message

Example Usage

# Load a document result = await tools.load_document("path/to/document.md") # Get document outline outline = await tools.get_outline(doc_id) # Get specific section content section = await tools.read_section(doc_id, section_id) # Search within document results = await tools.search_document(doc_id, "search query")

Development

Project Structure

docnav-mcp/ --- server.py # Main MCP server --- docnav/ ------- __init__.py # Package initialization ------- models.py # Data models ------- navigator.py # Document navigation engine ------- processors/ ------- __init__.py # Processor package ------- base.py # Base processor interface ------- markdown.py # Markdown processor --- tests/ ------- ... # Test files

Development Guidelines

See CLAUDE.md for detailed development guidelines including:

  • Code quality standards

  • Testing requirements

  • Package management with uv

  • Formatting and linting rules

Adding New Document Processors

  1. Create a new processor class inheriting from BaseProcessor

  2. Implement the required methods: can_process, process, extract_section, search

  3. Register the processor in the DocumentNavigator

  4. Add comprehensive tests

Running Tests

# Run all tests uv run tests/run_tests.py

Code Quality

# Format code uv run --frozen ruff format . # Check linting uv run --frozen ruff check . # Type checking uv run --frozen pyright

Roadmap

  • Complete Markdown processor implementation

  • Add PDF document support (PyMuPDF)

  • Improve test coverage and quality

  • Implement advanced search capabilities

  • Add document summarization features

  • Support for additional document formats (DOCX, TXT, etc.)

  • Performance optimizations for large documents

  • Caching mechanisms for frequently accessed documents

  • Add persistent storage for loaded documents

Contributing

  1. Fork the repository

  2. Create a feature branch

  3. Follow the development guidelines in CLAUDE.md

  4. Add tests for new functionality

  5. Submit a pull request

License

This project is licensed under the Apache-2.0 License - see the LICENSE file for details.

Support

For issues and questions:

  • Open an issue on GitHub

  • Check the documentation in CLAUDE.md

  • Review existing issues and discussions

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/shenyimings/DocNav-MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server