Enables loading and working with documents from GitHub repositories, supporting document analysis and navigation through GitHub-hosted content.
Provides document navigation, content extraction, and search capabilities for Markdown files, allowing analysis and intelligent interaction with Markdown document structure and content.
DocNav MCP Server
DocNav is a Model Context Protocol (MCP) server that lets LLM agents read, analyze, and manage long documents by navigating their structure section by section, much as a human reader would.
Features
- Document Navigation: Navigate through document sections, headings, and content structure
- Content Extraction: Extract and summarize specific document sections
- Search & Query: Find specific content within documents using intelligent search
- Multi-format Support: Currently supports Markdown (.md) files, with planned support for PDF and other formats
- MCP Integration: Seamless integration with MCP-compatible LLMs and applications
Architecture
DocNav follows a modular, extensible architecture:
- Core MCP Server: Main server implementation using the MCP protocol
- Document Processors: Pluggable processors for different file types
- Navigation Engine: Handles document structure analysis and navigation
- Content Extractors: Extract and format content from documents
- Search Engine: Provides search and query capabilities across documents
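To make the "pluggable processors" idea concrete, here is a minimal sketch of how a navigator could pick a processor by file type. It is an illustration under stated assumptions, not DocNav's actual code: `Processor`, `MarkdownProcessor`, `ProcessorRegistry`, and their method signatures are hypothetical names chosen for this example.

```python
from pathlib import Path
from typing import Protocol


class Processor(Protocol):
    """Hypothetical interface; DocNav's real processor API may differ."""

    def can_process(self, path: Path) -> bool: ...

    def process(self, text: str) -> list[str]: ...


class MarkdownProcessor:
    """Illustrative processor that only recognizes Markdown files."""

    def can_process(self, path: Path) -> bool:
        return path.suffix.lower() in {".md", ".markdown"}

    def process(self, text: str) -> list[str]:
        # Stand-in for real structure analysis: keep lines that look
        # like headings (this naive check ignores code fences).
        return [line for line in text.splitlines() if line.startswith("#")]


class ProcessorRegistry:
    """Holds registered processors and selects one per document."""

    def __init__(self) -> None:
        self._processors: list[Processor] = []

    def register(self, processor: Processor) -> None:
        self._processors.append(processor)

    def pick(self, path: Path) -> Processor:
        # First registered processor that accepts the file wins.
        for processor in self._processors:
            if processor.can_process(path):
                return processor
        raise ValueError(f"No processor registered for {path.suffix!r}")


registry = ProcessorRegistry()
registry.register(MarkdownProcessor())
headings = registry.pick(Path("README.md")).process("# Title\ntext\n## Setup")
print(headings)
```

With this shape, supporting a new format means registering one more processor; the rest of the server does not need to know which formats exist.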
Installation
Prerequisites
- Python 3.10+
- uv package manager
Setup
- Clone the repository
- Install dependencies with uv
Usage
Starting the MCP Server
Connect to the MCP server
Available Tools
- load_document: Load a document for navigation and analysis
  - Args: file_path (path to document file)
  - Returns: Success message with auto-generated document ID
- get_outline: Get document outline/table of contents
  - Args: doc_id (document identifier), max_depth (max heading depth, default 3)
  - Returns: Formatted document outline
  - Tip: Use first after loading a document to understand structure
- read_section: Read content of a specific document section
  - Args: doc_id (document identifier), section_id (e.g., 'h1_0', 'h2_1')
  - Returns: Section content with subsections
- search_document: Search for specific content within a document
  - Args: doc_id (document identifier), query (search term or phrase)
  - Returns: Formatted search results with context
- navigate_section: Get navigation context for a section
  - Args: doc_id (document identifier), section_id (section to navigate to)
  - Returns: Navigation context with parent, siblings, children
- list_documents: List all currently loaded documents
  - Returns: List of loaded documents with metadata
- get_document_stats: Get statistics about a loaded document
  - Args: doc_id (document identifier)
  - Returns: Document statistics and structure info
- remove_document: Remove a document from the navigator
  - Args: doc_id (document identifier)
  - Returns: Success or error message
Example Usage
Development
Project Structure
Development Guidelines
See CLAUDE.md for detailed development guidelines including:
- Code quality standards
- Testing requirements
- Package management with uv
- Formatting and linting rules
Adding New Document Processors
- Create a new processor class inheriting from BaseProcessor
- Implement the required methods: can_process, process, extract_section, search
- Register the processor in the DocumentNavigator
- Add comprehensive tests
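The steps above can be sketched as follows. Because the real BaseProcessor is not shown here, the example defines its own stand-in base class; the four method names come from the list above, but every signature, the `PlainTextProcessor` class, and the `p_0`, `p_1` section IDs are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class BaseProcessor(ABC):
    """Stand-in for DocNav's BaseProcessor; real signatures may differ."""

    @abstractmethod
    def can_process(self, file_path: Path) -> bool: ...

    @abstractmethod
    def process(self, text: str) -> dict[str, str]: ...

    @abstractmethod
    def extract_section(self, sections: dict[str, str], section_id: str) -> str: ...

    @abstractmethod
    def search(self, sections: dict[str, str], query: str) -> list[tuple[str, str]]: ...


class PlainTextProcessor(BaseProcessor):
    """Treats blank-line-separated paragraphs as sections p_0, p_1, ..."""

    def can_process(self, file_path: Path) -> bool:
        return file_path.suffix.lower() == ".txt"

    def process(self, text: str) -> dict[str, str]:
        paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
        return {f"p_{i}": p for i, p in enumerate(paragraphs)}

    def extract_section(self, sections: dict[str, str], section_id: str) -> str:
        if section_id not in sections:
            raise KeyError(f"Unknown section: {section_id}")
        return sections[section_id]

    def search(self, sections: dict[str, str], query: str) -> list[tuple[str, str]]:
        # Case-insensitive substring match; returns (section_id, text) pairs.
        needle = query.lower()
        return [(sid, body) for sid, body in sections.items() if needle in body.lower()]


processor = PlainTextProcessor()
sections = processor.process("Install with uv.\n\nRun the server.\n")
print(processor.search(sections, "server"))
```

Tests for a processor like this would typically cover each of the four methods, including the failure path for an unknown section ID.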
Running Tests
Code Quality
Roadmap
- Complete Markdown processor implementation
- Add PDF document support (PyMuPDF)
- Improve test coverage and quality
- Implement advanced search capabilities
- Add document summarization features
- Support for additional document formats (DOCX, TXT, etc.)
- Performance optimizations for large documents
- Caching mechanisms for frequently accessed documents
- Add persistent storage for loaded documents
Contributing
- Fork the repository
- Create a feature branch
- Follow the development guidelines in CLAUDE.md
- Add tests for new functionality
- Submit a pull request
License
This project is licensed under the Apache-2.0 License - see the LICENSE file for details.
Support
For issues and questions:
- Open an issue on GitHub
- Check the documentation in CLAUDE.md
- Review existing issues and discussions