Supports converting pandas DataFrames to Excel workbooks with formatting and enables advanced data manipulation and analysis capabilities.
KnowledgeBaseMCP
A Model Context Protocol (MCP) server that extracts text from PDF, DOCX, PPTX, and XLSX files. It lets AI assistants such as Claude read and analyze documents in your local knowledge base, and create new Word documents and Excel workbooks.
Features
Document Reading
Multi-format support: Extract text from PDF, DOCX, PPTX, and XLSX files
Directory processing: Process entire directories of documents
Recursive scanning: Optionally scan subdirectories
File metadata: Get detailed information about document files
Error handling: Robust error handling with clear error messages
Async processing: Efficient asynchronous document processing
Excel Spreadsheet Creation
XLSX workbook creation: Create Excel files with multiple sheets
DataFrame support: Convert pandas DataFrames to Excel
Data formatting: Apply professional formatting and styling
Report generation: Create structured reports with summaries
Data appending: Add data to existing Excel files
Template support: Use predefined templates for consistent formatting
Integration
Easy integration: Simple setup with Claude Desktop
MCP protocol: Built on the Model Context Protocol standard
Supported File Types
Reading Support
PDF (.pdf) - Portable Document Format (using pdfplumber)
DOCX (.docx) - Microsoft Word documents
PPTX (.pptx) - Microsoft PowerPoint presentations
XLSX (.xlsx) - Microsoft Excel spreadsheets (using openpyxl and pandas)
Writing Support
DOCX (.docx) - Create Word documents with formatting
XLSX (.xlsx) - Create Excel workbooks with multiple sheets, formatting, and charts
Installation
Prerequisites
Python 3.8 or higher
Claude Desktop application
Setup
Clone the repository
Install dependencies
Test the server
Configuration
Claude Desktop Integration
Add this server to your Claude Desktop configuration file:
Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Replace path/to/KnowledgeBaseMCP with your actual installation path.
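A minimal sketch of such an entry, written as a short Python script that prints the JSON to paste into the config file. It assumes the standard Claude Desktop `mcpServers` layout, a server key named `knowledge-base` (a hypothetical name), and `launch_mcp.py` as the entry point (the script named under Debugging). Adjust all three to match your setup.

```python
import json

# Placeholder from the text above; replace with your actual installation path.
INSTALL_DIR = "path/to/KnowledgeBaseMCP"

# "knowledge-base" is a hypothetical server name chosen for this sketch.
config = {
    "mcpServers": {
        "knowledge-base": {
            "command": "python",
            # Assumes launch_mcp.py sits at the top of the installation directory.
            "args": [f"{INSTALL_DIR}/launch_mcp.py"],
        }
    }
}

# Print the JSON to paste into claude_desktop_config.json.
config_json = json.dumps(config, indent=2)
print(config_json)
```

If the config file already defines other servers, merge the new key into the existing `mcpServers` object rather than overwriting the file.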
Usage
Once configured, you can use these tools in Claude:
Available Tools
Document Reading Tools
extract_text_from_file
Extract text content from a single document file.
Parameters:
file_path (string): Path to the document file
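Claude issues these calls for you, but it can help to see roughly what one looks like. The sketch below builds an MCP `tools/call` request (JSON-RPC 2.0) for this tool. `build_tool_call` is a hypothetical helper, not part of this project, and the file path is a made-up example.

```python
import json

def build_tool_call(name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

request = build_tool_call(
    "extract_text_from_file",
    {"file_path": "docs/report.pdf"},  # example path, not a real file
)
print(json.dumps(request, indent=2))
```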
extract_text_from_directory
Extract text content from all supported documents in a directory.
Parameters:
directory_path (string): Path to the directory containing documents
recursive (boolean, optional): Whether to search subdirectories recursively
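One way the `recursive` flag could behave, sketched with `pathlib`: `glob` looks only at the top level, while `rglob` also descends into subdirectories. `find_documents` is a hypothetical helper and the extension set is taken from the Supported File Types list; the server's actual implementation may differ.

```python
from pathlib import Path

# Extensions from the "Reading Support" list.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".xlsx"}

def find_documents(directory_path, recursive=False):
    """Return supported document paths under directory_path, sorted."""
    root = Path(directory_path)
    if not root.is_dir():
        raise NotADirectoryError(f"Not a directory: {directory_path}")
    # rglob walks subdirectories; glob stays at the top level.
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        p for p in candidates
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```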
list_supported_files
List all supported document files in a directory with metadata.
Parameters:
directory_path (string): Path to the directory to scan
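The exact metadata fields returned are not listed here. As an illustration only, a sketch built on `Path.stat()` might collect the name, extension, size, and modification time of each file. `file_metadata` and its field names are assumptions, not the server's actual output format.

```python
from datetime import datetime, timezone
from pathlib import Path

def file_metadata(path):
    """Collect basic metadata for one file (field names are illustrative)."""
    p = Path(path)
    info = p.stat()
    return {
        "name": p.name,
        "extension": p.suffix.lower(),
        "size_bytes": info.st_size,
        # Modification time as an ISO 8601 string in UTC.
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
    }
```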
DOCX Creation Tools
create_docx_document
Create a new Word document with text content.
Parameters:
content (string): Document content
file_path (string): Output file path (.docx extension)
title (string, optional): Document title
create_structured_report
Create a structured Word report with formatting.
Parameters:
report_data (object): Report data structure
file_path (string): Output file path (.docx extension)
XLSX Creation Tools
create_xlsx_workbook
Create a new Excel workbook with multiple sheets.
Parameters:
data (object): Dictionary with sheet names as keys and data as values
file_path (string): Output file path (.xlsx extension)
apply_formatting (boolean, optional): Apply default formatting
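To make the shape of `data` concrete, here is a sketch of one possible set of arguments. The per-sheet layout (a list of rows with the header row first) is an assumption, since only the sheet-name keys are documented; `validate_workbook_args` is a hypothetical pre-flight check, not part of the server.

```python
import json

# Assumed layout: each sheet is a list of rows, with the header row first.
arguments = {
    "data": {
        "Sales": [["Region", "Revenue"], ["North", 1200], ["South", 950]],
        "Summary": [["Metric", "Value"], ["Total revenue", 2150]],
    },
    "file_path": "reports/sales.xlsx",  # example output path
    "apply_formatting": True,
}

def validate_workbook_args(args):
    """Basic sanity checks before sending the arguments to the tool."""
    if not args["file_path"].lower().endswith(".xlsx"):
        raise ValueError("file_path must end with .xlsx")
    if not isinstance(args["data"], dict) or not args["data"]:
        raise ValueError("data must map at least one sheet name to its rows")
    return True

validate_workbook_args(arguments)
print(json.dumps(arguments, indent=2))
```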
create_xlsx_from_dataframe
Create Excel workbook from pandas DataFrames.
Parameters:
dataframes (object): Dictionary with sheet names and DataFrame data
file_path (string): Output file path (.xlsx extension)
include_index (boolean, optional): Include DataFrame index
append_to_xlsx
Append data to existing Excel workbook.
Parameters:
file_path (string): Path to existing XLSX file
sheet_name (string): Target sheet name
data (any): Data to append (list, dict, or DataFrame)
create_xlsx_report
Create a formatted Excel report with multiple sections.
Parameters:
report_data (object): Report structure with title, description, and data sections
file_path (string): Output file path (.xlsx extension)
Example Usage in Claude
Reading Documents
Creating Excel Reports
Data Analysis and Export
Document Conversion
Claude will then use the MCP server to extract and analyze the content from your documents or create new Excel files as requested.
Project Structure
Development
Running Tests
Adding New File Formats
To add support for additional document formats:
Add the file extension to SUPPORTED_EXTENSIONS in extractors.py
Install the required library
Add the library check to check_dependencies()
Implement the extraction method (e.g., _extract_xlsx())
Add the format handling to extract_from_file()
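The five steps above can be sketched for a plain-text format as follows. The names `SUPPORTED_EXTENSIONS`, `check_dependencies()`, and `extract_from_file()` come from the steps; the function bodies, the `.txt` format, and `_extract_txt()` are assumptions for illustration, and the real code in extractors.py may be organized differently (for example, as methods on a class).

```python
from pathlib import Path

# Step 1: register the new extension (".txt" is only an example format).
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".xlsx", ".txt"}

def check_dependencies():
    """Steps 2-3: report whether each format's library is available.

    Plain text needs only the standard library, so it is always available.
    """
    return {".txt": True}

def _extract_txt(path):
    """Step 4: the extraction method for the new format."""
    return Path(path).read_text(encoding="utf-8")

def extract_from_file(file_path):
    """Step 5: route the file to the matching extraction method."""
    path = Path(file_path)
    ext = path.suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {ext}")
    if ext == ".txt":
        return _extract_txt(path)
    # The existing extractors are left out of this sketch.
    raise NotImplementedError(f"Extractor for {ext} omitted from this sketch")
```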
Debugging
For debugging MCP connection issues:
Check Claude Desktop logs
Ensure the server starts without errors: python launch_mcp.py
Verify the config file path and format
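For the config file check, a small script can confirm that the file exists, parses as JSON, and defines at least one server. This is a sketch assuming the usual `mcpServers` layout with a `command` per server; `check_config` is a hypothetical helper, not part of this project.

```python
import json
from pathlib import Path

def check_config(config_path):
    """Return a list of problems found in a Claude Desktop config file."""
    path = Path(config_path).expanduser()
    if not path.is_file():
        return [f"Config file not found: {path}"]
    try:
        config = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"Invalid JSON: {exc}"]
    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict) or not servers:
        return ["No servers defined under 'mcpServers'"]
    problems = []
    for name, entry in servers.items():
        # Assumes each server is launched through a 'command' entry.
        if not isinstance(entry, dict) or "command" not in entry:
            problems.append(f"Server '{name}' has no 'command'")
    return problems
```

An empty list means none of these checks failed. On macOS, for example, the function could be called with `"~/Library/Application Support/Claude/claude_desktop_config.json"`.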
Dependencies
Core Dependencies
mcp>=0.9.0 - Model Context Protocol framework
Document Reading
python-docx>=1.1.0 - For DOCX file processing
pdfplumber>=0.9.0 - For PDF file processing
python-pptx>=0.6.23 - For PPTX file processing
openpyxl>=3.1.0 - For XLSX file reading/writing
pandas>=2.0.0 - For advanced data manipulation and analysis
Additional
lxml>=4.9.0 - XML processing support
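A quick way to see which of these packages are missing from the current environment is to look up each one's import name. The sketch below uses only the standard library; `missing_packages` is a hypothetical helper (not the project's `check_dependencies()`), and it checks presence only, not the minimum versions listed above.

```python
from importlib.util import find_spec

# pip package name -> importable module name
REQUIRED_MODULES = {
    "mcp": "mcp",
    "python-docx": "docx",
    "pdfplumber": "pdfplumber",
    "python-pptx": "pptx",
    "openpyxl": "openpyxl",
    "pandas": "pandas",
    "lxml": "lxml",
}

def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names of packages whose module cannot be found."""
    return sorted(
        pkg for pkg, module in required.items() if find_spec(module) is None
    )

if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print(f"Missing: {', '.join(missing)}")
    else:
        print("All dependencies found")
```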
Contributing
Contributions are welcome! Please feel free to submit pull requests or open issues.
Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
Built with the Model Context Protocol
Uses pdfplumber for PDF processing
Uses python-docx for Word documents
Uses python-pptx for PowerPoint presentations
Support
If you encounter any issues or have questions, please open an issue on GitHub.
Made with ❤️ for the Claude AI community