# Directmedia MCP
[sandraschi/directmedia-mcp on GitHub](https://github.com/sandraschi/directmedia-mcp)
**FastMCP 2.13+ server for accessing Directmedia Publishing "Digitale Bibliothek" - TEXT EXTRACTION WORKING!**
## Overview
The Directmedia Publishing "Digitale Bibliothek" was a pioneering German electronic book collection from the 1990s, covering a broad range of German and world literature. This MCP server provides programmatic access to these classic digital books.
### **BREAKTHROUGH: Text Extraction Working!**
**MISSION ACCOMPLISHED**: We successfully reverse-engineered the Directmedia TEXT.DKI format!
- **Discovery**: TEXT.DKI files contain **structured binary records**, not compressed data
- **Extractor**: A working Python implementation extracts readable German text
- **Access**: 101 volumes of the 1990s collection are now programmatically accessible
- **Preservation**: Digital cultural heritage unlocked for modern use
**What was thought to be "compression" was actually a structured record format with 2-byte length headers!**
### Collection Status
- **101 volumes** discovered (DB002-DB161, DBSK01-DBSK05, DBSO01-DBSO28)
- **~14GB** total content across all volumes
- **Proprietary binary format** from 1990s German publishing
- **Latin-1 encoding** with special characters for German texts
### **Legal Requirement**
**You must legally purchase the Directmedia CD-ROMs to use this tool. See Legal Notice section below.**
### Sample Volumes
| Volume ID | Title | Size | Content Type |
|-----------|-------|------|--------------|
| DB002 | Philosophie von Platon bis Nietzsche | 389MB | Philosophy |
| DB003 | Geschichte der Philosophie | 113MB | Philosophy History |
| DB004 | Goethe | 360MB | Literature + Audio |
| DB005 | Lessing | 149MB | Literature |
| DB007 | Heine | 226MB | Literature |
| DB009 | Killy Literaturlexikon | 137MB | Reference |
| DB011 | Marx/Engels | 117MB | Political Philosophy |
### Collection Analysis
The **101 volumes** fall into three series:
- **DB002-DB161**: Main literature collection (philosophy, literature, history)
- **DBSK01-DBSK05**: Schnellkurs (crash courses)
- **DBSO01-DBSO28**: Sonderausgaben (special editions)
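
The series prefixes above are enough to group volume IDs programmatically. The following is a minimal sketch, not part of the package: `classify_volume` and `count_by_series` are hypothetical helper names, and the assumption that every ID is a prefix followed only by digits comes from the ID ranges listed above.

```python
import re
from collections import Counter

# Series prefixes as listed above. That an ID is "prefix + digits" is an assumption.
SERIES = {
    "DBSK": "Schnellkurs",
    "DBSO": "Sonderausgaben",
    "DB": "Main collection",
}
# Longer prefixes come first so "DBSK01" is not read as "DB" plus garbage.
_VOLUME_ID = re.compile(r"^(DBSK|DBSO|DB)(\d+)$")


def classify_volume(volume_id: str) -> str:
    """Return the series name for an ID such as 'DB002' or 'DBSK01'."""
    match = _VOLUME_ID.match(volume_id.upper())
    if match is None:
        raise ValueError(f"Unrecognized volume ID: {volume_id!r}")
    return SERIES[match.group(1)]


def count_by_series(volume_ids) -> Counter:
    """Count how many of the given volume IDs belong to each series."""
    return Counter(classify_volume(v) for v in volume_ids)
```

Calling `count_by_series` on the IDs returned by `list_volumes()` would give a quick check that the discovered volumes match the three series described here.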
### File Format Structure
Each volume uses a proprietary binary format:
#### Core Files (Data/):
- **TEXT.DKI**: Main text database (structured binary records)
- **TREE.DK***: Navigation tree (table of contents)
- **INDEX.***: Multiple search indices (HTX, PLX, SHX, SWX, TTX, WLX)
- **LINKS.***: Hyperlinks and cross-references
- **SIGEL.DAT**: Abbreviations/signatures registry
#### Media Files:
- **IMAGES/**: BMP illustrations and diagrams
- **WAVS/**: Audio files (readings, lectures)
- **TABLES/**: Specialized content tables
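
As a rough illustration of the layout above, the sketch below reports which of the listed core files are missing from a volume directory. `missing_core_files` is a hypothetical helper, not part of the package API. It assumes the core files sit in a `Data/` subdirectory as listed, and it compares names case-insensitively because file-name case on copied CD-ROM contents may vary.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# Core file names from the list above; wildcards cover the multi-extension families.
CORE_PATTERNS = ["TEXT.DKI", "TREE.DK*", "INDEX.*", "LINKS.*", "SIGEL.DAT"]


def missing_core_files(volume_dir, data_subdir="Data"):
    """Return the core-file patterns that have no match in <volume_dir>/<data_subdir>."""
    data_dir = Path(volume_dir) / data_subdir
    if not data_dir.is_dir():
        return list(CORE_PATTERNS)
    # Upper-case the names so the comparison ignores file-name case.
    names = [entry.name.upper() for entry in data_dir.iterdir()]
    return [
        pattern
        for pattern in CORE_PATTERNS
        if not any(fnmatchcase(name, pattern) for name in names)
    ]
```

An empty result means every core file family is present; anything else names the patterns that found no match.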
## Quick Start
### Prerequisites
- Python 3.11+
- Access to Directmedia "Digitale Bibliothek" collection
- FastMCP 2.13+
### Installation
```bash
pip install -e .
```
### Basic Usage
```python
from directmedia_mcp import DirectmediaLibrary
# Initialize library
lib = DirectmediaLibrary(r"L:\Multimedia Files\Written Word\Digitale Bibliothek")
# List all volumes
volumes = lib.list_volumes()
print(f"Found {len(volumes)} volumes")
# Search for content
results = lib.search_text("Nietzsche", "DB002") # Philosophy volume
# Extract text
content = lib.get_text_content("DB002", 0, 1000)
```
### MCP Server Usage
```bash
# Start MCP server
python -m directmedia_mcp.server --library-path "L:\Multimedia Files\Written Word\Digitale Bibliothek"
# Or run directly
directmedia-mcp --library-path "L:\Multimedia Files\Written Word\Digitale Bibliothek"
```
## MCP Tools
### Library Management
- `set_library_path(path)` - Configure library location
- `list_volumes()` - List all available volumes
- `get_volume_info(volume_id)` - Get volume metadata
### Content Access
- `search_text(query, volume_id, limit)` - Search across volumes
- `get_text_content(volume_id, start_pos, length)` - Extract text
- `get_navigation_tree(volume_id)` - Get table of contents
### EPUB Conversion **NEW**
- `convert_volume_to_epub_file(volume_id, output_dir)` - Convert single volume to EPUB
- `batch_convert_to_epub(output_dir, volume_ids)` - Convert multiple volumes to EPUB
### Analysis
- `analyze_volume_structure(volume_id)` - File format analysis
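
On the wire, an MCP client invokes any of the tools above with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such a message. `tool_call_request` is a hypothetical helper, and the argument names are taken from the tool signatures listed above, so they are assumptions about the server's actual schema.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per connection


def tool_call_request(name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
    )


# Argument names follow the signatures above and are assumed, not verified.
message = tool_call_request(
    "get_text_content",
    {"volume_id": "DB002", "start_pos": 0, "length": 1000},
)
```

In practice an MCP client library handles this framing, so the sketch is mainly useful for debugging or for reading server logs.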
## Volume Overview
| Volume ID | Title | Size | Content Type |
|-----------|-------|------|--------------|
| DB002 | Philosophie von Platon bis Nietzsche | 267MB | Philosophy |
| DB003 | Geschichte der Philosophie | 180MB | Philosophy |
| DB004 | Goethe | 150MB | Literature + Audio |
| DB005 | Lessing | 75MB | Literature |
| ... | ... | ... | ... |
## Technical Details
### Binary Format Analysis
**TEXT.DKI Structure:**
- Header: 256 bytes with section offset table
- Content: Structured binary records (not compressed!)
- Each record: 2-byte length + 1-byte type + text content
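
The record layout above can be walked with a few lines of Python. This is a minimal sketch under stated assumptions, not the project's extractor: it assumes the length field is little-endian, that it counts only the text bytes (not the type byte), and that records begin directly after the 256-byte header. `iter_records` and `Record` are hypothetical names.

```python
import struct
from typing import Iterator, NamedTuple

HEADER_SIZE = 256  # header size from the description above


class Record(NamedTuple):
    offset: int  # byte offset of the record's length field
    kind: int    # the 1-byte record type
    text: str    # text content decoded as Latin-1


def iter_records(data: bytes, start: int = HEADER_SIZE) -> Iterator[Record]:
    """Yield records laid out as 2-byte length + 1-byte type + text."""
    pos = start
    while pos + 3 <= len(data):
        # "<HB": little-endian unsigned 16-bit length, then one type byte.
        # The byte order is an assumption.
        length, kind = struct.unpack_from("<HB", data, pos)
        body_start = pos + 3
        body_end = body_start + length
        if body_end > len(data):
            break  # truncated trailing record; stop rather than guess
        yield Record(pos, kind, data[body_start:body_end].decode("latin-1"))
        pos = body_end
```

If real files turn out to use a big-endian length, or a length that includes the type byte, only the format string and the `body_end` arithmetic need to change.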
**TREE.DK* Structure:**
- DKA: Navigation tree with entry counts and offsets
- DKI: Tree structure data
**INDEX Files:**
- HTX: Hypertext index for navigation
- PLX: Plaintext index for full-text search
- SHX/SWX: Specialized search indices
- TTX: Title index
- WLX: Word list index
### Known Limitations
1. **Proprietary Format**: No official documentation available
2. **Advanced Features**: Some INDEX and TREE.DK* structures still being analyzed
3. **Encoding**: Primarily Latin-1 with some UTF-8 elements
4. **Media Content**: Images and audio files not yet processed
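
For the encoding limitation above, one common way to handle mixed input is to try strict UTF-8 first and fall back to Latin-1. This is a sketch of that generic technique, not necessarily what the extractor does, and `decode_text` is a hypothetical name.

```python
def decode_text(raw: bytes) -> str:
    """Decode bytes as UTF-8 when valid, otherwise as Latin-1."""
    try:
        # Latin-1 text with umlauts is almost never valid UTF-8,
        # so a strict decode is a reasonable first test.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this cannot fail.
        return raw.decode("latin-1")
```

The fallback never raises, which also means it cannot detect a third encoding. Unexpected characters in the output are the only hint that something else is in play.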
### Recent Achievements
- [x] **TEXT.DKI Decoding**: Reverse-engineered the structured binary record format
- [x] **Text Extraction**: Working extractor produces readable German text
- [x] **EPUB Conversion**: Convert volumes to modern e-book format
- [x] **MCP Integration**: Full programmatic access via FastMCP server
- [x] **Volume Management**: Complete 101-volume library access
- [x] **TREE.DKI Navigation**: Table of contents successfully parsed
## **EPUB Conversion Feature**
Convert extracted Directmedia text content into modern EPUB format for e-book readers!
### **What It Does**
- **Extracts** readable text from Directmedia `.DKI` files
- **Formats** content with proper HTML structure and CSS styling
- **Creates** EPUB 3.0 files intended to work with most e-book readers
- **Preserves** German text encoding and special characters
- **Adds** metadata including title, author, and volume information
### **EPUB Features**
- **Proper Structure**: Mimetype, container.xml, OPF package, navigation
- **German Typography**: Optimized for German text with proper quotes and spacing
- **Responsive Design**: CSS styling that works on all devices
- **Table of Contents**: Navigation structure for easy browsing
- **Metadata**: Complete Dublin Core metadata for library management
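
The structure named above (mimetype, `container.xml`, OPF package, navigation) can be assembled with the standard library alone. The sketch below writes a single-chapter EPUB 3 file. It is illustrative, not the project's converter: `build_epub`, the `OEBPS/` directory, and the file names are assumptions, and the output has not been checked with an EPUB validator.

```python
import zipfile
from datetime import datetime, timezone
from html import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub(path, title, identifier, paragraphs, language="de"):
    """Write a single-chapter EPUB 3 file (layout and names are illustrative)."""
    modified = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    safe_title = escape(title)
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    chapter = f"""<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="{language}" lang="{language}">
<head><title>{safe_title}</title></head>
<body><h1>{safe_title}</h1>
{body}
</body></html>
"""
    nav = f"""<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head><title>{safe_title}</title></head>
<body><nav epub:type="toc"><ol>
<li><a href="chapter1.xhtml">{safe_title}</a></li>
</ol></nav></body></html>
"""
    opf = f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">{escape(identifier)}</dc:identifier>
    <dc:title>{safe_title}</dc:title>
    <dc:language>{language}</dc:language>
    <meta property="dcterms:modified">{modified}</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine><itemref idref="ch1"/></spine>
</package>
"""
    with zipfile.ZipFile(path, "w") as zf:
        # The mimetype entry must be the first entry and stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
        for name, content in [
            ("META-INF/container.xml", CONTAINER_XML),
            ("OEBPS/content.opf", opf),
            ("OEBPS/nav.xhtml", nav),
            ("OEBPS/chapter1.xhtml", chapter),
        ]:
            zf.writestr(name, content, compress_type=zipfile.ZIP_DEFLATED)
```

Text is escaped with `html.escape` and written as UTF-8, so umlauts and `ß` survive the round trip once the source bytes have been decoded correctly.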
### **Usage Examples**
**Convert single volume:**
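```python
# Via MCP tool (a tool call issued by an MCP client, not a shell command)
convert_volume_to_epub_file("DB002", "./epub_output")
```
**Batch convert multiple volumes:**
```python
# Via MCP tool (a tool call issued by an MCP client, not a shell command)
batch_convert_to_epub("./epub_library", ["DB002", "DB003", "DB004"])
```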
### **Output Example**
```
epub_output/
├── Goethe - Faust.epub              # Volume DB004
├── Heine - Buch der Lieder.epub     # Volume DB007
└── ... (more volumes)
```
### **EPUB Reader Compatibility**
- ✅ **Calibre** (recommended for library management)
- ✅ **Apple Books** (iOS/macOS)
- ✅ **Google Play Books**
- ✅ **Kindle** (via conversion)
- ✅ **Adobe Digital Editions**
- ✅ **All major e-book readers**
### Future Enhancements
- [ ] Complete INDEX file parsing for full-text search
- [ ] TREE.DK* advanced structure decoding
- [ ] Cross-volume search optimization
- [ ] Image extraction and processing
- [ ] Audio file handling
## Contributing
This is a research project to preserve and provide access to classic digital literature. Contributions welcome for:
- Binary format analysis
- Decompression algorithms
- Search optimization
- Documentation improvements
## **Legal Notice & Copyright**
### **Important: Legal Use Required**
This software tool is designed to work with **legally purchased** copies of Directmedia Publishing's "Digitale Bibliothek" CD-ROM collection. **You must own legitimate copies of the CD-ROMs to use this tool legally.**
#### **Where to Purchase**
Directmedia Publishing still operates and offers their complete collection:
- **Official Website**: [https://www.directmedia-publishing.de/](https://www.directmedia-publishing.de/)
- **Product**: "Digitale Bibliothek" (Complete 101-volume collection)
- **Format**: Available as digital downloads and physical media
- **Languages**: German literature and philosophy collections
#### **Copyright Notice**
- **Copyright**: © Directmedia Publishing GmbH
- **Content**: All text, images, and multimedia content remain copyrighted
- **Usage**: Personal, educational, and research use permitted with legal copies
- **Redistribution**: Not permitted without explicit permission
#### **Disclaimer**
This tool is provided for **educational and research purposes** to access legally obtained digital content. The authors are not responsible for misuse of this software. Ensure you comply with all applicable copyright laws in your jurisdiction.
**Pirated or illegally obtained content is not supported and may violate copyright law.**
## License
MIT License - see LICENSE file for details.
## Acknowledgments
- Directmedia Publishing for pioneering electronic literature in the 1990s
- The German digital humanities community
- FastMCP framework for MCP implementation