# π Unified Docs Hub - Comprehensive Contributor Guide
Welcome to the ultimate guide for contributing to Unified Docs Hub! This document consolidates all essential information from our documentation to help you become an effective contributor.
## π― Project Overview
### What is Unified Docs Hub?
Unified Docs Hub is an MCP (Model Context Protocol) server that creates a massive, searchable knowledge base from:
- **170+ hand-curated repositories** across 23 categories
- **1,000+ auto-discovered popular GitHub projects**
- **11,000+ searchable documentation files**
This enables AI assistants like Claude and ChatGPT to have instant access to comprehensive technical documentation.
### Key Features
- π **Lightning-fast full-text search** using SQLite FTS5
- π **Self-updating system** with automated daily updates
- π€ **AI-optimized** for MCP-compatible assistants
- π― **Specialized coverage** in Trading/Finance, AI/ML, DevOps, and 20+ categories
## π Current Repository Distribution
### Categories and Coverage (170 Curated Repositories)
| Category | Count | Examples |
|----------|-------|----------|
| **Trading & Finance** | 64 | Freqtrade, Zipline, QuantLib, CCXT, PyPortfolioOpt |
| **Web Development** | 13 | React, Vue.js, Next.js, Astro |
| **AI/ML** | 13 | DSPy, MCP Agent, Unsloth, RAGbits |
| **Cloud/DevOps** | 9 | Kubernetes, Docker, Terraform, NetBird |
| **MLOps** | 6 | MLflow, Kubeflow, DVC, Weights & Biases |
| **Mobile Development** | 5 | React Native, Flutter, Ionic |
| **Observability** | 5 | Prometheus, Grafana, Jaeger, OpenTelemetry |
| **Desktop Frameworks** | 5 | Electron, Tauri, Wails, Flutter Desktop |
| **Analytics & BI** | 5 | Apache Superset, Metabase, ClickHouse |
| **Programming Languages** | 6 | Python, Go, TypeScript guides |
| **Databases** | 5 | PostgreSQL, MongoDB, Redis |
| **DevTools** | 4 | VS Code, tmux, fzf |
| **Tools** | 4 | RustDesk, Note Gen |
| **Testing** | 3 | Cypress, Playwright, Vitest |
| **Data Engineering** | 3 | Apache Airflow, dbt, Great Expectations |
| **Security** | 3 | OWASP guides, security tools |
| **Game Development** | 2 | Unity docs, Godot guides |
| **Blockchain** | 2 | OpenZeppelin, Hardhat |
| **Education** | 2 | Coding Interview University, Project Based Learning |
| **Robotics** | 1 | ArduPilot |
| **State Management** | 1 | State management patterns |
| **CI/CD** | 1 | GitHub Actions, GitLab CI |
## π How to Contribute
### 1. Adding New Repositories
The easiest way to contribute is by adding high-quality repositories to our curated list.
#### Selection Criteria
- β Minimum 1,000 stars (flexible for niche but essential tools)
- π Comprehensive documentation (README, guides, examples)
- π Active maintenance (recent commits, responsive issues)
- π Community adoption (used in production)
- π― Fills a gap in our current coverage
#### How to Add a Repository
1. **Edit `repositories.yaml`**:
```yaml
- name: "owner/repo-name"
category: "Category Name"
description: "Brief description of what this repo does"
documentation_patterns:
- "README.md"
- "docs/**/*.md"
- "examples/**/*.md"
skip_patterns:
- "node_modules"
- "test"
```
2. **Choose the Right Category**:
- Use existing categories when possible
- Create new categories only for distinct technology areas
- Aim for 5+ repos per category
3. **Documentation Patterns**:
- Include main documentation files
- Add subdirectory patterns for nested docs
- Include example directories if educational
### 2. Expanding Trading & Finance Coverage
Our Trading & Finance category is our crown jewel with 64 specialized repositories. Here's what we cover:
#### Current Specializations
1. **Core Trading** (21 repos) - Backtesting, bots, technical analysis
2. **Options Trading** (3 repos) - Greeks, pricing models, strategies
3. **Forex Trading** (3 repos) - Currency markets, MT5 integration
4. **High-Frequency Trading** (4 repos) - Microstructure, tick data
5. **Market Making** (2 repos) - Order book dynamics, MM algorithms
6. **Portfolio Optimization** (3 repos) - Modern portfolio theory, ML
7. **Arbitrage** (3 repos) - Cross-exchange, triangular, statistical
8. **Risk Management** (3 repos) - VaR, stress testing, metrics
9. **Sentiment Analysis** (2 repos) - FinBERT, news sentiment
10. **DeFi/Blockchain** (4 repos) - Uniswap, Aave, MEV strategies
11. **Fixed Income** (3 repos) - Bond pricing, yield curves
12. **Commodities** (2 repos) - Futures, energy markets
13. **Derivatives** (3 repos) - Complex derivatives pricing
14. **Market Data** (3 repos) - Real-time feeds, data vendors
15. **Trading Infrastructure** (5 repos) - Arctic, Nautilus, FIX
#### Adding Trading Repositories
When adding trading/finance repos, ensure they:
- Have production-ready code
- Include backtesting capabilities
- Document risk management
- Provide real examples
### 3. Code Contributions
#### Architecture Overview
```
unified-docs-hub/
βββ unified_docs_hub_server.py # Main MCP server
βββ database.py # SQLite + FTS5 operations
βββ github_client.py # GitHub API integration
βββ quality_scorer.py # Documentation quality metrics
βββ search_analytics.py # Search behavior tracking
βββ format_handlers.py # Multi-format support
βββ response_limiter.py # Rate limiting
βββ automated_index_updater.py # Daily update automation
```
#### Key Components
1. **Database Layer** (`database.py`)
- SQLite with FTS5 for full-text search
- Document versioning
- Quality score storage
- Search analytics
2. **GitHub Integration** (`github_client.py`)
- Repository discovery
- Documentation fetching
- Rate limit handling
- Update checking
3. **Quality Scoring** (`quality_scorer.py`)
- Multi-metric evaluation
- Letter grades (A+ to F)
- Improvement suggestions
- Freshness tracking
4. **Format Support** (`format_handlers.py`)
- Markdown, MDX, RST, AsciiDoc, Jupyter
- Unified markdown conversion
- Code block extraction
#### Development Setup
```bash
# Clone the repository
git clone https://github.com/boodrow/unified-docs-hub.git
cd unified-docs-hub
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set up environment
export GITHUB_TOKEN=your_github_token
# Run the server
python unified_docs_hub_server.py
```
### 4. Testing Your Changes
Before submitting a PR:
1. **Test Repository Addition**:
```python
# In Python REPL
from database import UnifiedDocsDatabase
from github_client import GitHubClient
db = UnifiedDocsDatabase()
client = GitHubClient("your_token")
# Test indexing your new repo
await client.index_repository("owner/repo", db)
```
2. **Test Search**:
```python
# Search for content from your repo
results = db.search("your search terms")
print(f"Found {len(results)} results")
```
3. **Verify Quality Scores**:
```python
from quality_scorer import DocumentQualityScorer
scorer = DocumentQualityScorer(client)
score = await scorer.calculate_score("owner/repo", doc_content)
```
### 5. Automated Indexing System
Our automated system ensures the knowledge base stays current:
#### Daily Updates (via cron)
- Updates all indexed repositories
- Checks for new documentation
- Recalculates quality scores
- Logs changes and errors
#### Weekly Discovery
- Finds new popular repositories
- Auto-adds repos with 10k+ stars
- Categorizes using AI/heuristics
- Sends summary reports
#### Setting Up Automation
```bash
# Make scripts executable
chmod +x automated_index_updater.py
chmod +x setup_automated_indexing.sh
chmod +x run_automated_index.sh
# Install cron jobs
./setup_automated_indexing.sh
```
## π Recent Expansions
### Phase 1: Initial Curated Set (34 repos)
- Core web frameworks
- Essential tools
- Popular languages
### Phase 2: Trending Additions (+51 repos)
- AI/ML tools (DSPy, MCP agents)
- Education resources
- Blockchain development
- Data engineering stack
### Phase 3: Specialized Categories (+21 repos)
- Observability (Prometheus, Grafana)
- MLOps (MLflow, Kubeflow)
- Desktop frameworks (Electron, Tauri)
- Analytics & BI (Superset, Metabase)
### Phase 4: Trading Mega-Expansion (+43 repos)
- Options trading (Greeks, volatility)
- Forex markets
- High-frequency trading
- Market making
- DeFi protocols
- Risk management
### Phase 5: Ultimate Trading Coverage (+21 more)
- Commodities & futures
- Energy markets
- Fixed income
- Derivatives
- Infrastructure
## π― Contribution Ideas
### High-Priority Additions
1. **DevOps Tools**: More Kubernetes operators, service meshes
2. **Data Science**: Jupyter extensions, visualization libraries
3. **API Development**: GraphQL tools, API gateways
4. **Testing Frameworks**: Property-based testing, load testing
5. **Documentation Tools**: Static site generators, API doc tools
### Feature Enhancements
1. **Multi-language Support**: Internationalization
2. **Custom Sources**: Support for non-GitHub docs
3. **Smart Categorization**: ML-based auto-categorization
4. **Quality Improvements**: Better scoring algorithms
5. **Search Enhancements**: Semantic search, filters
### Infrastructure
1. **Performance**: Query optimization, caching
2. **Monitoring**: Health checks, metrics
3. **Deployment**: Docker, Kubernetes configs
4. **Documentation**: Video tutorials, examples
## π Pull Request Guidelines
### PR Checklist
- [ ] Follows existing code style
- [ ] Includes appropriate documentation
- [ ] Adds tests for new functionality
- [ ] Updates README if needed
- [ ] Commits are clear and atomic
- [ ] No sensitive data (tokens, keys)
### Commit Messages
```
feat: Add support for GitLab repositories
fix: Handle rate limiting in GitHub client
docs: Update trading category documentation
chore: Upgrade dependencies
```
## π€ Community
### Getting Help
- **Issues**: Use GitHub issues for bugs/features
- **Discussions**: GitHub Discussions for questions
- **Discord**: Join MCP Discord server
- **Email**: Contact maintainers
### Code of Conduct
- Be respectful and inclusive
- Welcome newcomers
- Provide constructive feedback
- Focus on what's best for the project
## π Making an Impact
Your contributions help thousands of developers and AI assistants access better documentation. Whether you add a single repository or implement a new feature, you're making AI-assisted development more powerful and accessible.
### Recognition
- Contributors are listed in README
- Significant contributions highlighted
- Regular contributors get maintainer status
## π
Roadmap
### Q3 2025
- Semantic search implementation
- Multi-source support (GitLab, Bitbucket)
- Real-time documentation updates
### Q4 2025
- Documentation quality badges
- Community curation platform
- Enterprise features
### 2026
- Federated documentation network
- AI-powered categorization
- Interactive documentation explorer
---
**Ready to contribute?** Pick an area that interests you and dive in! Every contribution, no matter how small, makes our documentation ecosystem better. π