Doclea MCP

library-replacement-plan.md•5.85 KiB

# Doclea MCP - Library Replacement Plan

## Status

**Completed:**
- P0: SCIP code analysis (interfaces/types now detected)
- P2: lru-cache (293 LOC → 163 LOC)
- P3: string-similarity (151 LOC → 94 LOC)

**Skipped:**
- Drizzle ORM (not needed - raw SQL works fine)
- LangChain embeddings (not needed now)

---

## New Opportunities (Deep Sweep Results)

### Tier 1: Quick Wins (< 1 hour each)

| Item | Current | Replace With | Savings | Effort |
|------|---------|--------------|---------|--------|
| Token counting | 264 LOC custom | `js-tiktoken` | ~200 LOC | 30 min |
| Percentile calc | Hand-rolled | `simple-statistics` | ~50 LOC | 15 min |
| Stop words | Hardcoded array | `stopword` | ~20 LOC | 10 min |
| Tag slugs | Manual normalize | `slugify` | ~30 LOC | 10 min |
| Date operations | Manual Date math | `date-fns` | Cleaner code | 20 min |
| ID generation | crypto.randomUUID | `nanoid` | Smaller, faster | 10 min |

### Tier 2: Medium Effort (1-3 hours)

| Item | Current | Replace With | Savings | Effort |
|------|---------|--------------|---------|--------|
| Qdrant client | 185 LOC custom | `@langchain/qdrant` | ~100 LOC | 1 hour |
| Config loading | Manual JSON parse | `cosmiconfig` | Better UX | 2 hours |
| Deep merge | Manual recursion | `deepmerge` | ~40 LOC | 30 min |
| Decay functions | Hand-rolled math | `d3-scale` | Cleaner code | 1 hour |

### Tier 3: Optional / Future

| Item | Current | Replace With | Notes |
|------|---------|--------------|-------|
| NLP processing | None | `natural` | For better keyword extraction |
| Graph analysis | Manual arrays | `graph-data-structure` | If relation analysis grows |
| Feature flags | Custom A/B | `growthbook` | If experiments scale |
| Logging | console.log | `pino` | If observability needed |

---

## Recommended Implementation: Tier 1 Only

Focus on quick wins with highest ROI:

### 1. Token Counting → js-tiktoken

**Current:** `src/utils/tokens.ts` (264 LOC)
- Custom tokenizer caching
- Manual model fallback logic
- HuggingFace transformers overhead

**After:**

```bash
bun add js-tiktoken
```

```typescript
import { getEncoding } from "js-tiktoken";

const enc = getEncoding("cl100k_base"); // GPT-4/3.5 tokenizer
const tokens = enc.encode(text);
```

**Files Changed:**
- REWRITE `src/utils/tokens.ts` (264 → ~60 LOC)

---

### 2. Percentile Calculations → simple-statistics

**Current:** `src/ab-testing/metrics-collector.ts` lines 310-324
- Hand-rolled p50, p95, p99 calculation

**After:**

```bash
bun add simple-statistics
```

```typescript
import { quantile } from "simple-statistics";

const p50 = quantile(values, 0.5);
const p95 = quantile(values, 0.95);
```

**Files Changed:**
- MODIFY `src/ab-testing/metrics-collector.ts` (~15 LOC removed)

---

### 3. Stop Words → stopword

**Current:** `src/tagging/taxonomy.ts` - hardcoded STOP_WORDS array

**After:**

```bash
bun add stopword
```

```typescript
import { removeStopwords } from "stopword";

const keywords = removeStopwords(words);
```

**Files Changed:**
- MODIFY `src/tagging/taxonomy.ts` (~20 LOC removed)

---

### 4. Tag Normalization → slugify

**Current:** Manual lowercase + replace logic in taxonomy

**After:**

```bash
bun add slugify
```

```typescript
import slugify from "slugify";

const normalizedTag = slugify(tag, { lower: true, strict: true });
```

**Files Changed:**
- MODIFY `src/tagging/taxonomy.ts` (~30 LOC simplified)

---

### 5. Date Operations → date-fns

**Current:** Manual Date arithmetic in staleness/scoring

**After:**

```bash
bun add date-fns
```

```typescript
import { differenceInDays, subDays, isAfter } from "date-fns";

const daysSince = differenceInDays(new Date(), createdAt);
```

**Files Changed:**
- MODIFY `src/staleness/*.ts` (cleaner code)
- MODIFY `src/scoring/*.ts` (cleaner code)

---

### 6. ID Generation → nanoid

**Current:** `crypto.randomUUID()` or SHA-256 hashing

**After:**

```bash
bun add nanoid
```

```typescript
import { nanoid } from "nanoid";

const id = nanoid(); // 21 chars, URL-safe
const shortId = nanoid(10); // custom length
```

**Benefits:**
- Claimed ~2x faster than UUID generation (worth benchmarking against native `crypto.randomUUID()`)
- Smaller (21 chars vs 36)
- URL-safe by default

**Files Changed:**
- MODIFY ID generation across codebase

---

## What NOT to Replace (Core IP)

These are well-designed and provide competitive advantage:

| Module | LOC | Reason to Keep |
|--------|-----|----------------|
| Code chunking | 1,321 | Tree-sitter integration, semantic splitting |
| Markdown chunking | 545 | Header-aware, metadata tracking |
| Staleness detection | 577 | Multi-strategy, domain-specific |
| Scoring system | 779 | Multi-factor, explainable |
| Relation detection | 2,457 | Cross-layer, LLM-augmented |
| Vector abstraction | 645 | Multi-backend (sqlite-vec, libSQL) |
| Database layer | 2,217 | Clean SQL, no ORM needed |

---

## External Tools (Other Languages)

Analyzed but NOT recommended:

| Tool | Language | Why Skip |
|------|----------|----------|
| ripgrep | Rust | Bun glob already fast enough |
| tree-sitter-cli | Rust | Already using JS binding |
| scip-python | Python | Only if Python projects needed |
| sentence-transformers | Python | HF transformers.js sufficient |

---

## Install Commands

```bash
# Tier 1 (Quick Wins)
bun add js-tiktoken simple-statistics stopword slugify date-fns nanoid

# Tier 2 (Optional)
bun add @langchain/qdrant cosmiconfig deepmerge d3-scale
```

---

## Verification

After Tier 1:

```bash
# Build should pass
bun build src/index.ts --outdir=dist --target=bun

# Tests should pass
bun test

# Token counting should work
bun run -e "import { getEncoding } from 'js-tiktoken'; console.log(getEncoding('cl100k_base').encode('hello').length)"
```

---

## Summary

| Tier | Items | LOC Saved | Effort |
|------|-------|-----------|--------|
| Tier 1 | 6 packages | ~300 LOC | 2 hours |
| Tier 2 | 4 packages | ~150 LOC | 4 hours |
| **Total** | 10 packages | **~450 LOC** | 6 hours |

**Recommendation:** Do Tier 1 only. High ROI, low risk.
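
One check that may be worth adding to the verification step for the percentile swap: quantile conventions differ between implementations, so p95 and p99 can shift on small samples when the hand-rolled code is replaced. The sketch below is illustrative only. The plan does not show the existing logic in `metrics-collector.ts`, so `percentileLinear`, `percentileNearestRank`, and the `latenciesMs` sample are hypothetical stand-ins for two common conventions, not the project's code or the library's method.

```typescript
// Hypothetical reference implementations for a before/after parity check.
// Neither is claimed to match metrics-collector.ts or simple-statistics.

function sortedCopy(values: number[]): number[] {
  if (values.length === 0) throw new Error("empty sample");
  return [...values].sort((a, b) => a - b);
}

/** Percentile by linear interpolation between the two closest ranks (p in [0, 1]). */
function percentileLinear(values: number[], p: number): number {
  if (p < 0 || p > 1) throw new Error("p must be in [0, 1]");
  const sorted = sortedCopy(values);
  const rank = p * (sorted.length - 1);
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (rank - lo);
}

/** Percentile by nearest rank: the smallest value with at least a fraction p of the sample at or below it. */
function percentileNearestRank(values: number[], p: number): number {
  if (p < 0 || p > 1) throw new Error("p must be in [0, 1]");
  const sorted = sortedCopy(values);
  const idx = Math.max(0, Math.ceil(p * sorted.length) - 1);
  return sorted[idx];
}

// Made-up latency sample with two outliers.
const latenciesMs = [12, 15, 11, 90, 14, 13, 16, 18, 17, 250];

console.log("p50:", percentileLinear(latenciesMs, 0.5), percentileNearestRank(latenciesMs, 0.5));
console.log("p95:", percentileLinear(latenciesMs, 0.95), percentileNearestRank(latenciesMs, 0.95));
```

On this sample the two conventions disagree noticeably at p95, which is the kind of drift a parity test would catch. Recording the current outputs for a fixed sample before the swap, then comparing them with `quantile` afterwards, keeps any change in reported metrics deliberate.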

