Skip to main content
Glama
ARCHITECTURE.md7.56 kB
# octocode-utils Architecture Shared utilities for Octocode MCP packages, focused on AI/LLM token efficiency. ## Package Overview | Property | Value | |----------|-------| | **Version** | 5.0.0 | | **Purpose** | Content transformation utilities for optimal AI consumption | | **Node** | >=20.0.0 | | **Format** | ESM + CJS dual export | ## Directory Structure ``` octocode-utils/ ├── src/ │ ├── index.ts # Public API exports │ ├── jsonToYamlString.ts # JSON→YAML conversion (2.5KB) │ └── minifier.ts # Multi-strategy minification (21KB) ├── tests/ │ ├── jsonToYamlString.config.test.ts # Config options testing │ ├── jsonToYamlString.output.test.ts # Token efficiency validation │ └── minifier.test.ts # Strategy & error handling ├── dist/ # Build output (CJS + ESM + types) └── package.json ``` ## Public API ```typescript // Main exports export { jsonToYamlString, YamlConversionConfig } from './jsonToYamlString'; export { minifyContent } from './minifier'; ``` --- ## Utility #1: jsonToYamlString ### Purpose Converts JSON to YAML format optimized for **AI/LLM token efficiency** (20-40% token reduction compared to JSON). ### API ```typescript function jsonToYamlString( jsonObject: unknown, config?: YamlConversionConfig ): string interface YamlConversionConfig { sortKeys?: boolean; // Alphabetical sorting keysPriority?: string[]; // Priority keys appear first (recursive) } ``` ### Implementation Details Uses `js-yaml` library with optimized settings: | Option | Value | Purpose | |--------|-------|---------| | `forceQuotes` | `true` | Consistent string quoting | | `quotingType` | `"` | Double quotes for JSON-like syntax | | `lineWidth` | `-1` | No line wrapping | | `noRefs` | `true` | No YAML references | | `indent` | `2` | 2-space indentation | ### Key Sorting Logic 1. **No config** → Preserves original key order 2. **`sortKeys: true`** → Alphabetical via `localeCompare` 3. **`keysPriority: [...]`** → Priority keys first, then alphabetical **Important**: Priority sorting is **recursive** - applies to all nested objects at every level. ### Error Handling ``` YAML conversion fails └→ Fallback to JSON.stringify(obj, null, 2) └→ If that fails, return commented error message ``` --- ## Utility #2: minifyContent ### Purpose Multi-strategy content minification for **50+ file types**, optimized for token reduction while preserving semantic meaning. ### API ```typescript async function minifyContent( content: string, filePath: string ): Promise<{ content: string; failed: boolean; type: 'terser' | 'conservative' | 'aggressive' | 'json' | 'general' | 'markdown' | 'failed'; reason?: string; }> ``` ### Strategy Selection Strategy is selected based on file extension via `MINIFY_CONFIG.fileTypes`: | Strategy | File Types | Method | |----------|------------|--------| | **terser** | `.js`, `.ts`, `.jsx`, `.tsx`, `.mjs`, `.cjs` | AST-based minification via Terser | | **aggressive** | `.html`, `.css`, `.go`, `.java`, `.rust`, `.swift`, `.kotlin`, etc. | Library-based (CleanCSS/html-minifier) + regex fallback | | **conservative** | `.py`, `.yaml`, `.yml`, `.coffee`, `.sass`, `.pug` | Remove comments, preserve indentation | | **json** | `.json` | `JSON.parse` → `JSON.stringify` (compact) | | **markdown** | `.md`, `.markdown` | Remove HTML comments, normalize spacing | | **general** | `.txt`, `.log`, unknown extensions | Basic whitespace normalization | ### Comment Pattern Groups The minifier recognizes different comment syntaxes grouped by language family: | Group | Pattern Examples | Languages | |-------|------------------|-----------| | `c-style` | `/* */`, `//` | JS, Go, Java, C, Rust, Swift | | `hash` | `#` | Python, Ruby, Shell, YAML | | `html` | `<!-- -->` | HTML, XML, SVG | | `sql` | `--`, `/* */` | SQL | | `lua` | `--`, `--[[ ]]` | Lua | | `template` | `{{!-- --}}`, `<%# %>`, `{# #}` | Handlebars, EJS, Jinja, Twig | | `haskell` | `--`, `{- -}` | Haskell | ### Special File Handling **Indentation-Sensitive Files** (use `conservative` strategy): - `Makefile`, `Dockerfile`, `Procfile`, `Jenkinsfile` - `Rakefile`, `Gemfile`, `Podfile`, `Vagrantfile` - And other build/config files without extensions ### Safety Features 1. **Size Limit**: 1MB maximum (returns original with error if exceeded) 2. **Graceful Fallbacks**: Regex-based minification if library fails 3. **Never Throws**: Always returns content, even on failure 4. **Indentation Preservation**: Detects and protects sensitive files ### Helper Exports ```typescript // Check if file uses Terser strategy isJavaScriptFileV2(filePath: string): boolean // Check if file is indentation-sensitive isIndentationSensitiveV2(filePath: string): boolean // Full configuration object MINIFY_CONFIG: MinifyConfig ``` --- ## Dependencies | Package | Version | Purpose | |---------|---------|---------| | `terser` | ^5.19.0 | JavaScript/TypeScript AST minification | | `clean-css` | ^5.3.0 | CSS minification | | `html-minifier-terser` | ^7.2.0 | HTML minification | | `js-yaml` | ^4.1.0 | YAML serialization | ### Dev Dependencies (Testing) | Package | Purpose | |---------|---------| | `@microsoft/tiktokenizer` | Token counting for efficiency validation | | `vitest` | Test framework | --- ## Design Principles ### 1. Token Efficiency First Primary goal is reducing LLM token consumption: - YAML format uses fewer tokens than JSON - Comment removal reduces noise - Whitespace normalization compacts output ### 2. Graceful Degradation Every function returns usable content: ``` Primary method fails └→ Try fallback method └→ Return original content with error flag ``` ### 3. File-Type Awareness Strategy selection based on: 1. File extension (primary) 2. Filename patterns (Makefile, Dockerfile, etc.) 3. Sensible defaults for unknown types ### 4. Recursive Configuration Key sorting in `jsonToYamlString` applies uniformly at all nesting levels, ensuring consistent structure throughout complex objects. --- ## Usage Patterns ### Basic JSON to YAML ```typescript import { jsonToYamlString } from 'octocode-utils'; const data = { name: "John", age: 30, roles: ["admin"] }; const yaml = jsonToYamlString(data); // name: "John" // age: 30 // roles: // - "admin" ``` ### Priority-Based Key Sorting ```typescript const yaml = jsonToYamlString(data, { keysPriority: ["id", "name", "type"] }); // id appears first, then name, then type, then alphabetical ``` ### Content Minification ```typescript import { minifyContent } from 'octocode-utils'; const result = await minifyContent(jsCode, 'app.ts'); if (!result.failed) { console.log(`Minified with ${result.type} strategy`); console.log(result.content); } ``` --- ## Test Coverage | Test File | Lines | Focus | |-----------|-------|-------| | `minifier.test.ts` | ~422 | Strategy selection, comment removal, error handling | | `jsonToYamlString.config.test.ts` | ~415 | Config options, key sorting, recursion | | `jsonToYamlString.output.test.ts` | ~735 | Token efficiency validation | Tests use `@microsoft/tiktokenizer` (`cl100k_base` encoder) to verify actual token savings. --- ## Integration Points This package is consumed by: - `octocode-mcp` - GitHub code research MCP server - `octocode-local` - Local filesystem MCP server Both use these utilities to optimize content before returning to LLM clients.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/bgauryy/octocode-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server