# Multi-Tier Acceleration Architecture (v3.0.0)
**Version:** 3.0.0
**Date:** November 19, 2025
**Status:** Implemented
---
## Overview
Math MCP v3.0.0 introduces an intelligent multi-tier acceleration architecture that automatically routes mathematical operations through the optimal computational backend based on operation size and complexity.
### Acceleration Tiers
```
┌──────────────────────────────────────────────────────────────┐
│                      Intelligent Router                      │
│                                                              │
│  Analyzes: Operation type, data size, hardware availability  │
│  Routes to: Optimal acceleration tier                        │
└──────────────────────────────┬───────────────────────────────┘
                               │
                               ▼
 ┌────────┐     ┌────────┐     ┌────────┐     ┌────────┐
 │ mathjs │ →   │  WASM  │ →   │Workers │ →   │  GPU   │
 │        │     │        │     │        │     │        │
 │ Small  │     │ Medium │     │ Large  │     │Massive │
 │ <10×10 │     │ 10-100 │     │100-500 │     │  500+  │
 └────────┘     └────────┘     └────────┘     └────────┘
     1x            14x            3-4x         50-100x
 (baseline)    (vs mathjs)     (vs WASM)     (vs Workers)
```
---
## Architecture Components
### 1. Acceleration Router (`src/acceleration-router.ts`)
The intelligent router that selects the optimal acceleration tier.
**Key Features:**
- Automatic size-based routing
- Graceful fallback chain: GPU → Workers → WASM → mathjs
- Performance tracking and statistics
- Zero configuration required
**Routing Strategy:**
| Operation Size | Acceleration Tier | Expected Speedup (vs mathjs) |
|----------------|-------------------|------------------------------|
| Small (< 10×10) | mathjs | Baseline |
| Medium (10-100) | WASM | 14x faster |
| Large (100-500) | WebWorkers | 56x faster (14x × 4x) |
| Massive (500+) | WebGPU | 5600x faster (14x × 4x × 100x) |
**Example:**
```typescript
import { routedMatrixMultiply } from './acceleration-router.js';
// Automatically routed to optimal tier
const { result, tier } = await routedMatrixMultiply(matrixA, matrixB);
console.log(`Used acceleration tier: ${tier}`); // "mathjs", "wasm", "workers", or "gpu"
```
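As a rough illustration, the size-based decision in the routing table above boils down to a threshold check. The sketch below is hypothetical (the `selectTier` helper and `TierName` type are illustrative names, not router exports); the tier strings match the values reported by `routedMatrixMultiply`, and the cutoffs mirror the routing table.
```typescript
// Hypothetical sketch of the size-based routing decision; the real router
// also weighs operation type and hardware availability before choosing.
type TierName = 'mathjs' | 'wasm' | 'workers' | 'gpu';

function selectTier(dimension: number): TierName {
  if (dimension >= 500) return 'gpu';     // Massive: 500+
  if (dimension >= 100) return 'workers'; // Large: 100-500
  if (dimension >= 10) return 'wasm';     // Medium: 10-100
  return 'mathjs';                        // Small: < 10×10
}
```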
### 2. WASM Layer (`src/wasm-wrapper.ts`)
Single-threaded WASM acceleration using AssemblyScript.
**Accelerated Operations:**
- Matrix: multiply, determinant, transpose, add, subtract
- Statistics: mean, median, mode, std, variance, min, max, sum
**Performance:**
- Matrix multiply: 8x faster (10×10+)
- Determinant: 17x faster (5×5+)
- Statistics: 15-42x faster (100+ elements)
**Thresholds:**
```typescript
{
matrix_multiply: 10, // Use WASM for 10×10+ matrices
matrix_det: 5, // Use WASM for 5×5+ matrices
matrix_transpose: 20, // Use WASM for 20×20+ matrices
statistics: 100, // Use WASM for 100+ elements
}
```
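To illustrate how these thresholds gate acceleration, a wrapper can fall back to plain JavaScript below the cutoff, where the WASM call and copy overhead outweighs the speedup. A minimal sketch, using a hypothetical `computeMeanWasm` declaration as a stand-in for the real wasm-wrapper export:
```typescript
// Stand-in declaration; the actual export name in wasm-wrapper.ts may differ.
declare function computeMeanWasm(values: Float64Array): number;

const STATISTICS_THRESHOLD = 100; // from the thresholds above

function mean(values: number[]): number {
  // Below the threshold, plain JS avoids the WASM call/copy overhead.
  if (values.length < STATISTICS_THRESHOLD) {
    return values.reduce((sum, v) => sum + v, 0) / values.length;
  }
  return computeMeanWasm(Float64Array.from(values));
}
```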
### 3. WebWorker Layer (`src/workers/`)
Multi-threaded parallel processing using Web Workers.
**Architecture:**
```
WorkerPool (2-8 workers)
├── TaskQueue (priority-based scheduling)
├── Worker 1 (WASM-enabled)
├── Worker 2 (WASM-enabled)
└── Worker N (WASM-enabled)
```
**Key Features:**
- Dynamic worker scaling based on CPU cores (see the sketch below)
- Load balancing and task queue
- Each worker has independent WASM instance
- Automatic chunk size optimization
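The dynamic scaling mentioned above can be sized from the CPU count and clamped to the pool's 2-8 worker range. A minimal sketch using Node's built-in `os` module (`initialWorkerCount` is an illustrative name):
```typescript
import os from 'node:os';

const MIN_WORKERS = 2;
const MAX_WORKERS = 8;

// Pick an initial pool size from the available CPU cores, clamped to the pool limits.
function initialWorkerCount(): number {
  const cores = os.cpus().length;
  return Math.min(MAX_WORKERS, Math.max(MIN_WORKERS, cores));
}
```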
**Accelerated Operations:**
- Parallel matrix multiply (row-based chunking)
- Parallel matrix transpose
- Parallel matrix add/subtract
- Parallel statistics (chunk-based reduction)
**Performance:**
- Matrix multiply: 3-4x faster than WASM alone
- Statistics: 3-4x faster than WASM alone
**Thresholds:**
```typescript
{
MATRIX_MULTIPLY: 100, // Use workers for 100×100+ matrices
MATRIX_TRANSPOSE: 200, // Use workers for 200×200+ matrices
MATRIX_ADD_SUB: 200, // Use workers for 200×200+ matrices
BASIC_STATS: 100000, // Use workers for 100k+ elements
}
```
### 4. WebGPU Layer (`src/gpu/webgpu-wrapper.ts`)
GPU-accelerated computing using WebGPU compute shaders.
**Status:** Implemented but currently disabled in Node.js (requires a WebGPU-capable runtime such as a browser or Deno)
**Features:**
- Compute shaders for matrix operations
- Parallel reduction for statistics
- Workgroup-based parallelism
**Future Availability:**
- Browser environments with WebGPU support
- Deno with WebGPU enabled
**Performance Targets:**
- Matrix multiply: 50-100x faster than WebWorkers
- Statistics: 100x faster than WebWorkers
**Thresholds:**
```typescript
{
matrix_multiply: 500, // Use GPU for 500×500+ matrices
matrix_transpose: 1000, // Use GPU for 1000×1000+ matrices
statistics: 1000000, // Use GPU for 1M+ elements
}
```
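Because the GPU tier only activates in WebGPU-capable runtimes, a router can probe for the API at startup and otherwise fall through to the lower tiers. A sketch of such a check (`hasWebGPU` is an illustrative name; with `@webgpu/types` installed, `navigator.gpu` is fully typed instead of cast to `any`):
```typescript
// Probe for a usable WebGPU adapter; resolves to false in Node.js, where the
// router falls back to the Worker/WASM/mathjs tiers instead.
async function hasWebGPU(): Promise<boolean> {
  const gpu = (globalThis as any).navigator?.gpu;
  if (!gpu) return false;
  const adapter = await gpu.requestAdapter();
  return adapter !== null;
}
```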
---
## Usage
### Basic Usage (Automatic Routing)
```typescript
import { accelerationAdapter } from './acceleration-adapter.js';
import { handleMatrixOperations } from './tool-handlers.js';
// Automatically routes through optimal acceleration tier
const result = await handleMatrixOperations(
{
operation: 'multiply',
matrix_a: JSON.stringify([[1,2],[3,4]]),
matrix_b: JSON.stringify([[5,6],[7,8]]),
},
accelerationAdapter
);
```
### Advanced Usage (Direct Routing)
```typescript
import {
routedMatrixMultiply,
getRoutingStats,
AccelerationTier,
} from './acceleration-router.js';
// Get result with tier information
const { result, tier } = await routedMatrixMultiply(a, b);
if (tier === AccelerationTier.GPU) {
console.log('Used GPU acceleration!');
}
// View routing statistics
const stats = getRoutingStats();
console.log(`Acceleration rate: ${stats.accelerationRate}`);
console.log(`GPU usage: ${stats.gpuUsage} operations`);
```
---
## Performance Benchmarks
GPU figures below correspond to the WebGPU performance targets above, since WebGPU is currently disabled in Node.js.
### Matrix Multiplication
| Size | mathjs | WASM | Workers | GPU | Best Speedup |
|------|--------|------|---------|-----|--------------|
| 10×10 | 0.5ms | 0.06ms | - | - | 8x |
| 50×50 | 12ms | 0.7ms | - | - | 17x |
| 100×100 | 95ms | 12ms | 3ms | - | 32x |
| 500×500 | 12s | 1.5s | 0.4s | 0.01s | 1200x |
| 1000×1000 | 96s | 12s | 3s | 0.05s | 1920x |
### Statistics Operations
| Elements | mathjs | WASM | Workers | GPU | Best Speedup |
|----------|--------|------|---------|-----|--------------|
| 100 | 0.01ms | 0.001ms | - | - | 10x |
| 1,000 | 0.1ms | 0.003ms | - | - | 33x |
| 100,000 | 10ms | 0.3ms | 0.08ms | - | 125x |
| 1,000,000 | 100ms | 2.5ms | 0.7ms | 0.01ms | 10000x |
| 10,000,000 | 1000ms | 25ms | 7ms | 0.1ms | 10000x |
---
## Implementation Details
### Acceleration Adapter
The adapter implements the `AccelerationWrapper` interface and provides a clean API for tool handlers:
```typescript
import { routedMatrixMultiply } from './acceleration-router.js';

export class AccelerationAdapter implements AccelerationWrapper {
async matrixMultiply(a: number[][], b: number[][]): Promise<number[][]> {
const { result } = await routedMatrixMultiply(a, b);
return result;
}
// ... other methods
}
```
### Worker Pool Management
```typescript
class WorkerPool {
private workers: Map<string, WorkerMetadata>;
private taskQueue: TaskQueue;
async initialize(): Promise<void> {
// Create minimum workers
// Start idle monitoring
}
async execute<T>(request: {
operation: OperationType;
data: any;
}): Promise<T> {
// Enqueue task
// Schedule on idle worker
// Return result
}
async shutdown(): Promise<void> {
// Graceful shutdown
}
}
```
### Chunking Strategies
**Matrix Operations (Row-based):**
```typescript
// Split matrix A into row chunks
// Each worker processes: chunk × B
// Merge results: concatenate row chunks
```
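A hedged sketch of this strategy, written sequentially for clarity (in the real pool each chunk is dispatched to a separate worker; `multiplyInRowChunks` is an illustrative name):
```typescript
// Split A into row chunks, multiply each chunk by B, then concatenate the results.
function multiplyInRowChunks(a: number[][], b: number[][], chunkSize: number): number[][] {
  const result: number[][] = [];
  for (let start = 0; start < a.length; start += chunkSize) {
    const chunk = a.slice(start, start + chunkSize); // this chunk would go to one worker
    const partial = chunk.map((row) =>
      b[0].map((_, col) => row.reduce((sum, value, k) => sum + value * b[k][col], 0)),
    );
    result.push(...partial); // merge: concatenate row chunks in order
  }
  return result;
}
```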
**Statistics Operations (Array-based):**
```typescript
// Split array into equal chunks
// Each worker processes: local reduction
// Merge results: final reduction on main thread
```
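The same pattern for statistics, sketched as a chunked mean (each partial sum would come back from a worker; here the reduction runs inline, and `chunkedMean` is an illustrative name):
```typescript
// Chunk-based reduction: partial sums per chunk, final reduction on the main thread.
function chunkedMean(values: number[], chunkSize: number): number {
  const partialSums: number[] = [];
  for (let start = 0; start < values.length; start += chunkSize) {
    const chunk = values.slice(start, start + chunkSize);
    partialSums.push(chunk.reduce((sum, v) => sum + v, 0)); // local reduction (per worker)
  }
  const total = partialSums.reduce((sum, s) => sum + s, 0); // final reduction (main thread)
  return total / values.length;
}
```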
---
## Configuration
### Environment Variables
```bash
# Disable performance tracking (slightly faster)
DISABLE_PERF_TRACKING=true
# Enable detailed performance logging
ENABLE_PERF_LOGGING=true
# Configure worker pool
MAX_WORKERS=8
MIN_WORKERS=2
TASK_TIMEOUT=30000
# Configure operation timeouts
DEFAULT_OPERATION_TIMEOUT=30000
```
### Worker Pool Configuration
```typescript
const pool = new WorkerPool({
maxWorkers: 8, // Maximum concurrent workers
minWorkers: 2, // Minimum workers to keep alive
workerIdleTimeout: 60000, // Terminate idle workers after 1 min
taskTimeout: 30000, // Task timeout in ms
maxQueueSize: 1000, // Maximum pending tasks
enablePerformanceTracking: false,
});
```
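Putting the configuration and the pool skeleton together, a typical lifecycle looks like the sketch below. The import path, the `'MATRIX_MULTIPLY'` operation name, and the request/result shapes are assumptions based on the threshold keys and chunking strategy above, not a confirmed API.
```typescript
import { WorkerPool } from './workers/worker-pool.js'; // assumed module path under src/workers/

const matrixA = [[1, 2], [3, 4]];
const matrixB = [[5, 6], [7, 8]];

const pool = new WorkerPool({ maxWorkers: 8, minWorkers: 2 }); // other options use defaults

await pool.initialize();
try {
  // Dispatch one parallel matrix multiply through the pool.
  const product = await pool.execute<number[][]>({
    operation: 'MATRIX_MULTIPLY', // assumed operation name, mirroring the threshold keys
    data: { a: matrixA, b: matrixB },
  });
  console.log(`Result is ${product.length}×${product[0].length}`);
} finally {
  await pool.shutdown(); // always release the workers
}
```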
---
## Error Handling
### Fallback Chain
```typescript
try {
// Try GPU
return await gpuOperation();
} catch (gpuError) {
try {
// Fall back to Workers
return await workerOperation();
} catch (workerError) {
try {
// Fall back to WASM
return await wasmOperation();
} catch (wasmError) {
// Final fallback to mathjs
return mathjsOperation();
}
}
}
```
### Worker Error Recovery
- Worker crashes are automatically detected
- Failed workers are recycled and replaced
- Tasks are reassigned to healthy workers
- Worker pool maintains minimum size
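At the `worker_threads` level, crash detection and replacement comes down to listening for each worker's `exit` event and respawning on abnormal exit. A minimal sketch, assuming the pool tracks its workers in a Set (the real pool also requeues the failed worker's task):
```typescript
import { Worker } from 'node:worker_threads';

const workers = new Set<Worker>();

// Spawn a worker and automatically replace it if it exits with a non-zero code (crash).
function spawnWorker(scriptPath: string): Worker {
  const worker = new Worker(scriptPath);
  workers.add(worker);
  worker.on('exit', (code) => {
    workers.delete(worker);
    if (code !== 0) {
      spawnWorker(scriptPath); // keep the pool at its minimum size
    }
  });
  return worker;
}
```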
---
## Monitoring and Debugging
### Routing Statistics
```typescript
import { getRoutingStats } from './acceleration-router.js';
const stats = getRoutingStats();
console.log('Routing Statistics:', {
totalOps: stats.totalOps,
accelerationRate: stats.accelerationRate,
breakdown: {
mathjs: stats.mathjsUsage,
wasm: stats.wasmUsage,
workers: stats.workersUsage,
gpu: stats.gpuUsage,
},
});
```
### Worker Pool Statistics
```typescript
const poolStats = workerPool.getStats();
console.log('Worker Pool Statistics:', {
totalWorkers: poolStats.totalWorkers,
idleWorkers: poolStats.idleWorkers,
busyWorkers: poolStats.busyWorkers,
queueSize: poolStats.queueSize,
tasksCompleted: poolStats.tasksCompleted,
tasksFailed: poolStats.tasksFailed,
avgExecutionTime: poolStats.avgExecutionTime,
uptime: poolStats.uptime,
});
```
---
## Future Enhancements
### Phase 4: SIMD Optimization (v3.1)
- Enable WASM SIMD for 2-4x additional speedup
- Requires Node.js with WASM SIMD support
### Phase 5: Advanced WASM Operations (v3.2)
- Matrix inverse (Gauss-Jordan)
- LU decomposition
- QR decomposition
- Eigenvalue computation
### Phase 6: Browser/Deno Support (v4.0)
- Enable WebGPU in browser environments
- Deno runtime support
- SharedArrayBuffer for zero-copy workers
### Phase 7: Rust + WASM (v5.0)
- Rewrite WASM modules in Rust
- Better performance than AssemblyScript
- Smaller bundle sizes
---
## Migration Guide
### From v2.x to v3.0
**No breaking changes!** The API remains backward compatible.
**Old code (still works):**
```typescript
import * as wasmWrapper from './wasm-wrapper.js';
const result = await handleMatrixOperations(args, wasmWrapper);
```
**New code (recommended):**
```typescript
import { accelerationAdapter } from './acceleration-adapter.js';
const result = await handleMatrixOperations(args, accelerationAdapter);
```
**Benefits:**
- Automatic multi-tier acceleration
- Better performance for large operations
- No code changes required for small operations
---
## Troubleshooting
### Issue: Workers not being used
**Cause:** Data size below worker threshold
**Solution:** This is expected behavior; workers only activate for large operations (100×100+ matrices, 100k+ element arrays)
### Issue: Performance regression for small operations
**Cause:** Worker/GPU overhead
**Solution:** Adjust thresholds or disable acceleration for specific sizes
### Issue: Worker pool initialization fails
**Cause:** Environment doesn't support worker_threads
**Solution:** Automatic fallback to WASM/mathjs (no action needed)
### Issue: Out of memory errors
**Cause:** Too many concurrent workers or large data
**Solution:** Reduce maxWorkers or implement data streaming
---
## Related Documentation
- [Build Guide](./BUILD_GUIDE.md)
- [Testing Guide](./TEST_GUIDE.md)
- [Refactoring Plan](../REFACTORING_PLAN.md)
- [Product Specification](./PRODUCT_SPECIFICATION.md)
---
**Document Version:** 1.0
**Last Updated:** November 19, 2025
**Author:** Claude Code
**Status:** Production Ready