
PT-MCP (Paul Test Man Context Protocol)

by mdz-axo
PT-MCP_IMPLEMENTATION_PLAN.md (20.4 kB)
# PT-MCP Implementation Plan

## YAGO 4.5 & Schema.org Integration

> **"Where am I now?"** - Based on proven patterns from the Ludwig neurosymbolic system

## Executive Summary

PT-MCP will integrate YAGO 4.5 and Schema.org using **battle-tested patterns** discovered in the Ludwig system (`/home/mdz-axolotl/ClaudeCode/Ludwig/`). Ludwig provides production-ready code for:

- YAGO entity resolution with SPARQL
- Schema.org property mapping
- RDF triple storage and querying
- Confidence-based auto-linking
- Semantic enrichment workflows

## Architecture: Three-Layer Semantic Stack

```
┌─────────────────────────────────────────────────────┐
│  PT-MCP Server (Model Context Protocol)             │
├─────────────────────────────────────────────────────┤
│  Layer 1: Code Analysis (✅ Implemented)            │
│  - File structure & language detection              │
│  - Entry point identification                       │
│  - Package dependency analysis                      │
└─────────────────────────────────────────────────────┘
                         ↓
┌─────────────────────────────────────────────────────┐
│  Layer 2: Semantic Enrichment (🚧 To Implement)     │
│  ┌─────────────────────┬───────────────────────┐    │
│  │ YAGO Resolver       │ Schema.org Mapper     │    │
│  │ - Entity linking    │ - Type classification │    │
│  │ - Fact retrieval    │ - Property extraction │    │
│  │ - SPARQL queries    │ - JSON-LD generation  │    │
│  └─────────────────────┴───────────────────────┘    │
└─────────────────────────────────────────────────────┘
                         ↓
┌─────────────────────────────────────────────────────┐
│  Layer 3: Knowledge Graph (🚧 To Implement)         │
│  - Triple store (SQLite)                            │
│  - RDF relationships                                │
│  - Ontology cache                                   │
│  - Query interface                                  │
└─────────────────────────────────────────────────────┘
```

## Phase 1: Foundation (Week 1-2)

### 1.1 Dependencies

**Add to `package.json`**:

```json
{
  "dependencies": {
    "rdflib": "^2.2.34",
    "n3": "^1.17.2",
    "sparqljs": "^3.7.1",
    "sparql-http-client": "^2.4.1",
    "jsonld": "^8.3.1",
    "better-sqlite3": "^9.2.2"
  }
}
```

### 1.2 Database Schema

**Create: `src/database/schema.sql`**

```sql
-- Core entities from codebase analysis
CREATE TABLE entities (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT NOT NULL,
  type TEXT NOT NULL,  -- 'package', 'class', 'function', 'framework', etc.
  source_file TEXT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  metadata JSON
);

CREATE INDEX idx_entities_name ON entities(name);
CREATE INDEX idx_entities_type ON entities(type);

-- YAGO mappings
CREATE TABLE yago_mappings (
  entity_id INTEGER PRIMARY KEY,
  yago_uri TEXT NOT NULL UNIQUE,
  yago_type TEXT,  -- Schema.org type from YAGO
  confidence REAL CHECK(confidence >= 0 AND confidence <= 1),
  facts JSON,  -- YAGO facts as key-value pairs
  cached_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (entity_id) REFERENCES entities(id)
);

CREATE INDEX idx_yago_uri ON yago_mappings(yago_uri);

-- Schema.org annotations
CREATE TABLE schema_annotations (
  entity_id INTEGER PRIMARY KEY,
  schema_type TEXT NOT NULL,  -- e.g., 'WebApplication', 'SoftwareLibrary'
  properties JSON,  -- Schema.org properties
  context_url TEXT DEFAULT 'https://schema.org',
  FOREIGN KEY (entity_id) REFERENCES entities(id)
);

-- Ontology cache (from YAGO taxonomy)
CREATE TABLE ontology_classes (
  class_uri TEXT PRIMARY KEY,
  label TEXT,
  description TEXT,
  parent_class TEXT,
  source TEXT DEFAULT 'yago',  -- 'yago' or 'schema.org'
  cached_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- SPARQL query cache
CREATE TABLE sparql_cache (
  query_hash TEXT PRIMARY KEY,
  query TEXT NOT NULL,
  result JSON NOT NULL,
  cached_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  expires_at TIMESTAMP
);

CREATE INDEX idx_sparql_expires ON sparql_cache(expires_at);
```

### 1.3 Directory Structure

```
src/
├── database/
│   ├── schema.sql              # Database schema
│   ├── connection.ts           # SQLite connection manager
│   └── migrations.ts           # Schema migrations
├── services/
│   ├── yago-resolver.ts        # YAGO entity resolution
│   ├── schema-mapper.ts        # Schema.org mapping
│   ├── sparql-client.ts        # SPARQL query execution
│   ├── rdf-parser.ts           # RDF/Turtle parsing
│   └── ontology-cache.ts       # Taxonomy caching
├── tools/
│   ├── enrich-context.ts       # NEW: Context enrichment tool
│   └── query-knowledge.ts      # NEW: Knowledge graph queries
└── types/
    ├── yago.ts                 # YAGO types
    └── schema-org.ts           # Schema.org types
```

## Phase 2: YAGO Integration (Week 3-4)

### 2.1 YAGO Resolver Service

**Create: `src/services/yago-resolver.ts`**

```typescript
import { SPARQLClient } from './sparql-client.js';
// connection.ts lives in src/database/, one level up from src/services/
import { Database } from '../database/connection.js';

export interface YAGOEntity {
  uri: string;
  label: string;
  type: string;  // Schema.org type
  description?: string;
  facts: Record<string, string[]>;
}

export class YAGOResolver {
  private sparql: SPARQLClient;
  private db: Database;
  private cacheTTL = 30 * 24 * 60 * 60 * 1000; // 30 days

  constructor(db: Database) {
    this.db = db;
    this.sparql = new SPARQLClient({
      endpoint: 'https://yago-knowledge.org/sparql',
      fallback: 'https://query.wikidata.org/sparql'
    });
  }

  /**
   * Resolve entity name to YAGO URI(s)
   * Based on Ludwig's yago_client.py
   */
  async resolveEntity(name: string): Promise<YAGOEntity[]> {
    // Check cache first
    const cached = this.getCachedMapping(name);
    if (cached) return cached;

    // Escape backslashes and quotes so the name cannot break out of the literal
    const safeName = name.replace(/\\/g, '\\\\').replace(/"/g, '\\"');

    // SPARQL query pattern from Ludwig
    const query = `
      PREFIX schema: <http://schema.org/>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

      SELECT ?entity ?label ?type ?description WHERE {
        ?entity rdfs:label "${safeName}"@en .
        ?entity a ?type .
        OPTIONAL { ?entity schema:description ?description }
        FILTER(STRSTARTS(STR(?type), "http://schema.org/"))
      }
      LIMIT 10
    `;

    const results = await this.sparql.query(query);
    const entities = results.map(r => this.parseEntity(r));

    // Cache results
    this.cacheEntities(name, entities);

    return entities;
  }

  /**
   * Get entity facts (properties and values)
   * Based on Ludwig's get_entity_facts()
   */
  async getEntityFacts(entityUri: string): Promise<Record<string, string[]>> {
    const query = `
      PREFIX schema: <http://schema.org/>

      SELECT ?property ?value WHERE {
        <${entityUri}> ?property ?value .
        FILTER(STRSTARTS(STR(?property), "http://schema.org/"))
      }
    `;

    const results = await this.sparql.query(query);

    // Group by property (keys are full property URIs)
    const facts: Record<string, string[]> = {};
    for (const row of results) {
      const prop = row.property;
      const value = row.value;
      if (!facts[prop]) facts[prop] = [];
      facts[prop].push(value);
    }

    return facts;
  }

  /**
   * Link entity with confidence scoring
   * Confidence: 1.0 = exact match, 0.7 = partial match, 0.5 = no textual match
   */
  async linkEntity(entityId: number, name: string): Promise<void> {
    const candidates = await this.resolveEntity(name);

    for (const candidate of candidates) {
      const confidence = this.calculateConfidence(name, candidate);

      if (confidence >= 0.9) {
        // Auto-link high confidence
        await this.db.execute(`
          INSERT INTO yago_mappings (entity_id, yago_uri, yago_type, confidence, facts)
          VALUES (?, ?, ?, ?, ?)
          ON CONFLICT(entity_id) DO UPDATE SET
            yago_uri = excluded.yago_uri,
            yago_type = excluded.yago_type,
            confidence = excluded.confidence,
            facts = excluded.facts,
            cached_at = CURRENT_TIMESTAMP
        `, [entityId, candidate.uri, candidate.type, confidence, JSON.stringify(candidate.facts)]);

        break; // Take first high-confidence match
      }
    }
  }

  private calculateConfidence(name: string, entity: YAGOEntity): number {
    const nameLower = name.toLowerCase();
    const labelLower = entity.label.toLowerCase();

    if (nameLower === labelLower) return 1.0;
    if (labelLower.includes(nameLower) || nameLower.includes(labelLower)) return 0.7;
    return 0.5;
  }

  private getCachedMapping(name: string): YAGOEntity[] | null {
    // Check cache with TTL (the 30-day window matches cacheTTL)
    const result = this.db.query(`
      SELECT * FROM yago_mappings ym
      JOIN entities e ON e.id = ym.entity_id
      WHERE e.name = ?
        AND ym.cached_at > datetime('now', '-30 days')
    `, [name]);

    if (result.length > 0) {
      return result.map(r => ({
        uri: r.yago_uri,
        label: r.name,
        type: r.yago_type,
        facts: JSON.parse(r.facts)
      }));
    }

    return null;
  }

  private cacheEntities(name: string, entities: YAGOEntity[]): void {
    // Implementation of cache storage
  }

  private parseEntity(row: any): YAGOEntity {
    return {
      uri: row.entity,
      label: row.label,
      type: row.type,
      description: row.description,
      facts: {}  // Facts are loaded separately via getEntityFacts()
    };
  }
}
```

### 2.2 SPARQL Client

**Create: `src/services/sparql-client.ts`**

```typescript
// Uses the global fetch and AbortSignal.timeout available in Node.js 18+,
// so no extra HTTP dependency is needed in package.json.

export interface SPARQLClientConfig {
  endpoint: string;
  fallback?: string;
  timeout?: number;  // milliseconds
}

export class SPARQLClient {
  private config: SPARQLClientConfig;

  constructor(config: SPARQLClientConfig) {
    this.config = {
      timeout: 30000,
      ...config
    };
  }

  async query(sparql: string): Promise<any[]> {
    try {
      return await this.executeQuery(this.config.endpoint, sparql);
    } catch (error) {
      if (this.config.fallback) {
        // A caught value is `unknown` in strict TypeScript, so narrow it first
        const message = error instanceof Error ? error.message : String(error);
        console.warn(`Primary endpoint failed, trying fallback: ${message}`);
        return await this.executeQuery(this.config.fallback, sparql);
      }
      throw error;
    }
  }

  private async executeQuery(endpoint: string, sparql: string): Promise<any[]> {
    const params = new URLSearchParams({
      query: sparql,
      format: 'json'
    });

    const response = await fetch(`${endpoint}?${params}`, {
      method: 'GET',
      headers: {
        'Accept': 'application/sparql-results+json'
      },
      signal: AbortSignal.timeout(this.config.timeout ?? 30000)
    });

    if (!response.ok) {
      throw new Error(`SPARQL query failed: ${response.statusText}`);
    }

    const data = await response.json();

    // Flatten each binding ({ var: { type, value } }) to { var: value }
    return data.results.bindings.map((b: any) => {
      const row: any = {};
      for (const [key, value] of Object.entries(b)) {
        row[key] = (value as any).value;
      }
      return row;
    });
  }
}
```

## Phase 3: Schema.org Integration (Week 5-6)

### 3.1 Schema.org Mapper

**Create: `src/services/schema-mapper.ts`**

```typescript
/**
 * Maps codebase entities to Schema.org types
 * Based on Ludwig's schema_mapper.py
 */

export const CODEBASE_TO_SCHEMA: Record<string, string> = {
  // Application types
  'web-app': 'schema:WebApplication',
  'mobile-app': 'schema:MobileApplication',
  'api': 'schema:WebAPI',
  'library': 'schema:SoftwareLibrary',
  'package': 'schema:SoftwareLibrary',
  'framework': 'schema:SoftwareApplication',

  // Document types
  'documentation': 'schema:TechArticle',
  'tutorial': 'schema:HowTo',
  'readme': 'schema:TechArticle',
  'guide': 'schema:HowTo',

  // Code elements
  'source-file': 'schema:SoftwareSourceCode',
  'test-suite': 'schema:SoftwareTest',
  'database': 'schema:Dataset',
};

export const PROPERTY_MAPPINGS: Record<string, string> = {
  'dependencies': 'schema:softwareRequirements',
  'version': 'schema:softwareVersion',
  'authors': 'schema:author',
  'maintainers': 'schema:maintainer',
  'license': 'schema:license',
  'description': 'schema:description',
  'url': 'schema:url',
  'repository': 'schema:codeRepository',
  'programmingLanguage': 'schema:programmingLanguage',
  'runtimePlatform': 'schema:runtimePlatform',
};

export class SchemaMapper {
  /**
   * Map entity to Schema.org type
   */
  mapType(entityType: string): string {
    return CODEBASE_TO_SCHEMA[entityType] || 'schema:SoftwareApplication';
  }

  /**
   * Generate Schema.org JSON-LD annotation
   */
  generateAnnotation(entity: any, analysis: any): object {
    const schemaType = this.mapType(entity.type);

    const annotation: any = {
      '@context': 'https://schema.org',
      '@type': schemaType.replace('schema:', ''),
      'name': entity.name,
    };

    // Map properties
    if (analysis.packageInfo) {
      const pkg = analysis.packageInfo;

      if (pkg.version) annotation['softwareVersion'] = pkg.version;
      if (pkg.description) annotation['description'] = pkg.description;
      if (pkg.license) annotation['license'] = pkg.license;

      // Dependencies as software requirements
      if (pkg.dependencies) {
        annotation['softwareRequirements'] = Object.keys(pkg.dependencies);
      }
    }

    // Programming languages from analysis
    if (analysis.languages) {
      annotation['programmingLanguage'] = Object.keys(analysis.languages);
    }

    return annotation;
  }

  /**
   * Bidirectional property mapping
   */
  toPTMCP(schemaProperty: string): string {
    for (const [key, value] of Object.entries(PROPERTY_MAPPINGS)) {
      if (value === schemaProperty) return key;
    }
    return schemaProperty;
  }

  toSchema(ptmcpProperty: string): string {
    return PROPERTY_MAPPINGS[ptmcpProperty] || ptmcpProperty;
  }
}
```

## Phase 4: New MCP Tools (Week 7-8)

### 4.1 Enrich Context Tool

**Create: `src/tools/enrich-context.ts`**

```typescript
import { YAGOResolver } from '../services/yago-resolver.js';
import { SchemaMapper } from '../services/schema-mapper.js';
import { Database } from '../database/connection.js';

interface EnrichContextArgs {
  path: string;
  analysis_result?: any;
  enrichment_level?: 'minimal' | 'standard' | 'comprehensive';
  include_yago?: boolean;
  include_schema?: boolean;
}

export async function enrichContext(args: EnrichContextArgs) {
  const {
    path,
    analysis_result,
    enrichment_level = 'standard',
    include_yago = true,
    include_schema = true
  } = args;

  // Get codebase analysis if not provided
  let analysis = analysis_result;
  if (!analysis) {
    const analyzeCodebase = await import('./analyze-codebase.js');
    const result = await analyzeCodebase.analyzeCodebase({ path });
    analysis = JSON.parse(result.content[0].text);
  }

  const db = new Database();
  const yagoResolver = new YAGOResolver(db);
  const schemaMapper = new SchemaMapper();

  // Extract entities from analysis
  const entities = extractEntities(analysis);

  // YAGO enrichment
  const yagoEnrichment = [];
  if (include_yago) {
    for (const entity of entities) {
      const yagoEntities = await yagoResolver.resolveEntity(entity.name);

      for (const yagoEntity of yagoEntities) {
        const facts = await yagoResolver.getEntityFacts(yagoEntity.uri);
        yagoEnrichment.push({
          entity: entity.name,
          yago_uri: yagoEntity.uri,
          type: yagoEntity.type,
          facts: facts
        });
      }
    }
  }

  // Schema.org annotation
  let schemaAnnotation = null;
  if (include_schema) {
    schemaAnnotation = schemaMapper.generateAnnotation(
      { name: analysis.packageInfo?.name || 'Unknown', type: 'web-app' },
      analysis
    );
  }

  return {
    content: [{
      type: 'text',
      text: JSON.stringify({
        codebase_context: analysis,
        knowledge_graph: {
          yago_entities: yagoEnrichment,
          schema_annotations: schemaAnnotation
        },
        enrichment_level,
        timestamp: new Date().toISOString()
      }, null, 2)
    }]
  };
}

function extractEntities(analysis: any): Array<{name: string, type: string}> {
  const entities: Array<{name: string, type: string}> = [];

  // Extract from package dependencies
  if (analysis.packageInfo?.dependencies) {
    for (const dep of Object.keys(analysis.packageInfo.dependencies)) {
      entities.push({ name: dep, type: 'library' });
    }
  }

  // Extract from detected languages
  if (analysis.languages) {
    for (const lang of Object.keys(analysis.languages)) {
      entities.push({ name: lang, type: 'programming-language' });
    }
  }

  return entities;
}
```

### 4.2 Register New Tool

**Update: `src/index.ts`**

Add to the `ListToolsRequestSchema` handler:

```typescript
{
  name: "enrich_context",
  description: "Enrich codebase context with YAGO knowledge graph and Schema.org annotations",
  inputSchema: {
    type: "object",
    properties: {
      path: {
        type: "string",
        description: "Root directory path"
      },
      analysis_result: {
        type: "object",
        description: "Previous analysis result (optional)"
      },
      enrichment_level: {
        type: "string",
        enum: ["minimal", "standard", "comprehensive"],
        description: "Level of semantic enrichment",
        default: "standard"
      },
      include_yago: {
        type: "boolean",
        description: "Include YAGO entities",
        default: true
      },
      include_schema: {
        type: "boolean",
        description: "Include Schema.org annotations",
        default: true
      }
    },
    required: ["path"]
  }
}
```

## Success Metrics

1. **Entity Resolution**: >90% of common packages/frameworks linked to YAGO
2. **Query Performance**: <2s for SPARQL queries (with caching)
3. **Cache Hit Rate**: >80% for repeated entities
4. **Schema Coverage**: Support 20+ codebase types initially
5. **Fact Accuracy**: >95% of YAGO facts are relevant to context

## Testing Strategy

```typescript
// Example test (`db` is a Database instance created in test setup)
describe('YAGOResolver', () => {
  it('should resolve React to YAGO entity', async () => {
    const resolver = new YAGOResolver(db);
    const entities = await resolver.resolveEntity('React');

    expect(entities.length).toBeGreaterThan(0);
    expect(entities[0].type).toContain('JavaScriptLibrary');

    // resolveEntity() returns empty facts, so fetch them explicitly;
    // getEntityFacts() keys its result by full property URI
    const facts = await resolver.getEntityFacts(entities[0].uri);
    expect(facts['http://schema.org/programmingLanguage']).toContain('JavaScript');
  });
});
```

## Next Steps

1. ✅ Review Ludwig code patterns
2. 📋 Implement SPARQL client with fallback
3. 📋 Create YAGO resolver with caching
4. 📋 Build Schema.org mapper
5. 📋 Add enrich_context MCP tool
6. 📋 Write comprehensive tests
7. 📋 Optimize query performance
8. 📋 Document usage examples

---

**Status**: Ready for implementation
**Estimated Time**: 8 weeks
**Risk Level**: Low (proven patterns from Ludwig)
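
The `sparql_cache` table in section 1.2 keys results by `query_hash` and expires them via `expires_at`, but the plan does not show how those values are produced. The following is a minimal sketch of one way to do it, not the planned implementation: `SparqlResultCache`, `hashQuery`, and the one-day default TTL are hypothetical names and values, and an in-memory `Map` stands in for the SQLite table so the example runs on its own.

```typescript
import { createHash } from "node:crypto";

// Hypothetical row shape mirroring the sparql_cache columns from section 1.2.
interface CachedQuery {
  query: string;
  result: unknown[];
  expiresAt: number; // epoch milliseconds, stands in for expires_at
}

// In-memory stand-in for the sparql_cache table (assumption: a real
// implementation would read and write SQLite rows instead of a Map).
class SparqlResultCache {
  private rows = new Map<string, CachedQuery>();
  private ttlMs: number;

  constructor(ttlMs: number = 24 * 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  // SHA-256 of the trimmed query text, hex-encoded (stands in for query_hash).
  static hashQuery(query: string): string {
    return createHash("sha256").update(query.trim()).digest("hex");
  }

  // Return the cached result, or null on a miss or an expired entry.
  get(query: string, now: number = Date.now()): unknown[] | null {
    const key = SparqlResultCache.hashQuery(query);
    const row = this.rows.get(key);
    if (!row) return null;
    if (row.expiresAt <= now) {
      this.rows.delete(key); // drop expired entries lazily
      return null;
    }
    return row.result;
  }

  // Store a result with an expiry of now + TTL.
  set(query: string, result: unknown[], now: number = Date.now()): void {
    const key = SparqlResultCache.hashQuery(query);
    this.rows.set(key, { query, result, expiresAt: now + this.ttlMs });
  }
}
```

A client could call `get()` before sending a query and `set()` after a successful response. Passing `now` explicitly keeps expiry behavior easy to test without waiting on the clock.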

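
The confidence rule used by `linkEntity` (1.0 for an exact case-insensitive label match, 0.7 for a substring match, 0.5 otherwise) means that, at the 0.9 auto-link threshold, only exact matches are ever linked. The sketch below restates that rule as standalone functions to make the effect of the threshold visible. `scoreLabel`, `pickBestCandidate`, and the `Candidate` type are hypothetical names introduced for illustration, and the URIs are placeholders.

```typescript
// Hypothetical minimal candidate shape (a subset of YAGOEntity).
interface Candidate {
  uri: string;
  label: string;
}

interface ScoredCandidate {
  candidate: Candidate;
  confidence: number;
}

// Same scoring rule as calculateConfidence() in the resolver above.
function scoreLabel(name: string, label: string): number {
  const nameLower = name.toLowerCase();
  const labelLower = label.toLowerCase();
  if (nameLower === labelLower) return 1.0;
  if (labelLower.includes(nameLower) || nameLower.includes(labelLower)) return 0.7;
  return 0.5;
}

// Return the highest-scoring candidate at or above the threshold, or null.
// Ties keep the earliest candidate, matching the "take first match" behavior.
function pickBestCandidate(
  name: string,
  candidates: Candidate[],
  threshold: number = 0.9
): ScoredCandidate | null {
  let best: ScoredCandidate | null = null;
  for (const candidate of candidates) {
    const confidence = scoreLabel(name, candidate.label);
    if (confidence >= threshold && (best === null || confidence > best.confidence)) {
      best = { candidate, confidence };
    }
  }
  return best;
}

// Placeholder candidates: one partial match, one exact match.
const candidates: Candidate[] = [
  { uri: "example:React_Native", label: "React Native" },
  { uri: "example:React", label: "React" },
];

const strict = pickBestCandidate("react", candidates);      // exact match only
const loose = pickBestCandidate("react", candidates, 0.7);  // partial matches allowed
console.log(strict?.candidate.uri, loose?.candidate.uri);
```

Lowering the threshold to 0.7 admits substring matches such as "React Native" for "react", which is why a review step for medium-confidence candidates may be worth considering before auto-linking them.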