Uses PostgreSQL with pgvector extension for vector indexing, semantic similarity search, and full-text search with BM25 ranking of Salesforce metadata.
Provides advanced RAG capabilities for Salesforce metadata and code, enabling retrieval of Apex classes, triggers, validation rules, custom objects, profiles, permission sets, layouts, flows, and object schemas through intelligent chunking and vector search.
Salesforce Metadata-Aware RAG MCP
A Model Context Protocol (MCP) server that provides advanced RAG capabilities for Salesforce metadata and code, enabling AI copilots to understand your Salesforce org configuration through intelligent chunking and vector search.
Features
Core Salesforce Integration
Metadata API Integration: Access layouts, flows, custom objects, profiles, and permission sets
Tooling API Integration: Retrieve Apex classes, triggers, and validation rules
REST API Integration: Object schema descriptions and SOQL execution
Rate Limiting: Built-in API quota management and retry logic
Incremental Sync: Efficient updates for large orgs
Advanced RAG Capabilities
Intelligent Chunking: Metadata-aware chunking system that splits Apex classes by methods, objects by fields, etc.
Vector Indexing: PostgreSQL + pgvector for semantic similarity search
Keyword Search: Full-text search with BM25 ranking
Symbol Search: Exact matching for Salesforce objects, fields, and code symbols
Hybrid Search: Combined vector + keyword search with intelligent reranking
MCP Integration
Direct Claude Code Integration: Real-time Salesforce org exploration
Structured Metadata Access: Type-aware retrieval and processing
Symbol Extraction: Automatic discovery of relationships and dependencies
Available MCP Tools
sf_metadata_list - List metadata components of specified types
sf_tooling_getApexClasses - Retrieve all Apex classes from the org
sf_describe_object - Describe a Salesforce object schema
rag_status - Get system status and API usage stats
Development Setup
Prerequisites
Node.js 18+ and npm
Docker and Docker Compose (for PostgreSQL + pgvector)
Salesforce org access (sandbox recommended for testing)
Connected App or Username/Password authentication
Python 3.8+ with sentence-transformers (optional, for production embeddings)
Installation
Clone and install dependencies:
Configure Salesforce credentials in
Start PostgreSQL with pgvector:
Build the project:
Running the Server
MCP Server (for Claude Code):
Vector Integration Testing:
Type checking:
MCP Integration
To integrate with Claude Desktop or Claude Code, add this configuration to your MCP settings:
For Claude Desktop (add to claude_desktop_config.json):
For Claude Code (add to .mcp.json in your workspace):
Adding to Claude Code MCP:
Create or update in your workspace root:
Alternative: Use Claude Code MCP command:
After setting up:
Restart Claude Code/Desktop to reload MCP configuration
Test MCP tools:
sf_describe_object with {"objectName": "Account"}
sf_metadata_list with {"types": ["ApexClass", "Layout"]}
rag_status to check system health
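For reference, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. Claude Code and Claude Desktop build this message for you, so the sketch below only illustrates the shape of the test calls listed above. The `buildToolCall` helper and the request id are hypothetical and not part of this project.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Hypothetical helper: wraps a tool name and its arguments in a request object.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// The sf_describe_object test call from the list above.
const call = buildToolCall(1, "sf_describe_object", { objectName: "Account" });
```

Serializing `call` with `JSON.stringify` gives the message a client would send over the MCP transport.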
Project Structure
Environment Variables
Required for Salesforce connectivity:
SF_USERNAME - Salesforce username
SF_PASSWORD - Salesforce password
SF_SECURITY_TOKEN - Salesforce security token
SF_LOGIN_URL - Login URL (https://login.salesforce.com or https://test.salesforce.com)
Optional configuration:
NODE_ENV - Environment (development/production)
LOG_LEVEL - Logging level (debug/info/warn/error)
PORT - Server port (default: 3000)
DB_HOST - PostgreSQL host (default: localhost)
DB_PORT - PostgreSQL port (default: 5433)
DB_NAME - Database name (default: sfdxrag)
DB_USER - Database user (default: postgres)
DB_PASSWORD - Database password (default: postgres)
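As an illustration of how these variables and defaults fit together, here is a minimal sketch of a configuration loader. It is not the project's actual loader: the `DbConfig` shape and the function names `loadDbConfig` and `missingSalesforceVars` are assumptions. Only the variable names and default values come from the lists above.

```typescript
// Environment passed in as a plain object (for example process.env in Node.js).
type Env = Record<string, string | undefined>;

interface DbConfig {
  host: string;
  port: number;
  database: string;
  user: string;
  password: string;
}

// Variables documented as required for Salesforce connectivity.
const REQUIRED_SF_VARS = [
  "SF_USERNAME",
  "SF_PASSWORD",
  "SF_SECURITY_TOKEN",
  "SF_LOGIN_URL",
];

// Returns the names of required Salesforce variables that are unset or empty.
function missingSalesforceVars(env: Env): string[] {
  return REQUIRED_SF_VARS.filter((key) => !env[key]);
}

// Applies the documented defaults for the optional database settings.
function loadDbConfig(env: Env): DbConfig {
  const port = Number(env["DB_PORT"] ?? "5433");
  if (isNaN(port) || port <= 0) {
    throw new Error("DB_PORT must be a positive number");
  }
  return {
    host: env["DB_HOST"] ?? "localhost",
    port,
    database: env["DB_NAME"] ?? "sfdxrag",
    user: env["DB_USER"] ?? "postgres",
    password: env["DB_PASSWORD"] ?? "postgres",
  };
}

// Example: only DB_HOST is overridden, everything else falls back to defaults.
const config = loadDbConfig({ DB_HOST: "db.internal" });
const missing = missingSalesforceVars({ SF_USERNAME: "user@example.com" });
```

Checking `missing` at startup lets the server fail early with a clear message instead of failing on the first Salesforce API call.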
Architecture
Data Flow
Metadata Extraction: Retrieve Salesforce metadata via API clients
Intelligent Chunking: Process metadata using type-specific chunkers
Vector Indexing: Generate embeddings and store in PostgreSQL + pgvector
Search & Retrieval: Multi-modal search (vector + keyword + symbol)
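The indexing and retrieval steps can be sketched as a parameterized pgvector query. This is a minimal sketch rather than the project's actual query: the table name `chunks`, its columns `id`, `content`, and `embedding`, and both helper names are assumptions made for illustration. It relies on pgvector's `<=>` cosine distance operator and on the `[x,y,z]` text form of a vector.

```typescript
// Query shape accepted by node-postgres style clients: SQL text plus values.
interface SqlQuery {
  text: string;
  values: Array<string | number>;
}

// pgvector accepts vectors in the text form "[0.1,0.2,0.3]".
function toVectorLiteral(embedding: number[]): string {
  return "[" + embedding.join(",") + "]";
}

// Builds a nearest-neighbour query ordered by cosine distance (<=>).
// Table and column names are hypothetical, not the project's schema.
function buildSimilarityQuery(embedding: number[], limit: number): SqlQuery {
  return {
    text:
      "SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity " +
      "FROM chunks ORDER BY embedding <=> $1::vector LIMIT $2",
    values: [toVectorLiteral(embedding), limit],
  };
}

// Example: fetch the five chunks closest to a (toy) three-dimensional embedding.
const query = buildSimilarityQuery([0.1, 0.2, 0.3], 5);
```

Passing the embedding as a parameter and casting it with `::vector` keeps the SQL text constant, so the same statement can be reused for every search.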
Chunking System
The system includes specialized chunkers for different metadata types:
ApexChunker: Splits classes by methods, preserving signatures and docblocks
CustomObjectChunker: Splits objects by fields, validation rules, and metadata
GenericChunker: Fallback for unsupported types
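To make method-level chunking concrete, here is a simplified sketch of the idea behind splitting an Apex class. It is not the project's ApexChunker: the `Chunk` shape and the `chunkApexClass` function are hypothetical, and the brace counting assumes no braces or semicolons inside string literals or comments, which a real parser would have to handle.

```typescript
interface Chunk {
  kind: "class" | "method";
  name: string;
  text: string;
}

// Splits an Apex class into one chunk for the class declaration and one chunk
// per method. A docblock directly above a method stays in that method's chunk.
function chunkApexClass(source: string): Chunk[] {
  const open = source.indexOf("{");
  if (open === -1) {
    return [{ kind: "class", name: "unknown", text: source.trim() }];
  }
  const header = source.slice(0, open).trim();
  const classMatch = /class\s+(\w+)/.exec(header);
  const chunks: Chunk[] = [
    { kind: "class", name: classMatch ? classMatch[1] : "unknown", text: header },
  ];

  let depth = 0;
  let memberStart = open + 1;
  for (let i = open; i < source.length; i++) {
    const ch = source[i];
    if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      depth--;
      if (depth === 1) {
        // A block just closed at class-body level: a method or a property.
        const text = source.slice(memberStart, i + 1).trim();
        const signature = /(\w+)\s*\([^)]*\)\s*\{/.exec(text);
        if (signature) {
          chunks.push({ kind: "method", name: signature[1], text });
        }
        memberStart = i + 1;
      }
    } else if (ch === ";" && depth === 1) {
      // Field declaration at class-body level: skip it in this sketch.
      memberStart = i + 1;
    }
  }
  return chunks;
}

const sample = [
  "public class AccountService {",
  "    private Integer count;",
  "    /** Doc */",
  "    public void run(Id accountId) {",
  "        if (accountId != null) { count++; }",
  "    }",
  "    public Integer getCount() { return count; }",
  "}",
].join("\n");

const chunks = chunkApexClass(sample);
```

For the sample class, `chunks` holds the class declaration followed by the `run` and `getCount` methods, with the docblock kept inside the `run` chunk.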
Vector Search
Vector Search: Semantic similarity using sentence transformers
Keyword Search: Full-text search with BM25 ranking
Symbol Search: Exact matching for Salesforce symbols (objects, fields, classes)
Hybrid Search: Combined search with intelligent reranking (70% vector, 30% keyword)
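The 70/30 weighting can be illustrated with a small reranking sketch. Only the two weights come from the description above. The min-max normalization step, the `ScoredChunk` shape, and the function names are assumptions, since vector similarities and BM25 scores live on different scales and the project's actual normalization is not documented here.

```typescript
interface ScoredChunk {
  id: string;
  score: number;
}

// Min-max normalizes scores to [0, 1] so the two result lists are comparable.
function normalizeScores(results: ScoredChunk[]): Record<string, number> {
  const normalized: Record<string, number> = {};
  if (results.length === 0) {
    return normalized;
  }
  const scores = results.map((r) => r.score);
  const min = Math.min(...scores);
  const range = Math.max(...scores) - min;
  for (const r of results) {
    // When every score is equal, treat all results as equally relevant.
    normalized[r.id] = range === 0 ? 1 : (r.score - min) / range;
  }
  return normalized;
}

// Combines both result lists; a chunk missing from one list contributes 0 there.
function hybridRerank(
  vectorResults: ScoredChunk[],
  keywordResults: ScoredChunk[],
  vectorWeight = 0.7,
  keywordWeight = 0.3
): ScoredChunk[] {
  const v = normalizeScores(vectorResults);
  const k = normalizeScores(keywordResults);
  const ids = Object.keys(v).concat(Object.keys(k).filter((id) => !(id in v)));
  return ids
    .map((id) => ({
      id,
      score: vectorWeight * (v[id] ?? 0) + keywordWeight * (k[id] ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}

// Example with made-up scores: cosine similarities and BM25 scores.
const ranked = hybridRerank(
  [{ id: "a", score: 0.9 }, { id: "b", score: 0.5 }],
  [{ id: "b", score: 10 }, { id: "c", score: 2 }]
);
```

In this example the best vector match `a` ranks ahead of the best keyword match `b` because the vector side carries the larger weight.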
Testing
Current Test Coverage
✅ Chunking System: Apex classes split into method-level chunks with symbol extraction
✅ Vector Storage: PostgreSQL + pgvector integration with batch operations
✅ Search Functions: Vector, keyword, symbol, and hybrid search working
✅ MCP Integration: Live Salesforce data retrieval and processing
✅ Symbol Detection: Automatic discovery of custom objects and dependencies
Example Results
From Apex class analysis:
Method-level chunking with separate chunks for class declaration and each method
Symbol extraction working for custom objects, standard objects, and system calls
Search functionality verified across all modes: semantic, keyword, symbol, hybrid
Production Deployment
For production use:
Configure real embedding models using SentenceTransformerEmbedding
Set up persistent PostgreSQL instance with appropriate resource allocation
Configure proper authentication and security for multi-tenant access
Implement monitoring and performance optimization for large metadata volumes