Skip to main content
Glama

Captain Data MCP API

README.md9.51 kB
# Captain Data MCP API A Model Context Protocol (MCP) server for Captain Data tools, designed to work with ChatGPT and other AI assistants. This API provides a clean interface for LinkedIn data extraction and search capabilities through Captain Data's powerful scraping engine. ## Features - **MCP-compliant API** for seamless integration with AI assistants - **Dual authentication system** - API key headers or session tokens - **Comprehensive tool suite** for LinkedIn data extraction and search - **Production-ready** with proper error handling, logging, and monitoring - **TypeScript-first** with strict type checking and validation - **Fastify-powered** for high performance and extensibility - **Redis integration** for session management with in-memory fallback - **Interactive API documentation** with OpenAPI/Swagger - **Comprehensive testing** with Jest and integration tests ## Authentication The API supports two authentication methods: 1. **Direct API Key**: Set your Captain Data API key in the `X-API-Key` header 2. **Session Tokens**: Use the `/auth` endpoint to get a session token, then include it in the `Authorization: Bearer <token>` header ## Local Development ### Prerequisites - Node.js 16+ - npm or yarn - Captain Data API key - Redis (optional, falls back to in-memory storage) ### Setup 1. **Install dependencies:** ```bash npm install ``` 2. **Create environment configuration:** ```bash cp env.example .env # Edit .env with your Captain Data API key and other settings ``` 3. **Start the development server:** ```bash npm run dev ``` The server will be available at http://localhost:3000. ### Development Scripts ```bash # Start development server with hot reload npm run dev # Run tests npm test # Run tests with coverage npm run test:coverage # Type checking npm run type-check # Build for production npm run build # Generate OpenAPI documentation npm run generate:openapi npm run generate:openapi:gpt ``` ## Production Deployment ### Environment Variables | Variable | Required | Default | Description | |----------|----------|---------|-------------| | `CD_API_BASE` | ✅ | - | Captain Data API base URL | | `NODE_ENV` | ❌ | `development` | Environment (development/production/test) | | `PORT` | ❌ | `3000` | Server port | | `LOG_LEVEL` | ❌ | `info` | Logging level (error, warn, info, debug) | | `RATE_LIMIT_MAX` | ❌ | `100` | Maximum requests per time window | | `RATE_LIMIT_TIME_WINDOW` | ❌ | `1 minute` | Rate limit time window | | `API_TIMEOUT` | ❌ | `30000` | API request timeout in milliseconds | | `MAX_RETRIES` | ❌ | `2` | Maximum retry attempts for failed requests | | `RETRY_DELAY` | ❌ | `1000` | Delay between retries in milliseconds | | `REDIS_URL` | ❌ | - | Redis connection URL for session storage | | `SENTRY_DSN` | ❌ | - | Sentry DSN for error tracking | ### Production Checklist Before deploying to production, ensure: - [ ] Environment variables are properly configured - [ ] All tests pass: `npm test` - [ ] Type checking passes: `npm run type-check` - [ ] Rate limiting is configured appropriately - [ ] Logging level is set to `info` or `warn` - [ ] Health check endpoint is monitored - [ ] Error tracking is set up (Sentry recommended) - [ ] Redis is configured for session management (optional) ### Monitoring & Observability The API provides comprehensive monitoring capabilities: - **Health Check**: `GET /health` - Returns system status and uptime - **Request Logging**: All requests are logged with unique request IDs - **Error Tracking**: Structured error responses with detailed context - **Performance Metrics**: Response times and execution metrics - **Sentry Integration**: Automatic error reporting and performance monitoring ## API Endpoints ### Core Endpoints - `GET /health` - Health check and system status - `GET /introspect` - List available tools (basic mode) - `GET /introspect?v=full` - List all available tools (full mode) - `POST /auth` - Generate session token from API key - `POST /tools/:alias` - Execute a specific tool ### Documentation - `GET /docs` - Interactive API documentation (Swagger UI) - `GET /docs/json` - OpenAPI specification - `GET /openapi.json` - Raw OpenAPI specification - `GET /openapi.gpt.json` - GPT-compatible OpenAPI specification ## Available Tools ### LinkedIn Profile Tools - **`enrich_people`** - Extract detailed profile information from LinkedIn URLs - Perfect for lead research and candidate evaluation - Returns comprehensive profile data including experience, skills, and contact info - **`enrich_company`** - Get comprehensive company data from LinkedIn company pages - Ideal for market research and competitive analysis - Provides company details, employee count, industry, and more ### Sales Navigator Search Tools - **`search_people`** - Find prospects using Sales Navigator with advanced filtering - Search by location, industry, company size, and other criteria - Returns filtered lists of potential leads - **`search_companies`** - Discover target companies based on various criteria - Filter by industry, size, location, and other parameters - Perfect for B2B prospecting and market analysis - **`search_company_employees`** - Identify key contacts within specific organizations - Find employees by company URL or LinkedIn company ID - Great for account-based marketing and sales outreach ## Using with AI Assistants ### ChatGPT Integration 1. **Deploy to Vercel:** ```bash npm i -g vercel vercel login vercel ``` 2. **Configure in ChatGPT:** - Use the deployed URL: `https://your-project.vercel.app` - The API will be available for ChatGPT Actions ### Custom GPT Authentication Workaround **Important**: Custom GPTs have limitations with authorization headers. To work around this, the API supports authentication via query parameters: ``` POST /tools/enrich_people?session_token=your_session_token ``` **Note**: Query parameter authentication is provided as a workaround for Custom GPTs. For production applications, prefer header-based authentication for better security. ### MCP Protocol Support The API is fully compatible with the Model Context Protocol (MCP), providing: - Standardized tool introspection - Consistent error handling - Session-based authentication - Structured request/response formats ## Project Architecture ``` src/ ├── api/ # API layer │ ├── handlers/ # Request handlers │ │ ├── auth.ts # Authentication endpoint │ │ ├── health.ts # Health check │ │ ├── introspect.ts # Tool introspection │ │ └── tools.ts # Tool execution │ ├── routes.ts # Route registration │ └── schemas/ # Request/response schemas ├── lib/ # Core business logic │ ├── alias.ts # Tool alias mappings │ ├── auth.ts # Authentication logic │ ├── config.ts # Configuration management │ ├── error.ts # Error handling utilities │ ├── redis.ts # Redis service │ ├── schemas/ # Tool parameter schemas │ ├── translate.ts # API translation layer │ └── responseSchemas.ts # Response schemas ├── middleware/ # Request middleware │ ├── logging.ts # Request logging │ └── security.ts # Security middleware ├── server/ # Server configuration │ ├── build.ts # Server builder │ └── start.ts # Development server └── scripts/ # Build scripts ├── generate-openapi.ts └── generate-openapi-gpt.ts ``` ## Testing The project includes comprehensive testing: ```bash # Run all tests npm test # Run specific test suites npm run test:health npm run test:auth npm run test:tools npm run test:integration # Run tests with coverage npm run test:coverage ``` ### Test Structure - **Unit Tests**: Individual function and module testing - **Integration Tests**: Full API workflow testing - **Authentication Tests**: Session token and API key validation - **Error Handling Tests**: Comprehensive error scenario coverage ## Error Handling The API provides structured error responses with: - **Unique request IDs** for tracking - **Standardized error codes** for programmatic handling - **Detailed error messages** with context - **HTTP status codes** following REST conventions - **Sentry integration** for error tracking in production ## Performance & Scalability - **Fastify framework** for high-performance request handling - **Redis integration** for scalable session management - **Rate limiting** to prevent abuse - **Request timeouts** and retry logic - **Connection pooling** for external API calls - **Graceful degradation** with in-memory fallbacks ## Security Features - **Input validation** with JSON schemas - **Rate limiting** to prevent abuse - **Security headers** (CORS, XSS protection, etc.) - **Authentication validation** for all protected endpoints - **Request logging** with sensitive data redaction - **Error message sanitization** to prevent information leakage ## Contributing 1. Fork the repository 2. Create a feature branch 3. Make your changes 4. Add tests for new functionality 5. Ensure all tests pass 6. Submit a pull request ## License MIT License - see [LICENSE](LICENSE) file for details.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/captaindatatech/captaindata-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server