Google Search MCP Server

MIT License
Overview InspectNew Schema Related Servers Reviews Score
mcp-web-search
cline_docs
# Google Search MCP - Active Context

## Current Implementation State

### Recently Completed
1. Search Service Enhancements
   - Improved Google Custom Search API integration
   - Added proper parameter handling for siteSearch
   - Enhanced snippet generation parameters
   - Implemented builder pattern for search options
   - Added comprehensive error handling
   - Improved domain filtering logic

2. Infrastructure Components
   - Connection Pool with browser management
   - Rate Limiter with token bucket algorithm
   - Cache Manager with TTL support
   - Logger with structured logging

3. Bot Detection Evasion
   - Enhanced browser fingerprinting
   - Improved navigator proxy implementation
   - Fixed webdriver property detection
   - Added proper userAgentData mocking
   - Enhanced plugin array simulation
   - Implemented natural mouse movements
   - Added variable scrolling behavior

### Current Focus
Working on extraction service improvements:
- Enhanced bot detection evasion techniques
- Improved browser fingerprint spoofing
- Natural user behavior simulation
- Advanced proxy-based property hiding

## Immediate Next Steps

1. Extraction Service Improvements (Priority: High)
   - Further enhance bot detection evasion
   - Improve human behavior simulation
   - Add more randomization to timing patterns
   - Enhance header randomization

2. Documentation (Priority: High)
   - Add JSDoc comments to all public methods
   - Create API documentation
   - Update README with setup instructions
   - Add example usage documentation

3. Error Handling (Priority: Medium)
   - Add retry mechanisms
   - Enhance error reporting
   - Implement circuit breakers
   - Add error recovery strategies

4. Performance Optimization (Priority: Medium)
   - Profile service performance
   - Optimize resource usage
   - Enhance caching strategies
   - Monitor memory consumption

## Active Considerations

### Extraction Service Implementation
- Bot detection evasion techniques
- Browser fingerprint management
- Human behavior simulation
- Resource cleanup strategies

### Resource Management
- Browser instance lifecycle
- Memory usage monitoring
- Cache size control
- Rate limit enforcement

### Error Handling
- Proper error propagation
- Detailed error messages
- Stack trace preservation
- Error recovery

### Module Resolution
- ESM compatibility
- Proper file extensions
- Import/export consistency
- Dependency management

## Recent Changes

1. Extraction Service Updates
   - Implemented advanced navigator proxy
   - Enhanced browser fingerprint spoofing
   - Improved human behavior simulation
   - Added natural mouse movements
   - Enhanced scrolling behavior
   - Improved header randomization

2. Infrastructure Updates
   - Fixed module resolution in connection pool
   - Enhanced rate limiter type safety
   - Improved cache manager error handling
   - Updated logger formatting

3. Configuration Changes
   - Updated TypeScript configuration for ESM
   - Enhanced ESLint rules
   - Added proper file extensions
   - Updated package.json scripts

## Potential Issues to Watch

1. Extraction Service
   - Monitor bot detection patterns
   - Watch for behavior fingerprinting
   - Track extraction success rates
   - Observe cache effectiveness

2. Error Handling
   - Monitor error recovery
   - Track unhandled rejections
   - Watch for memory leaks
   - Observe error patterns

3. Performance
   - Monitor response times
   - Track resource usage
   - Watch cache hit rates
   - Observe concurrent operations

## Development Notes

### Environment Setup
```bash
# Required environment variables
GOOGLE_API_KEY=required
GOOGLE_SEARCH_ENGINE_ID=required
MAX_CONCURRENT_BROWSERS=3
RATE_LIMIT_WINDOW=60000
RATE_LIMIT_MAX_REQUESTS=60

# Extraction Service Configuration
EXTRACTION_CACHE_TTL=3600
MAX_RETRIES=3
TIMEOUT_MS=30000
```

### Common Commands
```bash
# Development
npm run dev

# Testing
npm test

# Building
npm run build

# Linting
npm run lint
```

### Debug Notes
- Use --inspect flag for Node.js debugging
- Enable verbose logging with LOG_LEVEL=debug
- Monitor browser instances with connection pool stats
- Track rate limiting with debug logs
- Watch extraction service metrics with debug enabled

### Extraction Service Usage
```typescript
// Basic extraction
const result = await extractionService.extract({
  url: "https://example.com",
  includeImages: true
});

// Full page screenshot
const result = await extractionService.extract({
  url: "https://example.com",
  screenshot: {
    fullPage: true,
    format: "jpeg",
    quality: 80
  }
});

// Element-specific extraction
const result = await extractionService.extract({
  url: "https://example.com",
  screenshot: {
    selector: ".main-content",
    format: "png"
  }
});