# System Patterns

> **Note:** This document provides detailed implementation patterns and best practices for the PyGithub MCP Server. For a high-level overview of the technology stack and architecture, please refer to [`tech_context.md`](tech_context.md).

## System Architecture Overview

This diagram provides a comprehensive view of how the various architectural decisions work together in the PyGithub MCP Server:

```mermaid
flowchart TD
    subgraph "MCP Server"
        direction TB
        subgraph "Tools Layer (ADR-006)"
            ToolReg["Tool Registration System"]
            IssueTools["Issue Tools"]
            RepoTools["Repository Tools"]
            PRTools["PR Tools"]
            Config["Configuration System"]
            ToolReg --> IssueTools
            ToolReg --> RepoTools
            ToolReg --> PRTools
            Config --> ToolReg
        end
        subgraph "Operations Layer"
            Issues["Issue Operations"]
            Repos["Repository Operations"]
            PRs["PR Operations"]
            IssueTools --> Issues
            RepoTools --> Repos
            PRTools --> PRs
        end
        subgraph "Client Layer (ADR-001)"
            Client["GitHub Client"]
            RateLimit["Rate Limit Handling"]
            Issues --> Client
            Repos --> Client
            PRs --> Client
            Client --> RateLimit
        end
        PyGithub["PyGithub Library"]
        Client --> PyGithub
    end

    GitHub["GitHub API"]
    PyGithub --> GitHub

    subgraph "Schema Layer (ADR-003, ADR-004)"
        direction TB
        BaseModels["Base Models"]
        IssueModels["Issue Models"]
        RepoModels["Repository Models"]
        PRModels["PR Models"]
        Validation["Field Validators"]
        BaseModels --- IssueModels
        BaseModels --- RepoModels
        BaseModels --- PRModels
        Validation -.- IssueModels
        Validation -.- RepoModels
        Validation -.- PRModels
    end

    IssueTools <-.-> IssueModels
    RepoTools <-.-> RepoModels
    PRTools <-.-> PRModels
    Issues <-.-> IssueModels
    Repos <-.-> RepoModels
    PRs <-.-> PRModels

    subgraph "Data Transformation (ADR-005)"
        direction TB
        IssueConv["Issue Converters"]
        RepoConv["Repository Converters"]
        UserConv["User Converters"]
        CommonConv["Common Converters"]
        IssueConv --> CommonConv
        RepoConv --> CommonConv
        UserConv --> CommonConv
    end

    Client --> IssueConv
    Client --> RepoConv
    Client --> UserConv

    subgraph "Testing (ADR-002)"
        direction TB
        UnitTests["Unit Tests"]
        IntegTests["Integration Tests"]
        TestData["Test Data Classes"]
        UnitTests --- TestData
        IntegTests --> GitHub
    end

    style IssueModels fill:#f9f,stroke:#333,stroke-width:2px
    style RepoModels fill:#f9f,stroke:#333,stroke-width:2px
    style PRModels fill:#f9f,stroke:#333,stroke-width:2px
    style BaseModels fill:#f9f,stroke:#333,stroke-width:2px
    style Client fill:#bbf,stroke:#333,stroke-width:2px
    style PyGithub fill:#bbf,stroke:#333,stroke-width:2px
    style Issues fill:#bfb,stroke:#333,stroke-width:2px
    style Repos fill:#bfb,stroke:#333,stroke-width:2px
    style PRs fill:#bfb,stroke:#333,stroke-width:2px
    style IssueTools fill:#fbb,stroke:#333,stroke-width:2px
    style RepoTools fill:#fbb,stroke:#333,stroke-width:2px
    style PRTools fill:#fbb,stroke:#333,stroke-width:2px
    style ToolReg fill:#fbb,stroke:#333,stroke-width:2px
    style Config fill:#fbb,stroke:#333,stroke-width:2px
```

### Request Flow (ADR-007: Pydantic-First Architecture)

This diagram shows the end-to-end flow of a request through the system, highlighting how the Pydantic-First Architecture works:

```mermaid
sequenceDiagram
    participant MCP as MCP Server
    participant Tool as Tools Layer
    participant Op as Operations Layer
    participant Client as Client Layer
    participant PyGithub as PyGithub
    participant GitHub as GitHub API

    MCP->>Tool: JSON Request
    Note over Tool: Convert JSON to<br/>Pydantic Model
    Tool->>Op: Pass Pydantic Model
    Note over Op: Validation already<br/>happened during<br/>model instantiation
    Op->>Client: Pass validated model
    Client->>PyGithub: PyGithub API call
    PyGithub->>GitHub: HTTP Request
    GitHub-->>PyGithub: HTTP Response
    PyGithub-->>Client: PyGithub Object
    Note over Client: Convert PyGithub Object<br/>to response schema
    Client-->>Op: Return schema data
    Op-->>Tool: Return data
    Note over Tool: Format as MCP Response
    Tool-->>MCP: JSON Response

    Note right of Tool: ADR-006:<br/>Modular Tool Architecture
    Note right of Op: ADR-007:<br/>Pydantic-First Architecture
    Note right of Client: ADR-001:<br/>PyGithub Integration
    Note right of PyGithub: ADR-002:<br/>Real API Testing
```

## Core Architecture

### Pydantic-First Architecture

```mermaid
flowchart TD
    A[Tools Layer] -->|Pydantic Models| B[Operations Layer]
    B -->|Pydantic Models| C[Client Layer]
    C --> D[GitHub API]
    subgraph Schema Layer
        E[Pydantic Models]
        F[Validation Logic]
    end
    E --> A
    E --> B
    F --> E
```

### GitHub Integration

```mermaid
flowchart TD
    A[FastMCP Server] --> B[GitHub Client]
    B --> C[PyGithub]
    C --> D[GitHub API]
    subgraph Client Layer
        B
        E[Object Conversion]
        F[Rate Limiting]
        G[Error Handling]
    end
    B --> E
    B --> F
    B --> G
```

### Component Relationships

1. GitHub Client (Singleton)
   - Manages PyGithub instance
   - Handles authentication
   - Provides conversion utilities
   - Manages rate limiting
   - Centralizes error handling

2. Operation Modules
   - Use GitHub client for API interactions
   - Accept Pydantic models directly (ADR-007)
   - Maintain consistent patterns
   - Focus on specific domains
   - Handle pagination

3. Schema Layer
   - Models based on PyGithub objects
   - Organized by domain (ADR-003)
   - Enhanced validation (ADR-004)
   - Clear type definitions
   - Documented relationships

4. Tools Layer
   - Organized by domain (ADR-006)
   - Configuration-driven registration
   - Decorator-based tool system
   - Consistent error handling
   - Clean MCP protocol interface

## Best Practices

### 1. Pydantic Model Usage
- Define all input parameters as Pydantic models
- Let Pydantic handle validation at model instantiation
- Pass models directly between layers
- Define field validators for special validation needs
- Use `strict=True` to prevent unwanted type coercion

### 2. Tool Implementation
- Organize tools by domain (issues, repositories, etc.)
- Use the `@tool()` decorator for registration
- Register tools through the module's register function
- Apply consistent error handling across all tools
- Validate parameters through Pydantic models
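To make these model conventions concrete, here is a minimal parameter-model sketch. It assumes Pydantic v2 (`ConfigDict`, `field_validator`); the field set is illustrative and will differ from the project's actual schema classes.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, field_validator


class ListIssuesParams(BaseModel):
    """Illustrative parameter model (not the project's actual schema)."""

    # strict=True rejects type coercion such as "20" -> 20
    model_config = ConfigDict(strict=True)

    owner: str
    repo: str
    state: Optional[str] = None
    per_page: Optional[int] = None

    @field_validator("state")
    @classmethod
    def validate_state(cls, v):
        # Field validator for special validation needs
        if v is not None and v not in ("open", "closed", "all"):
            raise ValueError("state must be 'open', 'closed', or 'all'")
        return v
```

With this in place, `ListIssuesParams(owner="o", repo="r", per_page="20")` or `state="bogus"` raises a `ValidationError` at the tools layer, so deeper layers never see unvalidated input.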
### 3. Error Handling
- Use GitHubError for all client-facing errors
- Provide clear error messages with context
- Handle rate limits with exponential backoff
- Include resource information in error messages
- Apply consistent formatting across all error types

### 4. Testing
- Use real API interactions instead of mocks for integration tests
- Use dataclasses instead of MagicMock for unit tests
- Focus on testing behavior rather than implementation
- Implement proper cleanup for test resources
- Tag resources created during tests for identification
- Separate unit tests from integration tests by directory structure
- Mark integration tests with `@pytest.mark.integration`
- Test both success and error paths
- Verify that tests remain isolated and don't affect each other
- Use dependency injection for easier test parameterization

### 5. Configuration
- Use environment variables for deployment configuration
- Provide sensible defaults for all settings
- Define clear precedence rules for configuration sources
- Document all configuration options
- Support selective tool group enabling/disabling

### 6. Optional Parameter Handling
- Only include parameters in kwargs when they have non-None values
- Convert primitive types to PyGithub objects before passing (e.g., milestone number → Milestone object)
- Handle object conversion errors explicitly
- Document parameter requirements in docstrings
- Test with various parameter combinations

## Implementation Patterns

### 1. Pydantic-First Operations

```python
# Pattern for operations layer with Pydantic-First architecture
from typing import Any, Dict, List

from github import GithubException

from ..client import GitHubClient
# Illustrative import paths; the converters live in the converters package
from ..converters.common.pagination import get_paginated_items
from ..converters.issues import convert_issue
from ..schemas.issues import ListIssuesParams


def list_issues(params: ListIssuesParams) -> List[Dict[str, Any]]:
    """List issues in a repository.

    Args:
        params: Validated parameters for listing issues

    Returns:
        List of issues from GitHub API

    Raises:
        GitHubError: If the API request fails or validation fails
    """
    try:
        client = GitHubClient.get_instance()
        repository = client.get_repo(f"{params.owner}/{params.repo}")

        # Build kwargs from Pydantic model
        kwargs = {"state": params.state or "open"}

        # Add optional parameters only if provided
        if params.sort:
            kwargs["sort"] = params.sort
        if params.direction:
            kwargs["direction"] = params.direction
        if params.since:
            kwargs["since"] = params.since

        # Get paginated issues and handle pagination
        paginated_issues = repository.get_issues(**kwargs)
        issues = get_paginated_items(paginated_issues, params.page, params.per_page)

        # Convert each issue to our schema
        return [convert_issue(issue) for issue in issues]
    except GithubException as e:
        raise client._handle_github_exception(e)
```

### 2. Pydantic-First Tools

```python
# Pattern for tools layer with Pydantic-First architecture
import json

from pygithub_mcp_server.errors import GitHubError, format_github_error
from pygithub_mcp_server.operations import issues
from pygithub_mcp_server.schemas.issues import ListIssuesParams
from pygithub_mcp_server.tools import tool


@tool()
def list_issues(params: ListIssuesParams) -> dict:
    """List issues from a GitHub repository.

    Args:
        params: Parameters for listing issues

    Returns:
        List of issues from GitHub API
    """
    try:
        # Pass the validated Pydantic model directly to operations
        result = issues.list_issues(params)
        return {"content": [{"type": "text", "text": json.dumps(result, indent=2)}]}
    except GitHubError as e:
        return {
            "content": [{"type": "error", "text": format_github_error(e)}],
            "is_error": True,
        }
```
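The "only include non-None values" rule from Optional Parameter Handling can be factored into a small helper. This is a sketch of the idea, not a utility the project necessarily ships:

```python
from typing import Any, Dict, Optional


def build_kwargs(**optional: Optional[Any]) -> Dict[str, Any]:
    """Return only the keyword arguments whose values are not None.

    Passing None through to PyGithub can override its server-side
    defaults, so absent parameters are dropped entirely instead.
    """
    return {key: value for key, value in optional.items() if value is not None}


# Only 'state' and 'sort' survive; 'direction' and 'since' are omitted.
kwargs = build_kwargs(state="open", sort="created", direction=None, since=None)
```

The resulting dict can then be splatted into a PyGithub call such as `repository.get_issues(**kwargs)`.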
### 3. Tool Registration and Configuration

```python
# In tools/issues/__init__.py
def register(mcp):
    """Register all issue tools with the MCP server."""
    from pygithub_mcp_server.tools import register_tools
    from .tools import create_issue, list_issues, get_issue, update_issue

    register_tools(mcp, [
        create_issue,
        list_issues,
        get_issue,
        update_issue,
        # Other issue tools
    ])


# In server.py
from mcp.server.fastmcp import FastMCP

from pygithub_mcp_server.config import load_config
from pygithub_mcp_server.tools import load_tools


def create_server():
    """Create and configure the MCP server."""
    # Create FastMCP server instance
    mcp = FastMCP(
        "pygithub-mcp-server",
        version=VERSION,
        description="GitHub API operations via MCP"
    )

    # Load configuration
    config = load_config()

    # Load and register tools based on configuration
    load_tools(mcp, config)

    return mcp
```

### 4. Validation Error Handling

```python
# Pattern for consistent validation error handling
import functools

from pydantic import ValidationError

from .github import GitHubError


def validation_error_to_github_error(func):
    """Decorator to convert Pydantic ValidationError to GitHubError."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ValidationError as e:
            errors = e.errors()
            if errors:
                field = errors[0].get('loc', ['unknown'])[0]
                message = errors[0].get('msg', 'Invalid value')
                error_msg = f"Validation error: {field} - {message}"
            else:
                error_msg = "Invalid input data"
            raise GitHubError(error_msg)
    return wrapper
```
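A self-contained usage sketch of that decorator follows. It assumes Pydantic v2 and uses a stand-in `GitHubError` and a hypothetical `get_issue_from_json` wrapper purely for illustration:

```python
import functools

from pydantic import BaseModel, ValidationError


class GitHubError(Exception):
    """Stand-in for the project's GitHubError, for illustration only."""


def validation_error_to_github_error(func):
    """Decorator to convert Pydantic ValidationError to GitHubError."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ValidationError as e:
            errors = e.errors()
            if errors:
                field = errors[0].get("loc", ["unknown"])[0]
                message = errors[0].get("msg", "Invalid value")
                error_msg = f"Validation error: {field} - {message}"
            else:
                error_msg = "Invalid input data"
            raise GitHubError(error_msg)
    return wrapper


class GetIssueParams(BaseModel):
    owner: str
    repo: str
    issue_number: int


@validation_error_to_github_error
def get_issue_from_json(payload: dict) -> GetIssueParams:
    # Model instantiation is where validation happens (ADR-007)
    return GetIssueParams(**payload)
```

Calling `get_issue_from_json({"owner": "o", "repo": "r"})` surfaces a `GitHubError` naming the missing `issue_number` field instead of a raw `ValidationError`.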
### 5. Schema Conversion

```python
# Pattern for object conversion
def convert_issue(issue):
    """Convert a PyGithub Issue object to our schema format."""
    return {
        "number": issue.number,
        "title": issue.title,
        "body": issue.body,
        "state": issue.state,
        "created_at": issue.created_at.isoformat(),
        "updated_at": issue.updated_at.isoformat(),
        "user": {
            "login": issue.user.login,
            "id": issue.user.id,
            "url": issue.user.html_url
        },
        "labels": [{"name": label.name, "color": label.color} for label in issue.labels],
        "comments": issue.comments,
        "url": issue.html_url
    }
```

### 6. Error Handling

```python
# Pattern for error handling
try:
    github_obj = client.operation()
    return convert_github_object(github_obj)
except GithubException as e:
    raise GitHubError(str(e))
```

#### Error Types and Handling

The system defines several specific error types to provide clear, actionable feedback:

1. **GitHubResourceNotFoundError (Status Code: 404)**
   - Indicates the requested resource does not exist
   - Common scenarios: Repository not found, issue not found, comment not found
   - Example message: "Issue not found" or "Repository not found"

2. **GitHubAuthenticationError (Status Code: 401)**
   - Authentication failed or token is invalid
   - Common scenarios: Invalid token, expired token, token lacks required scopes
   - Example message: "Authentication failed. Please verify your GitHub token."

3. **GitHubPermissionError (Status Code: 403)**
   - User lacks permission for the requested operation
   - Common scenarios: Insufficient repository permissions, organization access required
   - Example message: "You don't have permission to perform this operation."

4. **GitHubRateLimitError (Status Code: 403 with rate limit headers)**
   - API rate limit exceeded
   - Includes reset time in error object
   - Example message: "API rate limit exceeded. Please wait before making more requests."
   - Usage example:
     ```python
     except GitHubRateLimitError as e:
         print(f"Rate limit exceeded. Resets at: {e.reset_at}")
     ```
5. **GitHubValidationError (Status Code: 422)**
   - Request validation failed
   - Common scenarios: Invalid field values, missing required fields
   - Example format:
     ```
     Validation failed:
     - title: cannot be blank (missing_field)
     - labels: invalid format (invalid)
     ```

6. **GitHubError (Base error type)**
   - Base error type for unhandled or unexpected errors
   - Common scenarios: Network issues, server errors, unexpected API responses

#### Error Handling Best Practices

1. **Catch specific error types first:**
   ```python
   try:
       ...  # GitHub operation
   except GitHubResourceNotFoundError:
       ...  # Handle 404
   except GitHubValidationError:
       ...  # Handle validation
   except GitHubError:
       ...  # Handle other errors
   ```

2. **Check error response data for additional context:**
   ```python
   except GitHubError as e:
       if e.response:
           ...  # Handle with context
       else:
           ...  # Handle generic error
   ```

3. **Handle rate limits gracefully:**
   ```python
   except GitHubRateLimitError as e:
       wait_time = e.reset_at - datetime.now()
       logger.warning(f"Rate limit hit. Waiting {wait_time}")
       # Implement backoff strategy
   ```

4. **Resource Type Detection:** The error handler automatically detects the type of resource from the error message or response data:
   ```python
   if "issue" in error_msg.lower():
       resource_type = "issue"
   elif "repository" in error_msg.lower():
       resource_type = "repository"
   ```

### 7. Pagination Handling

```python
# In converters/common/pagination.py
def get_paginated_items(paginated_list, page=None, per_page=None):
    """Get items from a PyGithub PaginatedList with pagination support.

    Args:
        paginated_list: PyGithub PaginatedList object
        page: Optional page number (1-based)
        per_page: Optional items per page

    Returns:
        List of items from the paginated list
    """
    if page is not None and per_page is not None:
        # Use both page and per_page for precise pagination
        start = (page - 1) * per_page
        end = start + per_page
        try:
            return list(paginated_list[start:end])
        except IndexError:
            # Handle case where start is beyond the list length
            return []
    elif page is not None:
        # Use default per_page value (30) with specified page
        try:
            return paginated_list.get_page(page - 1)
        except IndexError:
            return []
    elif per_page is not None:
        # Get just the first per_page items
        try:
            return list(paginated_list[:per_page])
        except IndexError:
            return []
    else:
        # No pagination, get all items (use with caution!)
        return list(paginated_list)
```

## Security Considerations

### Authentication

1. **Token Security**
   - Personal Access Tokens (PATs) should never be committed to source control
   - Set the token via the `GITHUB_PERSONAL_ACCESS_TOKEN` environment variable
   - Use fine-grained tokens with minimal required permissions
   - Example:
     ```python
     # Set token securely via environment
     token = os.getenv("GITHUB_PERSONAL_ACCESS_TOKEN")
     if not token:
         raise GitHubError("Token not configured")
     ```

2. **Token Permissions**
   - Repository permissions determine available operations:
     - Read: View issues, comments, labels
     - Write: Create/update issues, add labels
     - Admin: Manage repository settings
   - Operations fail with `GitHubPermissionError` if permissions are insufficient

### Access Control

1. **Repository Access**
   - Private repositories return 404 "Not Found" instead of 401/403
   - This is a GitHub security feature to prevent repository enumeration
   - The response is the same whether the repository doesn't exist or the user lacks access
   - Helps prevent information disclosure about private repositories
2. **Rate Limiting**
   - Rate limits help prevent abuse
   - Limits are tracked per token/IP
   - Secondary rate limits may apply
   - Rate limit errors include reset time
   - Implement retry logic with backoff

### Content Security

1. **Content Sanitization**
   - GitHub automatically sanitizes HTML content
   - Script tags are removed
   - `javascript:` URLs are blocked
   - HTML is rendered as markdown

2. **Input Validation**
   - Use schema validation to enforce constraints
   - Invalid input results in `GitHubValidationError`

### Security Logging

Important events to log:

- Authentication failures
- Permission denied errors
- Rate limit hits
- Invalid access attempts
- Content validation failures

Example logging:

```python
# Authentication failure
logger.error("Authentication failed", extra={
    "token_prefix": token[:4] if token else None,
    "error": str(e)
})

# Permission denied
logger.warning("Permission denied", extra={
    "operation": operation_name,
    "resource": resource_id
})

# Rate limit
logger.debug("Rate limit hit", extra={
    "reset_at": e.reset_at,
    "operation": operation_name
})
```

## System Flow

### Operation Flow

```mermaid
sequenceDiagram
    participant MCP as FastMCP Server
    participant Client as GitHub Client
    participant PyGithub
    participant API as GitHub API

    MCP->>Client: Operation Request
    Client->>Client: Build kwargs
    Client->>Client: Convert Types
    Client->>PyGithub: Get/Create Object
    PyGithub->>API: API Request
    API-->>PyGithub: Response
    PyGithub-->>Client: PyGithub Object
    Client->>Client: Convert to Schema
    Client-->>MCP: Schema Response
```

### Error Flow

```mermaid
sequenceDiagram
    participant MCP as FastMCP Server
    participant Client as GitHub Client
    participant PyGithub
    participant API as GitHub API

    MCP->>Client: Operation Request
    Client->>PyGithub: Get/Create Object
    PyGithub->>API: API Request
    API-->>PyGithub: Error Response
    PyGithub-->>Client: GitHub Exception
    Client->>Client: Convert to GitHubError
    Client-->>MCP: Error Response
```

## Design Patterns
### 1. Singleton Pattern (GitHub Client)

```python
class GitHubClient:
    _instance = None

    @classmethod
    def get_instance(cls):
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance
```

### 2. Factory Pattern (Object Conversion)

```python
class GitHubObjectFactory:
    @staticmethod
    def create_from_github_object(obj):
        if isinstance(obj, github.Issue.Issue):
            return convert_issue(obj)
        elif isinstance(obj, github.Repository.Repository):
            return convert_repository(obj)
        elif isinstance(obj, github.PullRequest.PullRequest):
            return convert_pull_request(obj)
        # ... other object types
        else:
            raise ValueError(f"Unsupported GitHub object type: {type(obj)}")
```

### 3. Strategy Pattern (Error Handling)

```python
class ErrorHandler:
    def handle_error(self, error):
        if isinstance(error, RateLimitExceededException):
            return handle_rate_limit(error)
        elif isinstance(error, UnknownObjectException):
            return handle_not_found(error)
        elif isinstance(error, GithubException):
            if error.status == 403:
                return handle_permission_error(error)
            # ... other status codes
        return handle_unknown_error(error)
```

### 4. Decorator Pattern (Tool Registration)

```python
def tool():
    """Decorator to register a function as an MCP tool."""
    def decorator(func):
        func._is_tool = True
        return func
    return decorator
```

## Testing Patterns
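One plausible way the `@tool()` marker is consumed downstream is a registrar that filters on the `_is_tool` attribute. This is an illustrative sketch only — the project's actual `register_tools` signature and behavior may differ:

```python
def tool():
    """Decorator to mark a function as an MCP tool."""
    def decorator(func):
        func._is_tool = True
        return func
    return decorator


def register_tools(mcp, candidates):
    """Hypothetical registrar: keep only functions marked by @tool().

    `mcp` stands in for the FastMCP server; here we only collect names
    to demonstrate the filtering. Real code would register each tool
    with the server instead.
    """
    registered = []
    for func in candidates:
        if getattr(func, "_is_tool", False):
            registered.append(func.__name__)
    return registered


@tool()
def list_issues(params):
    return []


def helper(x):  # not decorated, so the registrar skips it
    return x


names = register_tools(None, [list_issues, helper])
```

The attribute-based check means undecorated helper functions imported into a tool module can never be registered by accident.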
### 1. Testing Philosophy and Principles

Based on ADR 002, our testing approach follows these core principles:

- **No Mocks for Integration Tests**: Use real API interactions instead of mock objects for accurate behavior verification
- **Dataclasses for Unit Tests**: Use Python's dataclasses instead of MagicMock for cleaner, type-safe test objects
- **Behavior-Focused Testing**: Test what functions do, not how they do it
- **Isolated Tests**: Each test should be independent and not affect others
- **Test Coverage Prioritization**: Focus on high-risk and critical paths first

Testing follows a layer-based structure that mirrors the application architecture:

```
tests/
├── unit/                  # Fast tests with no external dependencies
│   ├── client/            # Tests for client module
│   ├── config/            # Tests for configuration
│   ├── converters/        # Tests for converters
│   ├── schemas/           # Tests for schema validation
│   └── utils/             # Tests for utility functions
└── integration/           # Tests using the real GitHub API
    ├── client/            # Tests for client with real API
    ├── operations/        # Tests for API operations with real endpoints
    │   ├── issues/        # Tests for issue operations
    │   ├── repositories/  # Tests for repository operations
    │   └── users/         # Tests for user operations
    └── tools/             # Tests for MCP tools with real API
```
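The `@pytest.mark.integration` marker used throughout these patterns must be registered with pytest to avoid unknown-marker warnings. A minimal configuration sketch, using standard pytest conventions (the project's actual config file and options may differ):

```ini
# pytest.ini (sketch)
[pytest]
markers =
    integration: tests that hit the real GitHub API
    slow: tests with long runtimes

# Select suites explicitly, e.g.:
#   pytest tests/unit/                      # fast, no network
#   pytest -m integration tests/integration/
```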
### 2. Unit Testing with Dataclasses

In accordance with ADR 002, we use Python's dataclasses instead of MagicMock objects for cleaner, more maintainable tests:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RepositoryOwner:
    login: str
    id: int = 12345
    html_url: str = "https://github.com/test-user"


@dataclass
class Repository:
    id: int
    name: str
    full_name: str
    owner: RepositoryOwner
    private: bool = False
    html_url: str = "https://github.com/test-user/test-repo"
    description: Optional[str] = None


def test_convert_repository():
    # Given
    owner = RepositoryOwner(login="test-user")
    repo = Repository(
        id=98765,
        name="test-repo",
        full_name="test-user/test-repo",
        owner=owner
    )

    # When
    result = convert_repository(repo)

    # Then
    assert result["id"] == 98765
    assert result["name"] == "test-repo"
    assert result["owner"]["login"] == "test-user"
```

Benefits of this approach include:

- Type safety with IDE autocomplete
- No unexpected attribute creation
- Clear object structure that mirrors real objects
- Better representation in test failure output
- Prevention of bugs from typos in attribute names
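The test above works because the dataclasses duck-type the PyGithub objects the converter expects. For a standalone run, a minimal converter along these lines can be assumed (the real converter lives in the converters package and returns more fields):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RepositoryOwner:
    login: str
    id: int = 12345
    html_url: str = "https://github.com/test-user"


@dataclass
class Repository:
    id: int
    name: str
    full_name: str
    owner: RepositoryOwner
    private: bool = False
    html_url: str = "https://github.com/test-user/test-repo"
    description: Optional[str] = None


def convert_repository(repo):
    """Minimal illustrative converter; the project's real one differs."""
    return {
        "id": repo.id,
        "name": repo.name,
        "full_name": repo.full_name,
        "owner": {
            "login": repo.owner.login,
            "id": repo.owner.id,
            "url": repo.owner.html_url,
        },
        "private": repo.private,
        "url": repo.html_url,
    }
```

Because the converter only reads attributes, it cannot tell a test dataclass from a real `github.Repository.Repository`, which is exactly what makes this style of unit test possible.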
### 3. Integration Testing with Real API

For integration tests, we interact with the actual GitHub API:

```python
@pytest.mark.integration
def test_create_issue_integration(test_owner, test_repo, test_cleanup):
    """Test creating an issue in a real GitHub repository."""
    # Generate a unique identifier for this test
    test_id = str(uuid.uuid4())[:8]
    title = f"Test Issue {test_id}"
    body = f"This is a test issue created by integration tests - {test_id}"

    # Create the Pydantic model (already validated)
    params = CreateIssueParams(
        owner=test_owner,
        repo=test_repo,
        title=title,
        body=body
    )

    # Call the operation (no mocks)
    result = issues.create_issue(params)

    # Register for cleanup
    test_cleanup.add_issue(test_owner, test_repo, result["number"])

    # Assertions against real API response
    assert result["title"] == title
    assert result["body"] == body
    assert result["state"] == "open"
    assert "number" in result

    # Verify the issue was actually created by fetching it
    verification = issues.get_issue(GetIssueParams(
        owner=test_owner,
        repo=test_repo,
        issue_number=result["number"]
    ))
    assert verification["title"] == title
```

### 4. Test Fixtures and Helpers

We use pytest fixtures to create test data and manage resources:

```python
@pytest.fixture
def test_owner():
    """Get the GitHub owner for test operations."""
    return os.environ.get("GITHUB_TEST_OWNER", "test-owner")


@pytest.fixture
def test_repo():
    """Get the GitHub repository for test operations."""
    return os.environ.get("GITHUB_TEST_REPO", "test-repo")


@pytest.fixture
def test_cleanup():
    """Fixture to track and clean up test resources."""
    cleanup = TestCleanup()
    yield cleanup
    cleanup.cleanup_all()


class TestCleanup:
    """Helper to track and clean up test resources."""

    def __init__(self):
        self.issues = []
        self.comments = []
        # Other resource types...

    def add_issue(self, owner, repo, issue_number):
        """Track an issue for cleanup."""
        self.issues.append((owner, repo, issue_number))

    def cleanup_all(self):
        """Clean up all tracked resources."""
        client = GitHubClient.get_instance()

        # Clean up issues
        for owner, repo, issue_number in self.issues:
            try:
                repository = client.get_repo(f"{owner}/{repo}")
                issue = repository.get_issue(issue_number)
                if issue.state != "closed":
                    issue.edit(state="closed")
            except Exception as e:
                logger.warning(f"Failed to clean up issue {owner}/{repo}#{issue_number}: {e}")

        # Clean up other resource types...
```

### 5. Context Managers for Environment Testing

For testing environment-dependent code like command-line interfaces:

```python
@contextmanager
def capture_stdout():
    """Capture stdout for testing."""
    new_stdout = StringIO()
    old_stdout = sys.stdout
    sys.stdout = new_stdout
    try:
        yield new_stdout
    finally:
        sys.stdout = old_stdout


def test_main_function():
    """Test main function output."""
    with capture_stdout() as stdout:
        main(["--version"])
    output = stdout.getvalue()
    assert "version" in output.lower()
```

### 6. Rate Limit Handling in Tests

To handle GitHub API rate limits during tests:

```python
@pytest.mark.integration
def test_rate_limited_operation(test_owner, test_repo):
    """Test operation with retry for rate limits."""
    max_retries = 3
    retry_count = 0

    while retry_count < max_retries:
        try:
            params = ListIssuesParams(owner=test_owner, repo=test_repo)
            result = issues.list_issues(params)
            # Test passed
            break
        except GitHubError as e:
            if "rate limit exceeded" in str(e).lower() and retry_count < max_retries - 1:
                # Calculate backoff time (exponential)
                wait_time = 2 ** retry_count * 5  # 5, 10, 20 seconds
                logger.warning(f"Rate limited, retrying in {wait_time} seconds...")
                time.sleep(wait_time)
                retry_count += 1
            else:
                # Either not a rate limit error or we've used all retries
                raise

    # Perform test assertions
    assert isinstance(result, list)
```
### 7. Maintainable Test Strategies

Write tests that remain valid even as the codebase evolves:

```python
# BAD - Hardcoded expectations that break when defaults change
assert config["tool_groups"]["repositories"]["enabled"] is False

# GOOD - Dynamic checks that adapt to changes in defaults
for group, settings in DEFAULT_CONFIG["tool_groups"].items():
    assert config["tool_groups"][group]["enabled"] == settings["enabled"]
```

Always ensure mock functions match the actual function signatures:

```python
# BAD - Outdated mock that doesn't match the real method signature
def get_repository_mock(params):  # Expecting a Pydantic model
    # But the real function expects: get_repository(owner, repo)
    # This will fail with: TypeError: ... takes 1 positional argument but 2 were given
    ...

# GOOD - Correctly matching the function signature
def get_repository_mock(owner, repo):  # Matches real signature
    # Will work correctly with the actual implementation
    ...
```

### 8. Testing Error Conditions

Test both success and error paths:

```python
def test_nonexistent_repository():
    """Test behavior when repository doesn't exist."""
    params = ListIssuesParams(
        owner="non-existent-user-123456",  # Unlikely to exist
        repo="non-existent-repo-123456"
    )

    with pytest.raises(GitHubError) as exc_info:
        issues.list_issues(params)

    assert "not found" in str(exc_info.value).lower()
    assert exc_info.value.status == 404
```

### 9. Test Tagging and Identification

Identify resources created by tests for easier tracking and cleanup:

```python
def create_test_issue(owner, repo, test_id):
    """Create an issue with a unique test identifier."""
    params = CreateIssueParams(
        owner=owner,
        repo=repo,
        title=f"Test Issue {test_id}",
        body=f"Test issue created by automated tests. Test ID: {test_id}"
    )
    return issues.create_issue(params)
```
### 10. Parameterized Tests

Use pytest's `parametrize` feature for testing multiple scenarios:

```python
@pytest.mark.parametrize("state", ["open", "closed", "all"])
def test_list_issues_with_different_states(test_owner, test_repo, state):
    """Test listing issues with different state filters."""
    params = ListIssuesParams(
        owner=test_owner,
        repo=test_repo,
        state=state
    )

    result = issues.list_issues(params)

    if state == "all":
        # Should include both open and closed issues
        assert any(issue["state"] == "open" for issue in result) or \
               any(issue["state"] == "closed" for issue in result)
    else:
        # All issues should match the requested state
        assert all(issue["state"] == state for issue in result)
```

### 11. Resource Management in Tests

Implement robust resource management for tests to prevent test pollution:

```python
@pytest.fixture(scope="session")
def test_environment():
    """Set up test environment with resources that all tests can use."""
    # Create session-level resources
    client = GitHubClient.get_instance()
    test_owner = os.environ.get("GITHUB_TEST_OWNER")
    test_repo = os.environ.get("GITHUB_TEST_REPO")

    # Set up test data
    test_resources = {}

    # Create a test issue that can be reused
    repository = client.get_repo(f"{test_owner}/{test_repo}")
    test_issue = repository.create_issue(
        title="Test Issue for Integration Tests",
        body="This is a persistent test issue used for integration tests."
    )
    test_resources["issue_number"] = test_issue.number

    # Return the resources
    yield test_resources

    # We don't clean up session-level resources here because they are reused
    # across multiple test runs
```
### 12. Testing with Large Repositories

When testing operations on repositories with many issues, it's important to use pagination to avoid performance problems:

```python
# Before (problematic with large repos):
issues = list_issues(ListIssuesParams(
    owner=owner,
    repo=repo,
    state="closed"
))

# After (works efficiently with large repos):
issues = list_issues(ListIssuesParams(
    owner=owner,
    repo=repo,
    state="closed",
    per_page=20,  # Limit results to avoid hanging
    page=1        # Only get first page
))
```

This change ensures tests run efficiently even in repositories with hundreds or thousands of issues.

### 13. API Response Time Testing

Test API response times to ensure proper handling:

```python
@pytest.mark.integration
@pytest.mark.slow
def test_list_issues_performance(test_owner, test_repo):
    """Test that listing issues completes within a reasonable time."""
    params = ListIssuesParams(
        owner=test_owner,
        repo=test_repo,
        state="all",  # Fetch all issues to test performance with larger data set
        per_page=20,  # Limit results to avoid hanging with large repos
        page=1        # Only get first page
    )

    start_time = time.time()
    results = issues.list_issues(params)
    end_time = time.time()

    # Timing assertion: Operation should complete in under 5 seconds
    assert end_time - start_time < 5.0

    # Results should be a list
    assert isinstance(results, list)


# Note: Always include pagination parameters when testing with repositories that may
# have a large number of issues (e.g., 400+). Without pagination, the code may
# attempt to retrieve all matching issues at once, causing tests to hang or timeout.
```
### 14. CI/CD Test Configuration

For CI/CD pipelines, configure tests to run efficiently:

```yaml
# Example GitHub Actions configuration
name: Run Tests

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e ".[dev]"
      - name: Run unit tests
        run: pytest tests/unit/ -v
      - name: Run integration tests
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        run: pytest tests/integration/ -v --run-integration
        env:
          GITHUB_TEST_TOKEN: ${{ secrets.GITHUB_TEST_TOKEN }}
          GITHUB_TEST_OWNER: ${{ secrets.GITHUB_TEST_OWNER }}
          GITHUB_TEST_REPO: ${{ secrets.GITHUB_TEST_REPO }}
```

### 15. Testing Lifecycle Actions

For comprehensive testing of resource lifecycles:

```python
@pytest.mark.integration
def test_issue_lifecycle(test_owner, test_repo, test_cleanup):
    """Test the full lifecycle of an issue."""
    # 1. Create an issue
    test_id = str(uuid.uuid4())[:8]
    create_params = CreateIssueParams(
        owner=test_owner,
        repo=test_repo,
        title=f"Lifecycle Test Issue {test_id}",
        body=f"Testing complete issue lifecycle - {test_id}"
    )
    created = issues.create_issue(create_params)
    test_cleanup.add_issue(test_owner, test_repo, created["number"])

    # 2. Get the issue
    get_params = GetIssueParams(
        owner=test_owner,
        repo=test_repo,
        issue_number=created["number"]
    )
    retrieved = issues.get_issue(get_params)
    assert retrieved["number"] == created["number"]

    # 3. Update the issue
    update_params = UpdateIssueParams(
        owner=test_owner,
        repo=test_repo,
        issue_number=created["number"],
        title=f"Updated Lifecycle Test Issue {test_id}",
        state="closed"
    )
    updated = issues.update_issue(update_params)
    assert updated["title"] == update_params.title
    assert updated["state"] == "closed"

    # 4. Add a comment
    comment_params = IssueCommentParams(
        owner=test_owner,
        repo=test_repo,
        issue_number=created["number"],
        body=f"Test comment for issue lifecycle - {test_id}"
    )
    comment = issues.create_issue_comment(comment_params)
    test_cleanup.add_comment(test_owner, test_repo, created["number"], comment["id"])

    # 5. Verify comment
    assert comment["body"] == comment_params.body

    # 6. Re-open issue
    reopen_params = UpdateIssueParams(
        owner=test_owner,
        repo=test_repo,
        issue_number=created["number"],
        state="open"
    )
    reopened = issues.update_issue(reopen_params)
    assert reopened["state"] == "open"
```

## Tool Reference

The system provides a set of tools for interacting with GitHub's API. All tools follow consistent patterns for parameters and returns.

### Issue Management Tools

1. **create_issue** - Creates a new issue in a repository
   - Required parameters: `owner`, `repo`, `title`
   - Optional parameters: `body`, `assignees`, `labels`, `milestone`
   - Notes:
     - Non-existent labels will be created automatically
     - HTML content in body will be sanitized and rendered as markdown

2. **get_issue** - Retrieves details about a specific issue
   - Required parameters: `owner`, `repo`, `issue_number`
   - Returns 404 if the issue doesn't exist or for private repositories without access

3. **update_issue** - Updates an existing issue
   - Required parameters: `owner`, `repo`, `issue_number`
   - Optional parameters: `title`, `body`, `state`, `labels`, `assignees`, `milestone`
   - Notes: Only provided fields will be updated

4. **list_issues** - Lists issues in a repository
   - Required parameters: `owner`, `repo`
   - Optional parameters: `state`, `labels`, `sort`, `direction`, `since`, `page`, `per_page`
   - Defaults to open issues if state is not provided

### Comment Management Tools

1. **add_issue_comment** - Adds a comment to an issue
   - Required parameters: `owner`, `repo`, `issue_number`, `body`
**list_issue_comments** - Lists comments on an issue - Required parameters: `owner`, `repo`, `issue_number` - Optional parameters: `since`, `page`, `per_page` 3. **update_issue_comment** - Updates an existing comment - Required parameters: `owner`, `repo`, `issue_number`, `comment_id`, `body` 4. **delete_issue_comment** - Deletes a comment from an issue - Required parameters: `owner`, `repo`, `issue_number`, `comment_id` ### Label Management Tools 1. **add_issue_labels** - Adds labels to an issue - Required parameters: `owner`, `repo`, `issue_number`, `labels` - Notes: Non-existent labels are created automatically with default color 2. **remove_issue_label** - Removes a label from an issue - Required parameters: `owner`, `repo`, `issue_number`, `label` - Notes: Label name is case-sensitive ### Content Handling All text content (issue body, comments) supports GitHub Flavored Markdown with features like: - Headers, lists, code blocks, tables - Task lists, mentions, references - Emoji shortcodes GitHub automatically processes HTML content, with most HTML tags removed for security. ## Documentation Patterns ### 1. Function Documentation ```python def operation_name(params: ParamsType) -> ResultType: """Operation description. Args: params: Parameter description Returns: Description of return value Raises: GitHubError: Error conditions """ ``` ### 2. Class Documentation ```python class ClassName: """Class description. Attributes: attr_name: Attribute description Methods: method_name: Method description """ ``` ### 3. Schema Documentation ```python class SchemaModel(BaseModel): """Schema description. Maps to PyGithub ObjectType. See: [link to PyGithub docs] """ ``` ## Validation Patterns ### 1. 
Field Validation ```python class SchemaModel(BaseModel): model_config = ConfigDict(strict=True) title: str = Field(..., description="Title field", strict=True) @field_validator('title') @classmethod def validate_title(cls, v): """Validate that title is not empty.""" if not v.strip(): raise ValueError("title cannot be empty") return v ``` ### 2. Datetime Validation ```python class DateTimeModel(BaseModel): since: Optional[datetime] = Field( None, description="Filter by date (ISO 8601 format with timezone: YYYY-MM-DDThh:mm:ssZ)" ) @field_validator('since', mode='before') @classmethod def validate_since(cls, v): """Convert string dates to datetime objects. Accepts: - ISO 8601 format strings with timezone (e.g., "2020-01-01T00:00:00Z") - ISO 8601 format strings with timezone without colon (e.g., "2020-01-01T12:30:45-0500") - datetime objects Returns: - datetime object Raises: - ValueError: If the string is not a valid ISO 8601 datetime with timezone """ if isinstance(v, str): # Check for ISO format with time component and timezone if not ('T' in v and ('+' in v or 'Z' in v or '-' in v.split('T')[1])): raise ValueError( f"Invalid ISO format datetime: {v}. " f"Must be in format YYYY-MM-DDThh:mm:ss+00:00 or YYYY-MM-DDThh:mm:ssZ" ) try: # Handle 'Z' timezone indicator by replacing with +00:00 v = v.replace('Z', '+00:00') # Handle timezone formats without colons (e.g., -0500 -> -05:00) # Check if there's a timezone part (+ or - followed by 4 digits) if ('+' in v or '-' in v.split('T')[1]): # Find the position of the timezone sign sign_pos = max(v.rfind('+'), v.rfind('-')) if sign_pos > 0: timezone_part = v[sign_pos:] # If timezone doesn't have a colon and has 5 chars (e.g., -0500) if ':' not in timezone_part and len(timezone_part) == 5: # Insert colon between hours and minutes v = v[:sign_pos+3] + ':' + v[sign_pos+3:] return datetime.fromisoformat(v) except ValueError: raise ValueError( f"Invalid ISO format datetime: {v}. " f"Contains invalid date/time components." 
) return v ``` ### 3. Enum Validation ```python class StateModel(BaseModel): state: Optional[str] = Field( None, description=f"Issue state: {', '.join(VALID_STATES)}" ) @field_validator('state') @classmethod def validate_state(cls, v): """Validate that state is one of the allowed values.""" if v is not None and v not in VALID_STATES: raise ValueError(f"Invalid state: {v}. Must be one of: {', '.join(VALID_STATES)}") return v ``` ### 4. Numeric Validation ```python class PaginationModel(BaseModel): page: Optional[int] = Field( None, description="Page number for pagination (1-based)" ) per_page: Optional[int] = Field( None, description="Results per page (max 100)" ) @field_validator('page') @classmethod def validate_page(cls, v): """Validate that page is a positive integer.""" if v is not None and v < 1: raise ValueError("Page number must be a positive integer") return v @field_validator('per_page') @classmethod def validate_per_page(cls, v): """Validate that per_page is a positive integer <= 100.""" if v is not None: if v < 1: raise ValueError("Results per page must be a positive integer") if v > 100: raise ValueError("Results per page cannot exceed 100") return v ``` ### 5. URL Validation ```python class UrlModel(BaseModel): url: str = Field(..., description="URL to a resource") @field_validator('url') @classmethod def validate_url(cls, v): """Validate that URL is properly formatted.""" if not v: raise ValueError("URL cannot be empty") # Simple URL validation to check for protocol and domain if not (v.startswith('http://') or v.startswith('https://')): raise ValueError("URL must start with http:// or https://") # Split URL to extract domain try: parsed = urlparse(v) if not parsed.netloc: raise ValueError("URL must contain a valid domain") except Exception: raise ValueError("Invalid URL format") return v ``` ### 6. 
List Validation ```python class ListModel(BaseModel): labels: Optional[List[str]] = Field( None, description="List of label names" ) @field_validator('labels') @classmethod def validate_labels(cls, v): """Validate that labels are properly formatted.""" if v is not None: # Check if any label is empty if any(not label.strip() for label in v): raise ValueError("Labels cannot be empty strings") # Check for duplicates if len(v) != len(set(v)): raise ValueError("Labels must be unique") return v ``` ### 7. Validation Testing Patterns When testing validation in schemas: ```python def test_schema_validation(): """Test schema validation rules.""" # Test valid data valid_params = ListIssuesParams(owner="test-owner", repo="test-repo") assert valid_params.owner == "test-owner" # Test invalid data with pytest.raises(ValidationError) as exc_info: ListIssuesParams(owner="", repo="test-repo") assert "owner cannot be empty" in str(exc_info.value).lower() # Test field validators with pytest.raises(ValidationError) as exc_info: ListIssuesParams( owner="test-owner", repo="test-repo", state="invalid-state" ) assert "invalid state" in str(exc_info.value).lower() # Test datetime validation with pytest.raises(ValidationError) as exc_info: ListIssuesParams( owner="test-owner", repo="test-repo", since="2021-01-01" # Missing time component ) assert "invalid iso format datetime" in str(exc_info.value).lower() ``` ### 8. 
Model Composition with Inheritance Use inheritance to create composable schemas with shared validation: ```python class PaginatedParams(BaseModel): """Base class for paginated parameters.""" page: Optional[int] = Field(None, description="Page number (1-based)") per_page: Optional[int] = Field(None, description="Results per page (max 100)") @field_validator('page') @classmethod def validate_page(cls, v): if v is not None and v < 1: raise ValueError("Page number must be a positive integer") return v @field_validator('per_page') @classmethod def validate_per_page(cls, v): if v is not None: if v < 1: raise ValueError("Results per page must be a positive integer") if v > 100: raise ValueError("Results per page cannot exceed 100") return v class RepositoryParams(BaseModel): """Base class for repository-related parameters.""" owner: str = Field(..., description="Repository owner") repo: str = Field(..., description="Repository name") @field_validator('owner', 'repo') @classmethod def validate_not_empty(cls, v, info): if not v.strip(): raise ValueError(f"{info.field_name} cannot be empty") return v class ListIssuesParams(RepositoryParams, PaginatedParams): """Parameters for listing issues in a repository.""" state: Optional[str] = Field(None, description="Issue state: open, closed, all") sort: Optional[str] = Field(None, description="Sort field: created, updated, comments") direction: Optional[str] = Field(None, description="Sort direction: asc, desc") since: Optional[datetime] = Field(None, description="Only issues updated at or after this time") labels: Optional[List[str]] = Field(None, description="Filter by label names") @field_validator('state') @classmethod def validate_state(cls, v): if v is not None and v not in ["open", "closed", "all"]: raise ValueError("Invalid state: must be one of: open, closed, all") return v ``` This inheritance model allows for reusing validation logic across multiple parameter types while keeping the code DRY.
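To see the composed validation in action, the sketch below exercises a trimmed-down version of the inheritance pattern (Pydantic v2 assumed; the shortened class bodies are illustrative, not the project's actual definitions). A child model built from both base classes enforces every parent's validators:

```python
from typing import Optional
from pydantic import BaseModel, Field, ValidationError, field_validator


class PaginatedParams(BaseModel):
    """Pagination fields with range validation (trimmed for illustration)."""
    page: Optional[int] = Field(None, description="Page number (1-based)")
    per_page: Optional[int] = Field(None, description="Results per page (max 100)")

    @field_validator('per_page')
    @classmethod
    def validate_per_page(cls, v):
        if v is not None:
            if v < 1:
                raise ValueError("Results per page must be a positive integer")
            if v > 100:
                raise ValueError("Results per page cannot exceed 100")
        return v


class RepositoryParams(BaseModel):
    """Repository coordinates with non-empty validation (trimmed for illustration)."""
    owner: str = Field(..., description="Repository owner")
    repo: str = Field(..., description="Repository name")

    @field_validator('owner', 'repo')
    @classmethod
    def validate_not_empty(cls, v, info):
        if not v.strip():
            raise ValueError(f"{info.field_name} cannot be empty")
        return v


class ListIssuesParams(RepositoryParams, PaginatedParams):
    """Combined model: inherits fields and validators from both bases."""
    state: Optional[str] = Field(None, description="Issue state: open, closed, all")


# Valid parameters satisfy both parents' rules
ok = ListIssuesParams(owner="octocat", repo="hello-world", per_page=50)

# Violating either parent's rules raises ValidationError
try:
    ListIssuesParams(owner="octocat", repo="hello-world", per_page=500)
    raised = False
except ValidationError:
    raised = True
```

Because Pydantic collects fields and validators along the MRO, no validation logic needs to be repeated in the child class.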