semgrep_scan
Run static code analysis on provided files using Semgrep to detect vulnerabilities and return findings in JSON format for detailed inspection and remediation.
Instructions
Runs a Semgrep scan on provided code content and returns the findings in JSON format
Args:

- code_files: List of dictionaries with 'filename' and 'content' keys
- config: Semgrep configuration (e.g. "auto" or absolute path to rule file)

Returns: Dictionary with scan results in Semgrep JSON format
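
The returned dictionary can be read like any other Semgrep JSON report. The following is a minimal sketch, assuming the usual Semgrep finding fields (`check_id`, `path`, `start.line`, `extra.severity`, `extra.message`). The `summarize_findings` helper and the `sample` data are hypothetical and exist only for illustration; they are not output from a real scan.

```python
from typing import Any


def summarize_findings(scan_result: dict[str, Any]) -> list[str]:
    """Format each finding as 'path:line [severity] check_id: message'."""
    summary = []
    for finding in scan_result.get("results", []):
        extra = finding.get("extra", {})
        line = finding.get("start", {}).get("line", "?")
        summary.append(
            f"{finding.get('path', '?')}:{line} "
            f"[{extra.get('severity', 'UNKNOWN')}] "
            f"{finding.get('check_id', '?')}: {extra.get('message', '')}"
        )
    return summary


# Hypothetical, hand-written result shaped like the output schema.
sample = {
    "version": "0.0.0",  # placeholder, not a real Semgrep version
    "results": [
        {
            "check_id": "example.rule-id",
            "path": "app.py",
            "start": {"line": 3, "col": 1},
            "end": {"line": 3, "col": 20},
            "extra": {"severity": "ERROR", "message": "Example message"},
        }
    ],
    "errors": [],
    "paths": {"scanned": ["app.py"]},
    "skipped_rules": [],
}

for entry in summarize_findings(sample):
    print(entry)
```

Using `.get` with defaults keeps the helper tolerant of findings that omit optional fields.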
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
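| code_files | Yes | List of dictionaries with 'filename' and 'content' keys | |
| config | No | Semgrep configuration (e.g. "auto" or absolute path to rule file) | auto |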
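
An argument payload matching the table above might be assembled as follows. This is a sketch under stated assumptions: each entry uses the key `path`, following the `CodeFile` model listed in the Implementation Reference, even though the Instructions text names the key 'filename'. `build_scan_arguments` is a hypothetical helper, not part of the server.

```python
import json
from typing import Any, Optional


def build_scan_arguments(files: dict[str, str], config: Optional[str] = None) -> dict[str, Any]:
    """Turn a {path: source} mapping into arguments for semgrep_scan."""
    # Assumption: an empty file list is treated as a caller error here,
    # since code_files is the only required argument.
    if not files:
        raise ValueError("code_files is required and must not be empty")
    arguments: dict[str, Any] = {
        "code_files": [
            {"path": path, "content": content} for path, content in files.items()
        ]
    }
    # config is optional; leaving it out lets the server apply its default.
    if config is not None:
        arguments["config"] = config
    return arguments


arguments = build_scan_arguments(
    {"app.py": "import os\nos.system(user_input)\n"}, config="auto"
)
print(json.dumps(arguments, indent=2))
```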
Implementation Reference
- src/semgrep_mcp/models.py:22-32 (schema) Pydantic model defining the output schema for semgrep_scan tool results, including version, findings, errors, paths, and skipped rules.

  ```python
  class SemgrepScanResult(BaseModel):
      version: str = Field(description="Version of Semgrep used for the scan")
      results: list[dict[str, Any]] = Field(description="List of semgrep scan results")
      errors: list[dict[str, Any]] = Field(
          description="List of errors encountered during scan", default_factory=list
      )
      paths: dict[str, Any] = Field(description="Paths of the scanned files")
      skipped_rules: list[str] = Field(
          description="List of rules that were skipped during scan", default_factory=list
      )
  ```
- src/semgrep_mcp/models.py:7-14 (schema) Pydantic model for input code files used by semgrep_scan, containing path and content.

  ```python
  class CodeFile(BaseModel):
      # This "path" is mostly for bookkeeping purposes.
      # Depending on whether the server is hosted or not, this path might
      # not actually exist on the filesystem.
      path: str = Field(description="Path of the code file")
      # The `content` field will be filled in either by the LLM (in the remote scanning case)
      # or gleaned from the filesystem (in the local scanning case).
      content: str = Field(description="Content of the code file")
  ```
- src/semgrep_mcp/server.py:144-194 (helper) Core helper function that creates a temporary directory (prefixed 'semgrep_scan_') and writes code files to it for Semgrep to scan.

  ```python
  def create_temp_files_from_code_content(code_files: list[CodeFile]) -> str:
      """
      Creates temporary files from code content

      Args:
          code_files: List of CodeFile objects

      Returns:
          Path to temporary directory containing the files

      Raises:
          McpError: If there are issues creating or writing to files
      """
      temp_dir = None

      try:
          # Create a temporary directory
          temp_dir = tempfile.mkdtemp(prefix="semgrep_scan_")

          # Create files in the temporary directory
          for file_info in code_files:
              filename = file_info.path
              if not filename:
                  continue

              temp_file_path = safe_join(temp_dir, filename)

              try:
                  # Create subdirectories if needed
                  os.makedirs(os.path.dirname(temp_file_path), exist_ok=True)

                  # Write content to file
                  with open(temp_file_path, "w") as f:
                      f.write(file_info.content)
              except OSError as e:
                  raise McpError(
                      ErrorData(
                          code=INTERNAL_ERROR,
                          message=f"Failed to create or write to file {filename}: {e!s}",
                      )
                  ) from e

          return temp_dir
      except Exception as e:
          if temp_dir:
              # Clean up temp directory if creation failed
              shutil.rmtree(temp_dir, ignore_errors=True)
          raise McpError(
              ErrorData(code=INTERNAL_ERROR, message=f"Failed to create temporary files: {e!s}")
          ) from e
  ```
- src/semgrep_mcp/server.py:196-216 (helper) Helper to construct Semgrep CLI arguments for scanning the temporary directory with optional config.

  ```python
  def get_semgrep_scan_args(temp_dir: str, config: str | None = None) -> list[str]:
      """
      Builds command arguments for semgrep scan

      Args:
          temp_dir: Path to temporary directory containing the files
          config: Optional Semgrep configuration (e.g. "auto" or absolute path to rule file)

      Returns:
          List of command arguments
      """

      # Build command arguments and just run semgrep scan
      # if no config is provided to allow for either the default "auto"
      # or whatever the logged in config is
      args = ["scan", "--json", "--experimental"]  # avoid the extra exec
      if config:
          args.extend(["--config", config])
      args.append(temp_dir)
      return args
  ```
- src/semgrep_mcp/server.py:288-317 (helper) Post-processing helper that removes the temporary directory prefix from paths in the scan results.

  ```python
  def remove_temp_dir_from_results(results: SemgrepScanResult, temp_dir: str) -> None:
      """
      Clean the results from semgrep by converting temporary file paths back to
      original relative paths

      Args:
          results: SemgrepScanResult object containing semgrep results
          temp_dir: Path to temporary directory used for scanning
      """
      # Process findings results
      for finding in results.results:
          if "path" in finding:
              try:
                  finding["path"] = os.path.relpath(finding["path"], temp_dir)
              except ValueError:
                  # Skip if path is not relative to temp_dir
                  continue

      # Process scanned paths
      if "scanned" in results.paths:
          results.paths["scanned"] = [
              os.path.relpath(path, temp_dir) for path in results.paths["scanned"]
          ]

      if "skipped" in results.paths:
          results.paths["skipped"] = [
              os.path.relpath(path, temp_dir) for path in results.paths["skipped"]
          ]
  ```
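
The temporary-file helper calls `safe_join`, whose body is not shown in this reference. The sketch below illustrates what such a helper might do, assuming its purpose is to keep every written file inside the temporary directory. It is a hypothetical stand-in rather than the project's implementation, and `stage_files` is likewise an illustrative simplification of the staging step. The last lines show how `os.path.relpath` recovers the caller's original relative path from a staged path.

```python
import os
import shutil
import tempfile


def safe_join(base_dir: str, untrusted_path: str) -> str:
    """Hypothetical stand-in: join paths, refusing results outside base_dir."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, untrusted_path))
    # commonpath compares whole path components, so "/tmp/ab" does not
    # count as being inside "/tmp/a".
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"path escapes base directory: {untrusted_path!r}")
    return candidate


def stage_files(files: dict[str, str]) -> str:
    """Write {relative path: content} into a fresh temporary directory."""
    # realpath keeps later relpath calls consistent on systems where the
    # temp location is reached through a symlink.
    temp_dir = os.path.realpath(tempfile.mkdtemp(prefix="semgrep_scan_"))
    try:
        for rel_path, content in files.items():
            target = safe_join(temp_dir, rel_path)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "w") as f:
                f.write(content)
    except Exception:
        # Remove the partially populated directory before re-raising.
        shutil.rmtree(temp_dir, ignore_errors=True)
        raise
    return temp_dir


temp_dir = stage_files({"src/app.py": "print('hi')\n"})
staged = os.path.join(temp_dir, "src", "app.py")

# Convert the staged path back to the path the caller supplied.
restored = os.path.relpath(staged, temp_dir)

# A traversal attempt should be rejected rather than written.
try:
    safe_join(temp_dir, "../outside.py")
    escaped = True
except ValueError:
    escaped = False

shutil.rmtree(temp_dir, ignore_errors=True)
```

Resolving both paths before comparing them means `..` segments and absolute paths are judged by where they actually point, not by how they are spelled.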