mcp-github-advanced-search
Allows advanced GitHub code search with filtering and content retrieval, enabling LLMs to search repositories by keywords, file names, and retrieve file contents.
MCP Server for GitHub Advanced Search (G.A.S.)
A powerful Model Context Protocol (MCP) server that enables Large Language Models to perform advanced GitHub code searches with intelligent filtering and content retrieval capabilities, optimized for DeepSeek integration.
🔍 Overview
The GitHub Advanced Search (G.A.S.) MCP server provides LLMs with sophisticated GitHub search capabilities that go beyond standard API limitations. Using web automation with Playwright, it enables deep code discovery, pattern analysis, and content retrieval across the entire GitHub ecosystem. This version includes enhanced support for DeepSeek models, providing tailored search results and structured JSON output.
Demo
Tested using VS Code + Cline + openrouter:deepseek/deepseek-r1-0528:free
Example 1
# step1: init gas
gas_entrypoint
# step2: feed model
gas_search_code
file_name: clinerules
# step3: make your wish
You now have better knowledge of `clinerules`.
Please keep the current file format and depth level,
and enhance the `<file-path-to-clinerules>`.
Key Features
🔍 Advanced GitHub Search: Search by keywords, file names, and complex filters
📁 Content Retrieval: Automatically fetch and return file contents
🤖 LLM Integration: Seamless integration with Claude, GPT, and other MCP-compatible LLMs
🔄 Pagination Support: Handle large result sets with intelligent pagination
🌐 Web Automation: Uses Playwright for robust GitHub interaction
📊 Structured Results: Returns organized JSON data with repository links, file links, and content
⚡ High Performance: Async operations with concurrent file downloads
🔐 Authentication Support: Works with GitHub login for private repositories
🤖 DeepSeek Integration: Optimized for use with DeepSeek models, providing tailored search results and structured JSON output
🏗️ Architecture
graph TB
subgraph "MCP Client (LLM)"
A[Claude/GPT/Other LLM]
end
subgraph "MCP Server (G.A.S.)"
B[MCP Server]
C[Search Engine]
D[Playwright Browser]
E[Content Fetcher]
F[Result Processor]
end
subgraph "GitHub"
G[GitHub Search]
H[Repository Files]
I[Raw Content]
end
A -->|MCP Protocol| B
B --> C
C --> D
D -->|Web Automation| G
G -->|Search Results| D
D --> E
E -->|HTTP Requests| I
I -->|File Content| E
E --> F
F -->|Structured Data| B
B -->|JSON Response| A
classDef client fill:#e1f5fe,stroke:#01579b,color:#01579b
classDef server fill:#e8f5e9,stroke:#2e7d32,color:#1b5e20
classDef github fill:#f3e5f5,stroke:#4a148c,color:#4a148c
class A client
class B,C,D,E,F server
class G,H,I github
🔄 Search Workflow
sequenceDiagram
participant LLM as LLM Client
participant MCP as MCP Server
participant PW as Playwright Browser
participant GH as GitHub Search
participant API as GitHub Raw API
LLM->>MCP: gas_search_code(keyword, file_name)
MCP->>PW: Launch browser session
PW->>GH: Navigate to search URL
GH-->>PW: Search results page
PW->>PW: Extract repository & file links
loop For each page
PW->>GH: Navigate to page N
GH-->>PW: Results for page N
PW->>PW: Extract links from page
end
MCP->>API: Fetch file contents (async)
API-->>MCP: Raw file content
MCP->>MCP: Structure response data
MCP-->>LLM: JSON with repositories, files & content
alt More results available
LLM->>MCP: get_remaining_result(start_id)
MCP-->>LLM: Next batch of results
end
🚀 Quick Start
Prerequisites
Python 3.10 or higher
Node.js (for Playwright browser automation)
GitHub account (recommended for optimal functionality)
Installation
Install the package:
pip install mcp-server-git-gas
Install Playwright browsers:
playwright install chromium
💀 Not tested: Configure your MCP client (e.g. Claude Desktop):
Add to your claude_desktop_config.json:
# not tested !!!
{
  "mcpServers": {
    "github-advanced-search": {
      "command": "mcp-server-git-gas",
      "args": []
    }
  }
}
First Search
Once configured, you can start searching GitHub through your LLM:
Search GitHub for Python files containing "async def" functions
The LLM will automatically use the G.A.S. tools to perform the search and return structured results.
📦 Installation Options from Source
step1 (clone source)
$ cd ~
$ git clone --depth=1 https://github.com/louiscklaw/mcp-github-advanced-search ~/mcp/mcp-git-gas
step2 (install remaining dependencies, playwright)
# Install Playwright browsers
$ playwright install chrome
$ playwright install --with-deps
step3 (seed chrome user credentials)
# this will create the user_data_dir for chromium
# log in to Google or any other service you want
$ cd ~/mcp/mcp-git-gas
$ ./seedChromeUserDataDir.sh
⚙️ Configuration
VS Code with MCP Extension
{
"mcp": {
"servers": {
"git-gas": {
"autoApprove": [
"get_remaining_result",
"gas_readme",
"gas_search_code"
],
"disabled": false,
"timeout": 300,
"type": "stdio",
"command": "uv",
"args": [
"--directory",
"<USER_HOME_DIR>/mcp/mcp-git-gas/src/mcp_server_git_gas",
"run",
"mcp-server-git-gas"
]
}
}
}
}
🛠️ Available Tools
gas_entrypoint
Initialize and get information about the GitHub Advanced Search server.
Parameters: None
Returns: Server information and usage instructions with workflow diagram.
graph TD
a((start))
d((end))
b("search code with filter (gas_search_code)")
c("return search result")
c1("is the result finished ?")
c2("use get_remaining_result to list remaining result")
a --> b --> c --> c1 -- Yes --> d
c1 -- No --> c2
c2 --> c1
gas_search_code
Search GitHub repositories with advanced filters.
Parameters:
keyword (string, optional): Search keyword (single word recommended)
file_name (string, optional): Specific filename to search for (e.g., ".clinerules", "README.md")
Returns: Array of search results with:
[
{
"REPOSITORY_LINK": "https://github.com/owner/repo",
"FILE_LINK": "https://github.com/owner/repo/blob/main/file.py",
"RAW_UESR_CONTENT_LINK": "https://raw.githubusercontent.com/owner/repo/main/file.py",
"FILE_CONTENT": "actual file content..."
}
]
get_remaining_result
Retrieve additional results from a previous search (pagination).
Parameters:
start_id (integer): Starting index for the next batch of results
Returns: Next batch of search results with the same structure as gas_search_code.
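The pagination contract between gas_search_code and get_remaining_result can be sketched as a client-side loop. This is only an illustration, not the server's code: `call_tool` is a hypothetical stand-in for however your MCP client invokes a tool, and two behaviours are assumed rather than documented: that `start_id` equals the number of results already received, and that an empty batch signals the end.

```python
from typing import Callable, Dict, List

Result = Dict[str, str]
# Hypothetical signature: (tool_name, arguments) -> list of result dicts.
ToolCaller = Callable[[str, dict], List[Result]]


def collect_all_results(call_tool: ToolCaller, keyword: str, file_name: str,
                        max_batches: int = 10) -> List[Result]:
    """Run gas_search_code once, then page with get_remaining_result."""
    results = list(call_tool("gas_search_code",
                             {"keyword": keyword, "file_name": file_name}))
    for _ in range(max_batches):
        # Assumption: start_id is the count of results received so far.
        batch = call_tool("get_remaining_result", {"start_id": len(results)})
        if not batch:  # Assumption: an empty batch means nothing is left.
            break
        results.extend(batch)
    return results


def make_fake_caller(total: int, batch_size: int) -> ToolCaller:
    """In-memory stand-in for an MCP client, for demonstration only."""
    data = [{"FILE_LINK": f"https://github.com/owner/repo/blob/main/f{i}.py"}
            for i in range(total)]

    def call_tool(name: str, args: dict) -> List[Result]:
        start = 0 if name == "gas_search_code" else args["start_id"]
        return data[start:start + batch_size]

    return call_tool
```

The `max_batches` cap is a defensive choice so that a server which never returns an empty batch cannot trap the client in an endless loop.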
💡 Usage Examples
step1
call `gas_entrypoint` to initialize yourself
step2
Hi,
please use `gas_search_code` with below json
{
"keyword": "mcp mermaid",
"file_name": "README.md"
}
and understand the content returned. I will send you the task afterwards.
step3
I am working on a Python project.
Please take a look at the source code of the project.
With the help of the files in the former results,
please help to update the README file.
🔧 Configuration & Environment
Browser Configuration
The server uses Playwright with a persistent browser context for:
Session management
Authentication state preservation
Improved performance
Browser data is stored in: ~/mcp/mcp-git-gas/_user_data_dir
🏃‍♂️ Development
Local Development Setup
# Clone the repository
git clone <repository-url>
cd mcp-server-git-gas
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -e .
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
Project Structure
src/mcp_server_git_gas/
├── __init__.py # CLI entry point
├── server.py # Main MCP server implementation
├── CONST.py # Configuration constants
├── fetch_data.py # Async HTTP client
├── fetchFileContent.py # File content retrieval
├── convertFileLinkToRaw... # URL conversion utilities
├── url_util.py # URL building helpers
└── git_dump_screen.py # Debug utilities
Key Components
MCP Server: Implements the Model Context Protocol interface
Search Engine: Handles GitHub search logic and pagination
Content Fetcher: Retrieves file contents asynchronously
Browser Automation: Playwright-based GitHub interaction
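In the tool results, the FILE_LINK and RAW_UESR_CONTENT_LINK fields differ only by host and the `/blob/` path segment. A minimal sketch of that conversion follows; `blob_to_raw` is a hypothetical name used for illustration, not the project's own helper, and the URL layout is inferred from the sample output rather than from the source code.

```python
from urllib.parse import urlsplit


def blob_to_raw(file_link: str) -> str:
    """Convert a github.com blob URL to its raw.githubusercontent.com form."""
    parts = urlsplit(file_link)
    if parts.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {file_link}")
    # Expected path layout: /<owner>/<repo>/blob/<ref>/<path...>
    segments = parts.path.lstrip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a blob URL: {file_link}")
    owner, repo, _blob, *rest = segments
    # Dropping the "blob" segment keeps the ref and file path unchanged.
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])
```

Rejecting anything that is not a blob URL keeps repository links and search pages from being passed to the content fetcher by mistake.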
Testing with MCP Inspector
# Test the server with MCP inspector
npx @modelcontextprotocol/inspector uvx mcp-server-git-gas
Docker Development
# Build development image
docker build -t mcp/git-gas:dev .
# Run with volume mount for development
docker run --rm -i \
-v $(pwd):/app \
mcp/git-gas:dev
🔍 How It Works
Search Initiation: LLM calls gas_search_code with search parameters
Query Building: Server constructs GitHub search URL with filters
Web Automation: Playwright navigates GitHub search pages
Result Extraction: JavaScript execution extracts repository and file links
Content Retrieval: Parallel HTTP requests fetch file contents
Response Formatting: Results structured as JSON for LLM consumption
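The Query Building step could look roughly like the sketch below. It is an assumption about how a keyword and file name might be combined into a GitHub code search URL, not the server's actual logic: `build_search_url` is a hypothetical name, and the use of the `path:` qualifier and the `p` page parameter is a guess at what the server sends.

```python
from typing import Optional
from urllib.parse import urlencode


def build_search_url(keyword: Optional[str] = None,
                     file_name: Optional[str] = None,
                     page: int = 1) -> str:
    """Build a GitHub code search URL from the two tool parameters."""
    terms = []
    if keyword:
        terms.append(keyword)
    if file_name:
        # Assumption: the file name filter maps to GitHub's path: qualifier.
        terms.append(f"path:{file_name}")
    if not terms:
        raise ValueError("provide keyword, file_name, or both")
    # urlencode escapes spaces and ":" so the query is safe to navigate to.
    query = urlencode({"q": " ".join(terms), "type": "code", "p": page})
    return f"https://github.com/search?{query}"
```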
🐛 Troubleshooting
Common Issues
"Not logged in" errors
Solution: run seedChromeUserDataDir.sh to start a browser and perform login
No results found
Check search keywords for typos
Try broader search criteria
Verify GitHub is accessible
Browser launch failures
Run: playwright install chromium
Check system requirements for Playwright
Rate limiting
GitHub may rate limit requests
The server includes delays and retry logic
Consider using authenticated sessions for higher limits
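The README does not say how the server's delays and retry logic are implemented. As an illustration of the general idea, here is a minimal exponential backoff sketch; `retry_with_backoff` and its parameters are hypothetical, and the doubling schedule is an assumption rather than the server's actual policy.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(operation: Callable[[], T], attempts: int = 4,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation, waiting base_delay * 2**n between failed attempts."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter makes the schedule easy to test without real waiting.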
Debug Mode
Debug screenshots are saved to: ~/mcp_github_advanced_search/debug.png
📊 Performance
Search Speed: ~2-5 seconds per search page
Concurrent Requests: Up to 10 parallel file downloads
Result Limits: 20 results per search (configurable)
Pagination: Supports up to 2 pages (100+ results)
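A cap such as "up to 10 parallel file downloads" is commonly enforced with a semaphore. The sketch below shows that pattern under stated assumptions: `fetch_all` is a hypothetical name, the `fetch` coroutine is supplied by the caller (any async HTTP client would do), and nothing here is taken from the project's source.

```python
import asyncio
from typing import Awaitable, Callable, List


async def fetch_all(urls: List[str],
                    fetch: Callable[[str], Awaitable[str]],
                    limit: int = 10) -> List[str]:
    """Fetch every URL, running at most `limit` fetches at the same time."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url: str) -> str:
        # Each task waits here until one of the `limit` slots is free.
        async with semaphore:
            return await fetch(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))
```

Because `asyncio.gather` preserves input order, each returned body can be matched back to its RAW_UESR_CONTENT_LINK by index.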
🚨 Important Notes
GitHub Authentication: Login to GitHub in the browser for optimal results
Rate Limiting: Respects GitHub's usage policies
Browser Requirements: Requires Chromium browser (installed via Playwright)
Network Dependencies: Requires internet connection for GitHub access
🤝 Contributing
We welcome contributions! Please see our Contributing Guidelines for details.
Development Workflow
Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Make your changes
Add tests for new functionality
Run the test suite
Submit a pull request
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
Built on the Model Context Protocol by Anthropic
Uses Playwright for browser automation
Inspired by the need for advanced GitHub search capabilities in LLM workflows
Thanks to the MCP community for feedback and contributions
📞 Support
Issues: GitHub Issues
Discussions: GitHub Discussions
Documentation: Wiki
Note: This server requires a GitHub account for optimal functionality. Some features may be limited when used without authentication.