Google Scholar MCP Server

# 🔬 Google Scholar MCP Server

A Model Context Protocol (MCP) server that provides access to Google Scholar for academic research through web scraping. This server enables you to search for papers, find author publications, discover recent research, and identify highly cited works.

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)

## ✨ Features

- **🔍 Paper Search**: Search Google Scholar for academic papers with flexible filtering
- **👨‍🔬 Author Research**: Find papers by specific authors
- **📅 Recent Papers**: Discover recent publications in any field
- **🏆 Highly Cited Papers**: Find influential papers with citation filtering
- **⏱️ Rate Limiting**: Respectful scraping with built-in delays
- **🛡️ Error Handling**: Robust error handling and logging
- **🌐 Local Web Interface**: Optional Flask web interface for testing
- **🧠 Smart Query Processing**: Natural language query processing with AI integration

## 🚀 Quick Start

### Prerequisites

- Python 3.8 or higher
- pip (Python package manager)

### Installation

1. **Clone the repository**

   ```bash
   git clone https://github.com/yourusername/google-scholar-mcp.git
   cd google-scholar-mcp
   ```

2. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```

3. **Optional: Set up environment variables**

   ```bash
   cp env.example .env
   # Edit .env with your preferred settings
   ```

### Running the MCP Server

Run the MCP server for use with MCP clients:

```bash
python main.py
```

### Testing with Local Web Interface

For testing and development, you can run the local web interface:

```bash
python local_server.py
```

Then open your browser to `http://localhost:5000`.

## 🔧 Configuration

The server can be configured through environment variables. Copy `env.example` to `.env` and modify as needed:

```bash
# Request delay between Google Scholar requests (seconds)
REQUEST_DELAY=5

# Maximum results per request
MAX_RESULTS_PER_REQUEST=20

# HTTP timeout (seconds)
TIMEOUT=15
```

## Available Tools

### 1. search_papers

Search for academic papers on Google Scholar.

**Parameters:**

- `query` (required): Search query for papers
- `num_results` (optional): Number of results to return (1-20, default: 10)
- `start_year` (optional): Earliest publication year to include
- `end_year` (optional): Latest publication year to include

**Example:**

```json
{
  "query": "machine learning neural networks",
  "num_results": 15,
  "start_year": 2020,
  "end_year": 2024
}
```

### 2. get_author_papers

Search for papers by a specific author.

**Parameters:**

- `author_name` (required): Name of the author to search for
- `num_results` (optional): Number of results to return (default: 10)

**Example:**

```json
{
  "author_name": "Geoffrey Hinton",
  "num_results": 20
}
```

### 3. search_recent_papers

Search for recent papers in a specific field.

**Parameters:**

- `field` (required): Research field or topic
- `years_back` (optional): How many years back to search (1-10, default: 2)
- `num_results` (optional): Number of results to return (default: 10)

**Example:**

```json
{
  "field": "quantum computing",
  "years_back": 3,
  "num_results": 15
}
```

### 4. get_highly_cited_papers

Search for highly cited papers in a topic.
**Parameters:**

- `topic` (required): Research topic or field
- `min_citations` (optional): Minimum number of citations (default: 100)
- `num_results` (optional): Number of results to return (default: 10)

**Example:**

```json
{
  "topic": "transformer neural networks",
  "min_citations": 500,
  "num_results": 10
}
```

## Response Format

Each tool returns a JSON response with paper information including:

- `title`: Paper title
- `authors`: Author names
- `url`: Link to the paper
- `year`: Publication year
- `snippet`: Paper abstract/description snippet
- `cited_by`: Number of citations (when available)
- `pdf_url`: Direct PDF link (when available)
- `publication_info`: Journal/conference information

## Rate Limiting and Ethics

This server implements respectful scraping practices:

- Delays between requests (2 seconds, or the value set in `REQUEST_DELAY`)
- Proper User-Agent headers
- Error handling for rate limits
- Designed for research and educational purposes

## 🔍 MCP Client Integration

### Claude Desktop

Add this to your Claude Desktop configuration (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):

```json
{
  "mcpServers": {
    "google-scholar": {
      "command": "python",
      "args": ["/path/to/google-scholar-mcp/main.py"],
      "cwd": "/path/to/google-scholar-mcp"
    }
  }
}
```

### Other MCP Clients

The server follows the standard MCP protocol and should work with any MCP-compatible client.

## 🧠 Smart Query Processing

The server includes query processing that can interpret natural language requests:

```python
# Example natural language queries:
"Find recent computer vision papers from CVPR 2023"
"Show me highly cited papers by Geoffrey Hinton"
"What are the latest developments in quantum computing?"
```

## 📊 Response Format

All tools return structured JSON with paper information:

```json
{
  "title": "Paper Title",
  "authors": "Author Names",
  "url": "Link to paper",
  "year": 2023,
  "snippet": "Abstract excerpt...",
  "cited_by": 150,
  "pdf_url": "Direct PDF link",
  "publication_info": "Journal/Conference"
}
```

## ⚖️ Legal and Ethical Considerations

- 🎓 **Educational Use**: This tool is intended for research and educational purposes
- 📜 **Terms of Service**: Respect Google Scholar's terms of service
- 🤝 **Responsible Use**: Use responsibly and avoid excessive requests
- 🔌 **Official APIs**: Consider using official APIs when available
- 📚 **Copyright**: Be mindful of copyright and fair use policies

## 🔧 Troubleshooting

### Common Issues

1. **Rate Limiting**: If you get blocked, wait and reduce request frequency
2. **Network Errors**: Check your internet connection
3. **Parsing Errors**: Google Scholar may change its HTML structure, which can break result parsing
4. **Import Errors**: Make sure all dependencies are installed

### Debug Mode

Enable debug logging by setting `DEBUG=true` in your `.env` file.

### Logging

The server includes detailed logging. Check the console output for error messages and debugging information.

## 📦 Dependencies

- `mcp`: Model Context Protocol library
- `requests`: HTTP library for web scraping
- `beautifulsoup4`: HTML parsing
- `lxml`: XML/HTML parser
- `urllib3`: HTTP client
- `flask`: Web interface (optional)

## 🤝 Contributing

Contributions are welcome! Please ensure:

1. **Respectful scraping practices**
2. **Error handling for edge cases**
3. **Clear documentation**
4. **Testing with various queries**
5. **Adherence to the existing code style**

### Development Setup

```bash
# Clone the repository
git clone https://github.com/yourusername/google-scholar-mcp.git
cd google-scholar-mcp

# Install dependencies
pip install -r requirements.txt

# Run tests
python test_server.py
python test_query_processor.py

# Run local development server
python local_server.py
```

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- Built on the [Model Context Protocol](https://github.com/anthropics/mcp) by Anthropic
- Inspired by the need for accessible academic research tools
- Thanks to the open-source community for the excellent libraries used

## ⚠️ Disclaimer

This tool is for educational and research purposes. Please respect Google Scholar's terms of service and use responsibly. The authors are not responsible for any misuse of this tool.
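
As an illustration of the Configuration section above, the following is a minimal sketch of how the three documented environment variables might be read and applied. The names `ScholarSettings`, `load_settings`, and `clamp_num_results` are hypothetical and are not taken from this repository. The fallback values mirror the sample `.env` shown earlier, and the sketch assumes the `.env` file has already been loaded into the process environment by some other means.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ScholarSettings:
    """Hypothetical settings container; the real server may differ."""
    request_delay: float          # seconds between Google Scholar requests
    max_results_per_request: int  # upper bound on results per request
    timeout: float                # HTTP timeout in seconds


def load_settings(env=None):
    """Read the documented variables, falling back to the sample values."""
    env = os.environ if env is None else env
    return ScholarSettings(
        request_delay=float(env.get("REQUEST_DELAY", "5")),
        max_results_per_request=int(env.get("MAX_RESULTS_PER_REQUEST", "20")),
        timeout=float(env.get("TIMEOUT", "15")),
    )


def clamp_num_results(requested, settings):
    """Keep a requested result count within 1..max_results_per_request."""
    return max(1, min(int(requested), settings.max_results_per_request))


if __name__ == "__main__":
    # Pass a plain dict instead of os.environ to try different values.
    settings = load_settings({"REQUEST_DELAY": "2"})
    print(settings)
    print(clamp_num_results(50, settings))
```

Passing a mapping to `load_settings` keeps the sketch easy to test without touching the real environment. Whether the server clamps `num_results` in this way is an assumption based on the documented 1-20 range for `search_papers`.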
