bc_search_google_scholar_publications
Retrieve publication results from Google Scholar with advanced search options, including author-specific queries, while mitigating IP blocking through proxy use.
Instructions
Search for publications on Google Scholar.
Supports advanced search operators including author search using 'author:"Name"' syntax.
Examples:
'machine learning' - General topic search
'author:"John Smith"' - Publications by specific author
'author:"John Smith" neural networks' - Author's work on specific topic
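The query forms above can also be assembled programmatically. The following is a minimal sketch; `build_scholar_query` is a hypothetical helper introduced here for illustration and is not part of the tool.

```python
from typing import Optional


def build_scholar_query(topic: str = "", author: Optional[str] = None) -> str:
    """Combine an optional author filter and a topic into one query string.

    Hypothetical helper for illustration; it is not part of the tool.
    """
    parts = []
    if author:
        # Drop embedded double quotes so the author:"Name" operator stays well-formed.
        parts.append('author:"{}"'.format(author.replace('"', "").strip()))
    if topic.strip():
        parts.append(topic.strip())
    if not parts:
        raise ValueError("Provide a topic, an author, or both.")
    return " ".join(parts)


print(build_scholar_query("machine learning"))
print(build_scholar_query(author="John Smith"))
print(build_scholar_query("neural networks", author="John Smith"))
```

The three calls reproduce the three example queries listed above.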
WARNING: Google Scholar may block requests and IP addresses for excessive queries. Publication searches are particularly prone to triggering anti-bot measures. This tool automatically uses free proxies to mitigate blocking, but use responsibly.
For academic research, consider using alternative databases like PubMed/EuropePMC when possible to reduce load on Google Scholar.
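One way to act on the "use responsibly" advice is to space out calls on the client side. The sketch below is an assumption about how a caller might do that, not something the tool does itself. `MinIntervalThrottle` is a hypothetical class, and the 5-second interval is an arbitrary illustrative value rather than a documented Google Scholar limit.

```python
import time
from typing import Callable, Optional


class MinIntervalThrottle:
    """Ensure at least `min_interval` seconds pass between consecutive calls."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous permitted call

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay


# Demonstration with a fake clock so the example finishes instantly.
fake_now = [0.0]
throttle = MinIntervalThrottle(
    min_interval=5.0,  # illustrative value only
    clock=lambda: fake_now[0],
    sleep=lambda seconds: fake_now.__setitem__(0, fake_now[0] + seconds),
)
first = throttle.wait()   # no previous call, so no delay
second = throttle.wait()  # issued immediately afterwards, so it waits the full interval
print(first, second)
```

In real use, the default `clock` and `sleep` arguments would be kept, and `throttle.wait()` would be called before each search request.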
Args:
  query (str): Search query for publications. Use 'author:"Name"' to search by author.
  max_results (int): Maximum number of publications to return (default: 10, max: 50).
  use_proxy (bool): Whether to use free proxies to avoid rate limiting (default: True).
Returns:
  dict: On success, a dict with the keys query, total_found, and publications. On failure, a dict with the keys error and note.
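Because the tool returns either a results dict or an error dict, callers need to branch on which shape they received. The sketch below assumes the key names used by the handler shown under Implementation Reference (`error`, `note`, `publications`, `title`, `pub_year`, `num_citations`). `summarize_result` is a hypothetical helper, and the sample data is hand-written rather than real search output.

```python
from typing import Any, Dict, List


def summarize_result(result: Dict[str, Any]) -> List[str]:
    """Turn a tool result into display lines, or raise if it is the error shape."""
    if "error" in result:
        # The error dict also carries a human-readable "note" suggesting alternatives.
        raise RuntimeError(f"{result['error']} ({result.get('note', 'no note')})")
    lines = []
    for pub in result.get("publications", []):
        year = pub.get("pub_year") or "n.d."
        citations = pub.get("num_citations", 0)
        lines.append(f"{pub.get('title', '')} ({year}), cited {citations} times")
    return lines


# Hand-written sample data, not real search output.
sample = {
    "query": "machine learning",
    "total_found": 1,
    "publications": [{"title": "An Example Paper", "pub_year": "2020", "num_citations": 3}],
}
print(summarize_result(sample))
```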
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| max_results | No | Maximum number of publications to return (1-50) | 10 |
| query | Yes | Search query for publications (e.g., 'machine learning' or 'author:"John Smith" deep learning') | |
| use_proxy | No | Whether to use free proxies to avoid rate limiting | True |
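A client can check its arguments against this schema before sending a request. The sketch below is a stand-alone approximation of the constraints in the table; `build_arguments` is a hypothetical helper, and rejecting an empty query is an assumption added here, since the schema only marks the field as required.

```python
from typing import Any, Dict


def build_arguments(query: str, max_results: int = 10, use_proxy: bool = True) -> Dict[str, Any]:
    """Validate inputs against the documented schema and return the argument dict."""
    if not query.strip():
        # Assumption: an empty query is treated as missing.
        raise ValueError("query is required and must not be empty")
    if not 1 <= max_results <= 50:
        raise ValueError("max_results must be between 1 and 50")
    return {"query": query, "max_results": max_results, "use_proxy": use_proxy}


print(build_arguments('author:"John Smith" deep learning', max_results=5))
```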
Implementation Reference
- The handler function search_google_scholar_publications decorated with @core_mcp.tool(). Implements the logic to search Google Scholar publications using the scholarly library, supports proxy to avoid blocking, parses results into structured dict with title, authors, venue, year, abstract, URLs, citations, and handles errors with informative messages.

      @core_mcp.tool()
      def search_google_scholar_publications(
          query: Annotated[
              str,
              Field(description="Search query (e.g., 'machine learning' or 'author:\"John Smith\" deep learning')"),
          ],
          max_results: Annotated[int, Field(description="Maximum number of publications to return (1-50)", ge=1, le=50)] = 10,
          use_proxy: Annotated[bool, Field(description="Use free proxies to avoid rate limiting")] = True,
      ) -> Dict[str, Any]:
          """Search Google Scholar for publications with support for author search using 'author:"Name"' syntax.

          WARNING: Use responsibly, may block excessive queries.

          Returns:
              dict: Publications list with title, authors, venue, year, citations, abstract, bib entry or error message.
          """
          try:
              # Set up proxy if requested
              if use_proxy:
                  try:
                      pg = ProxyGenerator()
                      pg.FreeProxies()
                      scholarly.use_proxy(pg)
                      logger.info("Proxy configured for Google Scholar requests")
                  except Exception as e:
                      logger.warning(f"Failed to set up proxy: {e}")
                      # Continue without proxy

              # Search for publications
              search_query = scholarly.search_pubs(query)
              publications = []

              for count, pub in enumerate(search_query):
                  if count >= max_results:
                      break

                  # Extract publication information
                  bib = pub.get("bib", {})
                  pub_info = {
                      "title": bib.get("title", ""),
                      "author": bib.get("author", ""),
                      "venue": bib.get("venue", ""),
                      "pub_year": bib.get("pub_year", ""),
                      "abstract": bib.get("abstract", ""),
                      "pub_url": bib.get("pub_url", ""),
                      "eprint_url": pub.get("eprint_url", ""),
                      "num_citations": pub.get("num_citations", 0),
                      "citedby_url": pub.get("citedby_url", ""),
                      "url_scholarbib": pub.get("url_scholarbib", ""),
                  }
                  publications.append(pub_info)

              return {"query": query, "total_found": len(publications), "publications": publications}
          except Exception as e:
              logger.error(f"Error searching Google Scholar publications: {e}")
              return {
                  "error": f"Failed to search Google Scholar publications: {e!s}",
                  "note": "Google Scholar may be blocking requests. Publication searches are particularly risky. Try again later or use alternative databases like PubMed/EuropePMC.",
              }
- src/biocontext_kb/core/scholarly/__init__.py:9-13 (registration): Imports the search_google_scholar_publications function from ._search_publications and exposes it via __all__, enabling module-level access.

      from ._search_publications import search_google_scholar_publications

      __all__ = [
          "search_google_scholar_publications",
      ]
- src/biocontext_kb/core/_server.py:1-7 (registration): Creates the core_mcp FastMCP server instance named 'BC' to which tools like search_google_scholar_publications are registered via @core_mcp.tool() decorators.

      from fastmcp import FastMCP

      core_mcp = FastMCP(  # type: ignore
          "BC",
          instructions="Provides access to biomedical knowledge bases.",
      )
- src/biocontext_kb/app.py:35-40 (registration): Imports the core_mcp server (containing the tool) into the main BioContextAI mcp_app under the slugified prefix 'bc', finalizing the tool name as 'bc_search_google_scholar_publications'.

      for mcp in [core_mcp, *(await get_openapi_mcps())]:
          await mcp_app.import_server(
              mcp,
              slugify(mcp.name),
          )
      logger.info("MCP server setup complete.")
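
The naming step described in the last entry can be illustrated in isolation. The sketch below makes two assumptions. `simple_slugify` is only a stand-in for the `slugify` function used in app.py, whose implementation is not shown here. The underscore separator is inferred from the final tool name given above rather than taken from the server code.

```python
import re


def simple_slugify(name: str) -> str:
    """Lowercase and replace runs of non-alphanumerics with hyphens (stand-in only)."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def prefixed_tool_name(server_name: str, tool_name: str) -> str:
    """Join the slugified server name and the tool name with an underscore."""
    return f"{simple_slugify(server_name)}_{tool_name}"


print(prefixed_tool_name("BC", "search_google_scholar_publications"))
# prints: bc_search_google_scholar_publications
```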