biocontext-ai

BioContextAI Knowledgebase MCP

Official

bc_search_google_scholar_publications

Search Google Scholar for academic publications using queries or author names to retrieve titles, authors, abstracts, and citation data for research purposes.

Instructions

Search Google Scholar for publications with support for author search using 'author:"Name"' syntax. WARNING: Use responsibly, may block excessive queries.

Returns: dict: Publications list with title, authors, venue, year, citations, abstract, bib entry or error message.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| query | Yes | Search query (e.g., 'machine learning' or 'author:"John Smith" deep learning') | |
| max_results | No | Maximum number of publications to return (1-50) | 10 |
| use_proxy | No | Use free proxies to avoid rate limiting | True |
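
As a rough illustration of how the parameters above fit together, the sketch below builds a JSON-RPC 2.0 `tools/call` request of the kind an MCP client sends. The helper name `build_tool_call` and the `request_id` parameter are hypothetical; the tool name, argument names, and the 1-50 range come from this page. How the request is transported to the server is left out.

```python
import json
from typing import Any, Dict


def build_tool_call(
    query: str,
    max_results: int = 10,
    use_proxy: bool = True,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request for this tool (illustrative helper)."""
    # Mirror the documented constraint on max_results (1-50) before sending.
    if not 1 <= max_results <= 50:
        raise ValueError("max_results must be between 1 and 50")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "bc_search_google_scholar_publications",
            "arguments": {
                "query": query,
                "max_results": max_results,
                "use_proxy": use_proxy,
            },
        },
    }


if __name__ == "__main__":
    request = build_tool_call('author:"John Smith" deep learning', max_results=5, use_proxy=False)
    print(json.dumps(request, indent=2))
```

Checking the range on the client side only avoids a round trip; the server-side schema still enforces it.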

Output Schema

No fields are listed for the output schema.

Implementation Reference

  • Core handler function for the 'search_google_scholar_publications' tool (likely the target 'bc_search_google_scholar_publications'), decorated with @core_mcp.tool(). Implements search logic using scholarly library, proxy support, and result parsing.
    @core_mcp.tool()
    def search_google_scholar_publications(
        query: Annotated[
            str,
            Field(description="Search query (e.g., 'machine learning' or 'author:\"John Smith\" deep learning')"),
        ],
        max_results: Annotated[int, Field(description="Maximum number of publications to return (1-50)", ge=1, le=50)] = 10,
        use_proxy: Annotated[bool, Field(description="Use free proxies to avoid rate limiting")] = True,
    ) -> Dict[str, Any]:
        """Search Google Scholar for publications with support for author search using 'author:"Name"' syntax. WARNING: Use responsibly, may block excessive queries.
    
        Returns:
            dict: Publications list with title, authors, venue, year, citations, abstract, bib entry or error message.
        """
        try:
            # Set up proxy if requested
            if use_proxy:
                try:
                    pg = ProxyGenerator()
                    pg.FreeProxies()
                    scholarly.use_proxy(pg)
                    logger.info("Proxy configured for Google Scholar requests")
                except Exception as e:
                    logger.warning(f"Failed to set up proxy: {e}")
                    # Continue without proxy
    
            # Search for publications
            search_query = scholarly.search_pubs(query)
    
            publications = []
    
            for count, pub in enumerate(search_query):
                if count >= max_results:
                    break
    
                # Extract publication information
                bib = pub.get("bib", {})
                pub_info = {
                    "title": bib.get("title", ""),
                    "author": bib.get("author", ""),
                    "venue": bib.get("venue", ""),
                    "pub_year": bib.get("pub_year", ""),
                    "abstract": bib.get("abstract", ""),
                    "pub_url": bib.get("pub_url", ""),
                    "eprint_url": pub.get("eprint_url", ""),
                    "num_citations": pub.get("num_citations", 0),
                    "citedby_url": pub.get("citedby_url", ""),
                    "url_scholarbib": pub.get("url_scholarbib", ""),
                }
    
                publications.append(pub_info)
    
            return {"query": query, "total_found": len(publications), "publications": publications}
    
        except Exception as e:
            logger.error(f"Error searching Google Scholar publications: {e}")
            return {
                "error": f"Failed to search Google Scholar publications: {e!s}",
                "note": "Google Scholar may be blocking requests. Publication searches are particularly risky. Try again later or use alternative databases like PubMed/EuropePMC.",
            }
  • Conditional import that registers the scholarly tools, including search_google_scholar_publications, by importing the module containing the @core_mcp.tool()-decorated function.
    if os.getenv("MCP_ENVIRONMENT") != "PRODUCTION" or os.getenv("MCP_INCLUDE_SCHOLARLY", "false").lower() == "true":
        from .scholarly import *
  • Pydantic schema definition via Annotated Fields for input parameters: query (str), max_results (int 1-50), use_proxy (bool). Output is Dict[str, Any].
    def search_google_scholar_publications(
        query: Annotated[
            str,
            Field(description="Search query (e.g., 'machine learning' or 'author:\"John Smith\" deep learning')"),
        ],
        max_results: Annotated[int, Field(description="Maximum number of publications to return (1-50)", ge=1, le=50)] = 10,
        use_proxy: Annotated[bool, Field(description="Use free proxies to avoid rate limiting")] = True,
    ) -> Dict[str, Any]:
  • Module __init__.py re-exports the tool function, facilitating its import and registration when the scholarly module is imported.
    from ._search_publications import search_google_scholar_publications
    
    __all__ = [
        "search_google_scholar_publications",
    ]
  • Definition of core_mcp FastMCP server instance with 'BC' prefix, used for tool registration via decorators.
    core_mcp = FastMCP(  # type: ignore
        "BC",
        instructions="Provides access to biomedical knowledge bases.",
    )
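
The conditional import in the list above determines whether the scholarly tools are registered at all. The sketch below restates that condition as a standalone function so the combinations are easy to check; the function name `scholarly_tools_enabled` and the injectable `env` mapping are assumptions made for illustration, while the two environment variable names and the comparison logic are taken from the snippet.

```python
import os
from typing import Mapping, Optional


def scholarly_tools_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the scholarly tools would be imported (illustrative helper)."""
    env = os.environ if env is None else env
    # Outside production the tools are always registered.
    not_production = env.get("MCP_ENVIRONMENT") != "PRODUCTION"
    # In production they require an explicit, case-insensitive opt-in.
    opted_in = env.get("MCP_INCLUDE_SCHOLARLY", "false").lower() == "true"
    return not_production or opted_in


if __name__ == "__main__":
    print(scholarly_tools_enabled({"MCP_ENVIRONMENT": "PRODUCTION"}))
    print(scholarly_tools_enabled({"MCP_ENVIRONMENT": "PRODUCTION", "MCP_INCLUDE_SCHOLARLY": "true"}))
```

In other words, a production deployment exposes this tool only when `MCP_INCLUDE_SCHOLARLY` is set to `true`.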
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses a key behavioral trait: the risk of blocking due to excessive queries, which is crucial for a search tool. However, it lacks details on other behaviors such as authentication needs, pagination, error handling beyond 'error message,' or how the proxy parameter affects performance. The description adds some value but is incomplete for behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
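
One behavior worth making concrete: per the handler shown in the Implementation Reference, a failure is reported as a dict with an "error" key rather than as a raised exception. The sketch below is one possible way for a caller to react, retrying a small number of times with exponential backoff. The wrapper `search_with_backoff`, its `attempts` and `base_delay` parameters, and the injectable `sleep` are hypothetical and not part of the server; `search` stands for whatever callable invokes the tool.

```python
import time
from typing import Any, Callable, Dict


def search_with_backoff(
    search: Callable[..., Dict[str, Any]],
    query: str,
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Call `search` until it returns a result without an "error" key (illustrative)."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    result: Dict[str, Any] = {}
    for attempt in range(attempts):
        result = search(query=query, **kwargs)
        if "error" not in result:
            return result
        # Wait longer after each failure; skip the wait after the final attempt.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    # Every attempt failed: hand back the last error dict unchanged.
    return result
```

Keeping `attempts` small is deliberate, since the tool warns that excessive queries may be blocked.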

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded: the first sentence states the core purpose, followed by a warning and return format. It avoids unnecessary fluff, but the return format could be more concise (e.g., listing fields like 'title, authors...' might be redundant if covered in output schema, though no output schema is confirmed here). Overall, it's efficient with minor room for improvement.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (search with rate limiting and proxy options), no annotations, and unknown output schema status (context signals indicate 'Has output schema: true', but it's not provided here), the description is moderately complete. It covers purpose, a key risk, and return fields, but lacks details on error conditions, performance implications of the proxy, or how results are structured. It's adequate but has clear gaps for a tool with behavioral nuances.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
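
To illustrate how results are structured, the sketch below maps the dict returned by the handler in the Implementation Reference onto a small dataclass. The field names and the top-level "publications" and "error" keys are taken from that handler; the `Publication` class and the `parse_publications` helper are hypothetical conveniences, not part of the server.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, List


@dataclass
class Publication:
    """One entry of the "publications" list (field names follow the handler)."""
    title: str = ""
    author: Any = ""  # left untyped: the handler passes through whatever scholarly provides
    venue: str = ""
    pub_year: str = ""
    abstract: str = ""
    pub_url: str = ""
    eprint_url: str = ""
    num_citations: int = 0
    citedby_url: str = ""
    url_scholarbib: str = ""


_FIELD_NAMES = {f.name for f in fields(Publication)}


def parse_publications(response: Dict[str, Any]) -> List[Publication]:
    """Convert a tool response into Publication objects, raising on an error response."""
    if "error" in response:
        raise RuntimeError(response["error"])
    return [
        # Ignore unknown keys so the parser tolerates extra fields.
        Publication(**{k: v for k, v in item.items() if k in _FIELD_NAMES})
        for item in response.get("publications", [])
    ]
```

Missing keys fall back to the same defaults the handler uses (empty strings, and 0 for `num_citations`).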

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters (query, max_results, use_proxy) thoroughly. The description adds no additional parameter semantics beyond what's in the schema (e.g., it mentions author search syntax, but this is also covered in the schema's query description). Baseline 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
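
The author-search syntax is the one non-obvious part of the `query` parameter. As a minimal sketch, the helper below composes a query string in the 'author:"Name"' form shown in the schema; the function name `build_scholar_query` and its handling of stray quotes are assumptions for illustration, not something the server provides.

```python
from typing import List, Optional


def build_scholar_query(terms: str = "", author: Optional[str] = None) -> str:
    """Compose a query string using the author:"Name" syntax (illustrative helper)."""
    parts: List[str] = []
    if author:
        # Drop embedded double quotes so the author clause stays well-formed.
        cleaned = author.replace('"', "").strip()
        if cleaned:
            parts.append(f'author:"{cleaned}"')
    if terms.strip():
        parts.append(terms.strip())
    if not parts:
        raise ValueError("provide search terms, an author, or both")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_scholar_query("deep learning", author="John Smith"))
```

Called with the values from the schema example, this reproduces the documented form `author:"John Smith" deep learning`.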

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: "Search Google Scholar for publications with support for author search using 'author:"Name"' syntax." It specifies the verb ('Search'), the resource ('Google Scholar for publications'), and a key feature (author search syntax). However, it does not explicitly differentiate the tool from its siblings, which are all biomedical research tools, though none appears to be a Google Scholar-specific search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides minimal usage guidance. It includes a WARNING about rate limiting ('Use responsibly, may block excessive queries'), which hints at when to be cautious, but offers no explicit guidance on when to use this tool versus alternatives (e.g., other search tools in the sibling list like 'bc_search_drugs_fda' or 'bc_search_studies'), nor does it mention prerequisites or ideal scenarios for Google Scholar searches.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/biocontext-ai/knowledgebase-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.