PhialsBasement

MCP Web Research Server

search_scholar

Search Google Scholar for peer-reviewed research and academic citations. Retrieve structured data including titles, authors, and citation counts.

Instructions

Searches Google Scholar for academic papers and scholarly articles. Use this tool when researching scientific topics, looking for peer-reviewed research, academic citations, or scholarly literature. Returns structured data including titles, authors, publication details, and citation counts. Ideal for academic research and evidence-based inquiries.

Input Schema

Name   | Required | Description           | Default
query  | Yes      | Academic search query | (none)
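
As an illustration of the schema in use, the sketch below checks a candidate argument object and wraps it in a request. It assumes the standard MCP `tools/call` JSON-RPC request shape; `isSearchScholarArgs` and `buildSearchScholarRequest` are hypothetical helpers, and rejecting blank queries is an added assumption that goes beyond what the schema states.

```typescript
// Sketch: validate tool arguments against the schema, then wrap them in a
// JSON-RPC "tools/call" request. Helper names are hypothetical.
interface SearchScholarArgs {
    query: string;
}

// Mirrors the schema: an object with a required string "query".
// Rejecting blank strings is an extra assumption beyond the schema.
function isSearchScholarArgs(value: unknown): value is SearchScholarArgs {
    if (typeof value !== "object" || value === null) {
        return false;
    }
    const query = (value as Record<string, unknown>).query;
    return typeof query === "string" && query.trim().length > 0;
}

// Build the request a client would send; the id is chosen by the caller.
function buildSearchScholarRequest(id: number, args: SearchScholarArgs) {
    return {
        jsonrpc: "2.0" as const,
        id,
        method: "tools/call" as const,
        params: { name: "search_scholar", arguments: args },
    };
}

// The query text here is a placeholder.
const candidate: unknown = { query: "CRISPR off-target effects" };
if (isSearchScholarArgs(candidate)) {
    console.log(JSON.stringify(buildSearchScholarRequest(1, candidate), null, 2));
} else {
    console.log("Invalid arguments: 'query' must be a non-empty string");
}
```

A guard of this kind could stand in front of the handler's `as { query: string }` cast, which by itself performs no runtime check.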

Implementation Reference

  • Tool definition/schema for search_scholar, defining its name, description, and input schema requiring a 'query' string parameter.
    {
        name: "search_scholar",
        description: "Searches Google Scholar for academic papers and scholarly articles. Use this tool when researching scientific topics, looking for peer-reviewed research, academic citations, or scholarly literature. Returns structured data including titles, authors, publication details, and citation counts. Ideal for academic research and evidence-based inquiries.",
        inputSchema: {
            type: "object",
            properties: {
                query: { type: "string", description: "Academic search query" },
            },
            required: ["query"],
        },
    },
  • index.ts:535-537 (registration)
    Registration of search_scholar via the TOOLS array, returned by the ListTools request handler.
    server.setRequestHandler(ListToolsRequestSchema, async () => ({
        tools: TOOLS  // Return list of available research tools
    }));
  • Handler implementation for search_scholar. Navigates to scholar.google.com, types the query with human-like delays, extracts academic results (title, url, authorInfo, snippet, citationCount) using Google Scholar's .gs_r.gs_or.gs_scl selectors, stores results in the session, and returns formatted JSON.
    case "search_scholar": {
        // Extract search query from request parameters
        const { query } = request.params.arguments as { query: string };
    
        try {
            // Execute search with retry mechanism
            const results = await withRetry(async () => {
                // Step 1: Navigate to Google Scholar search page
                await safePageNavigation(page, 'https://scholar.google.com');
    
                // Simulate human behavior
                await randomDelay(800, 1500);
                await simulateHumanBehavior(page);
    
                // Step 2: Find and interact with search input
                await withRetry(async () => {
                    // Wait for search input element to appear
                    await page.waitForSelector('input[name="q"]', { timeout: 5000 })
                    .catch(() => {
                        throw new Error('Scholar search input not found');
                    });
    
                    // Random delay before interacting
                    await randomDelay(300, 700);
    
                    // Find the search input element
                    const searchInput = await page.$('input[name="q"]');
    
                    // Verify search input was found
                    if (!searchInput) {
                        throw new Error('Scholar search input element not found after waiting');
                    }
    
                    // Step 3: Enter search query with human-like typing
                    await searchInput.click();
                    await randomDelay(100, 300);
                    await searchInput.click({ clickCount: 3 });  // Select all existing text
                    await randomDelay(50, 150);
                    await searchInput.press('Backspace');        // Clear selected text
                    await randomDelay(200, 400);
    
                    // Type query with random delays between characters
                    for (const char of query) {
                        await searchInput.type(char);
                        await randomDelay(50, 150);
                    }
                }, 3, 2000);  // Allow 3 retries with 2s delay
    
                // Random delay before submitting
                await randomDelay(400, 900);
    
                // Step 4: Submit search and wait for results
                await withRetry(async () => {
                    await Promise.all([
                        page.keyboard.press('Enter'),
                        page.waitForLoadState('networkidle', { timeout: 15000 }),
                    ]);
                });
    
                // Simulate human behavior after results load
                await randomDelay(500, 1000);
                await simulateHumanBehavior(page);
    
                // Step 5: Extract scholar search results
                const scholarResults = await withRetry(async () => {
                    const results = await page.evaluate(() => {
                        // Find all scholar result containers
                        const elements = document.querySelectorAll('.gs_r.gs_or.gs_scl');
                        if (!elements || elements.length === 0) {
                            throw new Error('No scholar search results found');
                        }
    
                        // Extract data from each result
                        return Array.from(elements).map((el) => {
                            try {
                                // Find required elements within result container
                                const titleEl = el.querySelector('.gs_rt');                // Title element
                                const authorEl = el.querySelector('.gs_a');                // Authors, venue, year
                                const snippetEl = el.querySelector('.gs_rs');              // Snippet/abstract
                                const citedByEl = el.querySelector('.gs_fl a:nth-child(3)'); // Cited by element
    
                                // Extract title and URL
                                let title = '';
                                let url = '';
                                if (titleEl) {
                                    const titleLink = titleEl.querySelector('a');
                                    title = titleEl.textContent?.trim() || '';
                                    url = titleLink?.getAttribute('href') || '';
                                }
    
                                // Extract author, venue, and year information
                                const authorInfo = authorEl?.textContent?.trim() || '';
    
                                // Extract snippet
                                const snippet = snippetEl?.textContent?.trim() || '';
    
                                // Extract citation count
                                let citationCount = '';
                                if (citedByEl && citedByEl.textContent?.includes('Cited by')) {
                                    citationCount = citedByEl.textContent.trim();
                                }
    
                                // Skip results missing critical data
                                if (!title) {
                                    return null;
                                }
    
                                // Return structured result data
                                return {
                                    title,                 // Paper title
                                    url,                   // Paper URL if available
                                    authorInfo,            // Authors, venue, year
                                    snippet,               // Abstract/snippet
                                    citationCount,         // Citation information
                                };
                            } catch (err) {
                                // Skip problematic results
                                return null;
                            }
                        }).filter(result => result !== null);  // Remove invalid results
                    });
    
                    // Verify we found valid results
                    if (!results || results.length === 0) {
                        throw new Error('No valid scholar search results found');
                    }
    
                    // Return compiled list of results
                    return results;
                });
    
                // Step 6: Store results in session
                scholarResults.forEach((result) => {
                    addResult({
                        url: result.url || 'https://scholar.google.com',
                        title: result.title,
                        content: `${result.authorInfo}\n\n${result.snippet}\n\n${result.citationCount}`,
                        timestamp: new Date().toISOString(),
                    });
                });
    
                // Return compiled list of results
                return scholarResults;
            });
    
            // Step 7: Return formatted results
            return {
                content: [{
                    type: "text",
                    text: JSON.stringify(results, null, 2)  // Pretty-print JSON results
                }]
            };
        } catch (error) {
            // Handle and format search errors
            return {
                content: [{
                    type: "text",
                    text: `Failed to perform scholar search: ${(error as Error).message}`
                }],
                isError: true
            };
        }
    }
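
The handler relies on `withRetry` and `randomDelay`, whose bodies are not shown on this page. The following is a minimal sketch of how such helpers could work, inferred only from the call sites above (`withRetry(fn, 3, 2000)` and `randomDelay(min, max)`). The default values, the fixed pause between attempts, and the reading of "3 retries" as three attempts in total are all assumptions; the actual implementations in index.ts may differ.

```typescript
// Hypothetical helpers; signatures are inferred from the call sites only.

// Resolve after a random pause between minMs and maxMs (inclusive).
async function randomDelay(minMs: number, maxMs: number): Promise<void> {
    const ms = Math.floor(Math.random() * (maxMs - minMs + 1)) + minMs;
    await new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Run `operation` up to `maxAttempts` times, pausing `delayMs` between
// failed attempts. The defaults (3 attempts, 1000 ms) are assumptions.
async function withRetry<T>(
    operation: () => Promise<T>,
    maxAttempts = 3,
    delayMs = 1000,
): Promise<T> {
    let lastError: unknown = new Error("withRetry: no attempts were made");
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            return await operation();
        } catch (error) {
            lastError = error;
            // Pause only if another attempt will follow.
            if (attempt < maxAttempts) {
                await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
            }
        }
    }
    // Every attempt failed: surface the most recent error to the caller.
    throw lastError;
}
```

Because the handler nests `withRetry` calls (the outer call wraps the whole search, the inner ones wrap individual steps), a persistent failure in an inner step is retried once per outer attempt, so the worst-case wait multiplies.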

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description should disclose behavioral traits. It only states that it returns structured data with certain fields, but does not mention whether the tool is read-only, has rate limits, requires authentication, or any side effects. This lack of transparency is a gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
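
One way such a gap could be narrowed is with MCP tool annotations next to a fuller description. The sketch below is an assumption about how the definition might be extended, not something the server ships. The annotation field names follow the MCP tool-annotation hints; the chosen values reflect what the handler above appears to do (it queries an external site and appends results to the session), and the `title` and the extra description sentence are invented for illustration.

```typescript
// Hypothetical annotated tool definition; the values are assumptions.
interface ToolAnnotations {
    title?: string;
    readOnlyHint?: boolean;
    destructiveHint?: boolean;
    idempotentHint?: boolean;
    openWorldHint?: boolean;
}

interface ToolDefinition {
    name: string;
    description: string;
    inputSchema: {
        type: "object";
        properties: Record<string, { type: string; description: string }>;
        required: string[];
    };
    annotations?: ToolAnnotations;
}

const searchScholarTool: ToolDefinition = {
    name: "search_scholar",
    description:
        "Searches Google Scholar for academic papers and scholarly articles. " +
        "Drives a browser against scholar.google.com and stores each result " +
        "in the current research session.",
    inputSchema: {
        type: "object",
        properties: {
            query: { type: "string", description: "Academic search query" },
        },
        required: ["query"],
    },
    annotations: {
        title: "Google Scholar search",
        readOnlyHint: false,     // assumption: it appends results to the session
        destructiveHint: false,  // assumption: additions only, nothing is overwritten
        idempotentHint: false,   // assumption: repeated calls add further entries
        openWorldHint: true,     // it interacts with an external website
    },
};

console.log(JSON.stringify(searchScholarTool.annotations, null, 2));
```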

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at four sentences and front-loads the core action. It is efficient and easy to parse, though slightly more structure (e.g., bullet points) could improve readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple search tool with one parameter and no output schema, the description covers purpose, usage context, and return structure. However, it lacks details on pagination, result limits, or sorting, which could be useful for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
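
Since no output schema is published, the shape of each result has to be read off the handler. The sketch below writes that shape down as a TypeScript interface and adds a hypothetical `parseCitationCount` helper that turns the `citationCount` string into a number. The sample records are made up for illustration and stand in for the JSON text the tool returns.

```typescript
// Result shape as built inside the handler's page.evaluate() callback.
interface ScholarResult {
    title: string;
    url: string;            // empty string when the title has no link
    authorInfo: string;     // authors, venue, and year as one string
    snippet: string;
    citationCount: string;  // e.g. "Cited by 128", or "" when absent
}

// Hypothetical helper: extract the number from a "Cited by N" string.
// Returns null when the text carries no citation count.
function parseCitationCount(text: string): number | null {
    const match = /Cited by\s+(\d+)/.exec(text);
    return match ? Number(match[1]) : null;
}

// Made-up sample standing in for the tool's JSON output.
const sampleJson = JSON.stringify([
    {
        title: "An example paper title",
        url: "https://example.org/paper",
        authorInfo: "A Author, B Author - Example Journal, 2020 - example.org",
        snippet: "An example snippet of the abstract.",
        citationCount: "Cited by 128",
    },
    {
        title: "Another example paper",
        url: "",
        authorInfo: "C Author - 2019",
        snippet: "",
        citationCount: "",
    },
]);

const parsedResults = JSON.parse(sampleJson) as ScholarResult[];
for (const result of parsedResults) {
    const count = parseCitationCount(result.citationCount);
    console.log(`${result.title}: ${count === null ? "no count" : count}`);
}
```

The handler applies no explicit limit or pagination, so the number of entries appears to be whatever the first results page contains.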

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already provides a description for the only parameter ('Academic search query'). The tool description does not add any additional meaning beyond what the schema states, so baseline 3 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's action ('Searches Google Scholar') and specifies the resource ('academic papers and scholarly articles'). It distinguishes itself from sibling tools like search_google by focusing on academic content.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives explicit guidance on when to use this tool (researching scientific topics, peer-reviewed research, etc.). It does not say when not to use it or contrast it with alternatives, but the focus on academic literature implicitly differentiates it from general web searches.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/PhialsBasement/mcp-webresearch-stealthified'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.