search_google_scholar
Search academic papers on Google Scholar by query, author, or publication year to find relevant research publications.
Instructions
Search Google Scholar for academic papers using web scraping
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Search query string | |
| maxResults | No | Maximum number of results to return (integer, 1 to 20) | 10 |
| yearLow | No | Earliest publication year | |
| yearHigh | No | Latest publication year | |
| author | No | Author name filter | |
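To make the constraints in the table concrete, here is a minimal, dependency-free sketch of how a set of arguments might be checked and defaulted before a search. The names `ScholarArgs` and `normalizeArgs` are hypothetical and not part of the project. The bounds (non-empty query, integer maxResults between 1 and 20 with a default of 10, integer years) are taken from the Zod schema listed under Implementation Reference, which is what the server itself relies on.

```typescript
// Hypothetical argument shape mirroring the Input Schema table.
interface ScholarArgs {
  query: string;
  maxResults?: number;
  yearLow?: number;
  yearHigh?: number;
  author?: string;
}

// Hypothetical helper: applies the same bounds as the documented schema
// without depending on Zod. Throws on invalid input.
function normalizeArgs(args: ScholarArgs): ScholarArgs & { maxResults: number } {
  if (typeof args.query !== 'string' || args.query.length < 1) {
    throw new Error('query must be a non-empty string');
  }

  // maxResults falls back to 10 when omitted.
  const maxResults = args.maxResults ?? 10;
  if (!Number.isInteger(maxResults) || maxResults < 1 || maxResults > 20) {
    throw new Error('maxResults must be an integer between 1 and 20');
  }

  // Year bounds are optional, but must be integers when present.
  for (const key of ['yearLow', 'yearHigh'] as const) {
    const value = args[key];
    if (value !== undefined && !Number.isInteger(value)) {
      throw new Error(`${key} must be an integer`);
    }
  }

  return { ...args, maxResults };
}

// Example arguments for a search_google_scholar call.
const exampleArgs = normalizeArgs({
  query: 'graph neural networks',
  yearLow: 2020,
  author: 'Kipf'
});
console.log(JSON.stringify(exampleArgs, null, 2));
```

Only `query` is required; every other field can be left out, in which case no year or author filter is applied and the result count defaults to 10.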
Implementation Reference
- Core implementation of the search_google_scholar tool: the search() method of the GoogleScholarSearcher class. It scrapes Google Scholar using axios and cheerio, pages through results until maxResults is reached, applies anti-detection measures (random delays, user agents), and extracts metadata such as title, authors, year, and citations.
async search(query: string, options: GoogleScholarOptions = {}): Promise<Paper[]> {
  logDebug(`Google Scholar Search: query="${query}"`);
  try {
    const papers: Paper[] = [];
    let start = 0;
    const resultsPerPage = 10;
    const maxResults = options.maxResults || 10;

    while (papers.length < maxResults) {
      // Add a random delay to avoid detection
      await this.randomDelay();

      const params = this.buildSearchParams(query, start, options);
      const response = await this.makeScholarRequest(params);

      if (response.status !== 200) {
        logDebug(`Google Scholar HTTP Error: ${response.status}`);
        break;
      }

      const $ = cheerio.load(response.data);
      const results = $('.gs_ri'); // Search result containers

      if (results.length === 0) {
        logDebug('Google Scholar: No more results found');
        break;
      }

      logDebug(`Google Scholar: Found ${results.length} results on page`);

      // Parse each result
      results.each((index, element) => {
        if (papers.length >= maxResults) return false; // Stop iterating
        const paper = this.parseScholarResult($, $(element));
        if (paper) {
          papers.push(paper);
        }
      });

      start += resultsPerPage;
    }

    logDebug(`Google Scholar Results: Found ${papers.length} papers`);
    return papers;
  } catch (error) {
    this.handleHttpError(error, 'search');
  }
}
- src/mcp/handleToolCall.ts:240-256 (handler) MCP tool call handler: a switch case that parses args, calls searchers.googlescholar.search(), converts the results to JSON dicts, and returns a formatted text response.
case 'search_google_scholar': {
  const { query, maxResults, yearLow, yearHigh, author } = args;
  const results = await searchers.googlescholar.search(query, {
    maxResults,
    yearLow,
    yearHigh,
    author
  } as any);
  return jsonTextResponse(
    `Found ${results.length} Google Scholar papers.\n\n${JSON.stringify(
      results.map((paper: Paper) => PaperFactory.toDict(paper)),
      null,
      2
    )}`
  );
}
- src/mcp/schemas.ts:117-125 (schema) Zod schema for validating input arguments to the search_google_scholar tool.
export const SearchGoogleScholarSchema = z
  .object({
    query: z.string().min(1),
    maxResults: z.number().int().min(1).max(20).optional().default(10),
    yearLow: z.number().int().optional(),
    yearHigh: z.number().int().optional(),
    author: z.string().optional()
  })
  .strip();
- src/mcp/tools.ts:270-296 (registration) MCP tool registration: definition of the tool, including name, description, and inputSchema, exported in the TOOLS array.
name: 'search_google_scholar',
description: 'Search Google Scholar for academic papers using web scraping',
inputSchema: {
  type: 'object',
  properties: {
    query: { type: 'string', description: 'Search query string' },
    maxResults: {
      type: 'number',
      minimum: 1,
      maximum: 20,
      description: 'Maximum number of results to return'
    },
    yearLow: { type: 'number', description: 'Earliest publication year' },
    yearHigh: { type: 'number', description: 'Latest publication year' },
    author: { type: 'string', description: 'Author name filter' }
  },
  required: ['query']
}
- src/mcp/searchers.ts:49-49 (helper) Instantiation of GoogleScholarSearcher as searchers.googlescholar (also aliased as scholar).
const googleScholarSearcher = new GoogleScholarSearcher();
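The search() method delegates request construction to buildSearchParams, whose body is not shown in this reference. As a rough sketch of what such a helper could look like, the following maps the tool's options onto query parameters that Google Scholar result URLs are commonly seen to use (`q`, `start`, `as_ylo`, `as_yhi`, `hl`) and expresses the author filter with the `author:` search operator. This mapping is an assumption for illustration, not the project's actual implementation; `ScholarOptions`, `buildScholarParams`, and `toQueryString` are hypothetical names.

```typescript
// Hypothetical stand-in for the options type accepted by search().
interface ScholarOptions {
  maxResults?: number;
  yearLow?: number;
  yearHigh?: number;
  author?: string;
}

// Assumed mapping from tool options to Google Scholar URL parameters.
function buildScholarParams(
  query: string,
  start: number,
  options: ScholarOptions = {}
): Record<string, string> {
  // Assumption: the author filter is folded into the query via author:"...".
  const q = options.author ? `${query} author:"${options.author}"` : query;

  const params: Record<string, string> = {
    q,
    start: String(start), // zero-based offset of the first result on the page
    hl: 'en'
  };

  // Year bounds are only sent when the caller supplied them.
  if (options.yearLow !== undefined) params['as_ylo'] = String(options.yearLow);
  if (options.yearHigh !== undefined) params['as_yhi'] = String(options.yearHigh);

  return params;
}

// Serialize the parameters into a URL query string.
function toQueryString(params: Record<string, string>): string {
  return Object.keys(params)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(params[key] ?? '')}`)
    .join('&');
}

// Parameters for the second page of results (offset 10).
const pageTwo = buildScholarParams('attention mechanisms', 10, {
  yearLow: 2017,
  author: 'Vaswani'
});
console.log(`https://scholar.google.com/scholar?${toQueryString(pageTwo)}`);
```

With a page size of 10, as in the search() loop above, calling the helper with start values 0, 10, 20 and so on walks through successive result pages.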