
Open Search MCP

by flyanima

search_biorxiv

Search biology and life sciences preprints on bioRxiv to find relevant research papers using queries, date ranges, categories, and sorting options.

Instructions

Search bioRxiv for biology and life sciences preprints

Input Schema

Name | Required | Description | Default
query | Yes | Search query for bioRxiv papers (e.g., "CRISPR", "COVID-19", "neuroscience", "cancer research") |
maxResults | No | Maximum number of papers to return (1-100) | 20
category | No | bioRxiv category filter |
dateFrom | No | Start date for filtering (YYYY-MM-DD format) |
dateTo | No | End date for filtering (YYYY-MM-DD format) |
sort | No | Sort order: relevance, date, citations | relevance
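
MCP clients invoke tools through a JSON-RPC 2.0 `tools/call` request, so the parameters in the table end up in the `arguments` field of that request. The sketch below shows one way such a request could be assembled. `SearchBiorxivArgs`, `ToolCallRequest`, and `buildSearchRequest` are hypothetical names introduced for illustration and are not part of the server.

```typescript
// Argument shape for search_biorxiv, mirroring the input schema table.
interface SearchBiorxivArgs {
  query: string;
  maxResults?: number;
  category?: string;
  dateFrom?: string;
  dateTo?: string;
  sort?: 'relevance' | 'date' | 'citations';
}

// JSON-RPC 2.0 envelope that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: SearchBiorxivArgs };
}

// Hypothetical helper: wraps the arguments in a tools/call request.
function buildSearchRequest(id: number, args: SearchBiorxivArgs): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'search_biorxiv', arguments: args }
  };
}

const exampleRequest = buildSearchRequest(1, {
  query: 'CRISPR',
  maxResults: 5,
  category: 'genomics',
  dateFrom: '2024-01-01',
  dateTo: '2024-06-30',
  sort: 'date'
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

Only `query` is required; omitting the other fields leaves the handler's defaults (20 results, sorted by relevance) in effect.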

Implementation Reference

  • The main handler function for the 'search_biorxiv' tool. It calls a BioRxivAPIClient to search for biology preprints and, if the API call fails, falls back to simulated (randomly generated) data, which it marks with apiUsed: false. Optional filters (category, date range, sort) are passed through to the client.
    execute: async (args: any) => {
      const { 
        query, 
        maxResults = 20, 
        category, 
        dateFrom, 
        dateTo, 
        sort = 'relevance' 
      } = args;
    
      try {
        const startTime = Date.now();
        
        // The bioRxiv API may be restricted, so simulated data serves as a fallback
        let papers = [];
        let apiUsed = false;
        
        try {
          // Try the real API first
          const data = await client.searchPapers(query, {
            maxResults,
            category,
            dateFrom,
            dateTo,
            sort
          });
          
          papers = (data.papers || []).map((paper: any) => ({
            doi: paper.doi,
            title: paper.title,
            abstract: paper.abstract || 'No abstract available',
            authors: (paper.authors || []).map((author: any) => author.name).join(', '),
            category: paper.category || category || 'Unknown',
            date: paper.date,
            url: `https://www.biorxiv.org/content/${paper.doi}v1`,
            pdfUrl: `https://www.biorxiv.org/content/${paper.doi}v1.full.pdf`,
            citationCount: paper.citationCount || 0,
            version: paper.version || 1,
            server: 'bioRxiv'
          }));
          
          apiUsed = true;
        } catch (apiError) {
          // If the API call fails, use simulated data
          papers = Array.from({ length: Math.min(maxResults, 20) }, (_, i) => {
            const categories = [
              'molecular-biology', 'cell-biology', 'neuroscience', 'cancer-biology',
              'genetics', 'biochemistry', 'immunology', 'microbiology'
            ];
            
            const selectedCategory = category || categories[Math.floor(Math.random() * categories.length)];
            const currentDate = new Date();
            const randomDays = Math.floor(Math.random() * 365);
            const paperDate = new Date(currentDate.getTime() - randomDays * 24 * 60 * 60 * 1000);
            
            return {
              doi: `10.1101/2024.${String(Math.floor(Math.random() * 12) + 1).padStart(2, '0')}.${String(Math.floor(Math.random() * 28) + 1).padStart(2, '0')}.${Math.floor(Math.random() * 900000) + 100000}`,
              title: `${query}: Novel Insights and Biological Mechanisms ${i + 1}`,
              abstract: `Background: This study investigates ${query} and its biological significance. Methods: We employed advanced molecular techniques and computational analysis to examine ${query} in biological systems. Results: Our findings reveal important mechanisms underlying ${query} with potential therapeutic implications. Conclusions: This research advances our understanding of ${query} and provides new directions for future biological research.`,
              authors: [
                `Smith, J.${i + 1}`,
                `Johnson, M.${i + 1}`,
                `Williams, R.${i + 1}`,
                `Brown, L.${i + 1}`
              ].join(', '),
              category: selectedCategory,
              date: paperDate.toISOString().split('T')[0],
              url: `https://www.biorxiv.org/content/10.1101/2024.${String(Math.floor(Math.random() * 12) + 1).padStart(2, '0')}.${String(Math.floor(Math.random() * 28) + 1).padStart(2, '0')}.${Math.floor(Math.random() * 900000) + 100000}v1`,
              pdfUrl: `https://www.biorxiv.org/content/10.1101/2024.${String(Math.floor(Math.random() * 12) + 1).padStart(2, '0')}.${String(Math.floor(Math.random() * 28) + 1).padStart(2, '0')}.${Math.floor(Math.random() * 900000) + 100000}v1.full.pdf`,
              citationCount: Math.floor(Math.random() * 50),
              version: 1,
              server: 'bioRxiv'
            };
          });
        }
    
        const searchTime = Date.now() - startTime;
    
        return {
          success: true,
          data: {
            source: 'bioRxiv',
            query,
            category,
            dateFrom,
            dateTo,
            sort,
            totalResults: papers.length,
            papers,
            searchTime,
            timestamp: Date.now(),
            apiUsed,
            searchMetadata: {
              database: 'bioRxiv Preprint Server',
              searchStrategy: 'Full-text and metadata search',
              filters: {
                category: category || null,
                dateRange: dateFrom && dateTo ? `${dateFrom} to ${dateTo}` : null,
                sort
              }
            }
          }
        };
      } catch (error) {
        return {
          success: false,
          error: `bioRxiv search failed: ${error instanceof Error ? error.message : String(error)}`,
          data: {
            source: 'bioRxiv',
            query,
            papers: [],
            totalResults: 0,
            apiUsed: false,
            suggestions: [
              'Check your internet connection',
              'Try simpler search terms',
              'Use specific biological keywords',
              'Try again in a few moments'
            ]
          }
        };
      }
    }
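
In the fallback branch above, `doi`, `url`, and `pdfUrl` each draw their own random numbers, so the three fields of one simulated record will generally not point at the same identifier. A minimal sketch of one way to keep them consistent is to generate the DOI once and derive both URLs from it. `pad`, `MockPaper`, and `makeMockPaper` are hypothetical names, and the `simulated` flag is an assumption added here to make placeholder records easy to spot.

```typescript
// Hypothetical helper: left-pad a number with zeros to a fixed width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = '0' + s;
  return s;
}

interface MockPaper {
  doi: string;
  url: string;
  pdfUrl: string;
  version: number;
  simulated: true; // assumption: explicit marker for placeholder records
}

// rand is injectable so the generator can be tested deterministically.
function makeMockPaper(rand: () => number = Math.random): MockPaper {
  const month = pad(Math.floor(rand() * 12) + 1, 2);
  const day = pad(Math.floor(rand() * 28) + 1, 2);
  const id = Math.floor(rand() * 900000) + 100000;

  // Build the DOI once, then derive both URLs from it.
  const doi = `10.1101/2024.${month}.${day}.${id}`;
  const version = 1;
  const base = `https://www.biorxiv.org/content/${doi}v${version}`;

  return { doi, url: base, pdfUrl: `${base}.full.pdf`, version, simulated: true };
}
```

The generated DOIs are still fictitious, so callers should treat any record produced this way as placeholder data rather than a citation.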
  • Input schema definition for the search_biorxiv tool: query (required), maxResults, category (an enum of bioRxiv categories), dateFrom/dateTo, and sort.
    inputSchema: {
      type: 'object',
      properties: {
        query: {
          type: 'string',
          description: 'Search query for bioRxiv papers (e.g., "CRISPR", "COVID-19", "neuroscience", "cancer research")'
        },
        maxResults: {
          type: 'number',
          description: 'Maximum number of papers to return (1-100)',
          default: 20,
          minimum: 1,
          maximum: 100
        },
        category: {
          type: 'string',
          description: 'bioRxiv category filter',
          enum: [
            'animal-behavior-and-cognition',
            'biochemistry',
            'bioengineering',
            'bioinformatics',
            'biophysics',
            'cancer-biology',
            'cell-biology',
            'clinical-trials',
            'developmental-biology',
            'ecology',
            'epidemiology',
            'evolutionary-biology',
            'genetics',
            'genomics',
            'immunology',
            'microbiology',
            'molecular-biology',
            'neuroscience',
            'paleontology',
            'pathology',
            'pharmacology-and-toxicology',
            'physiology',
            'plant-biology',
            'scientific-communication-and-education',
            'synthetic-biology',
            'systems-biology',
            'zoology'
          ]
        },
        dateFrom: {
          type: 'string',
          description: 'Start date for filtering (YYYY-MM-DD format)'
        },
        dateTo: {
          type: 'string',
          description: 'End date for filtering (YYYY-MM-DD format)'
        },
        sort: {
          type: 'string',
          description: 'Sort order: relevance, date, citations',
          default: 'relevance',
          enum: ['relevance', 'date', 'citations']
        }
      },
      required: ['query']
    }
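
The handler shown earlier destructures `args` without checking it, so the schema's constraints apply only if the caller or the registry enforces them. The following is a hand-written sketch of those checks, assuming no JSON Schema library is available. `RawSearchArgs`, `isIsoDate`, and `validateSearchArgs` are hypothetical names. The date pattern and the dateFrom/dateTo ordering check are assumptions that go beyond the schema, which only declares the dates as strings, and the category enum is left out for brevity.

```typescript
interface RawSearchArgs {
  query?: unknown;
  maxResults?: unknown;
  dateFrom?: unknown;
  dateTo?: unknown;
  sort?: unknown;
}

const SORT_ORDERS = ['relevance', 'date', 'citations'];
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Assumption: dates are checked by shape only, not for calendar validity.
function isIsoDate(value: unknown): value is string {
  return typeof value === 'string' && ISO_DATE.test(value);
}

// Returns a list of problems; an empty list means the arguments look valid.
function validateSearchArgs(args: RawSearchArgs): string[] {
  const errors: string[] = [];
  const { query, maxResults, dateFrom, dateTo, sort } = args;

  if (typeof query !== 'string') {
    errors.push('query is required and must be a string');
  }
  if (maxResults !== undefined &&
      (typeof maxResults !== 'number' || maxResults < 1 || maxResults > 100)) {
    errors.push('maxResults must be a number from 1 to 100');
  }
  if (sort !== undefined &&
      (typeof sort !== 'string' || SORT_ORDERS.indexOf(sort) === -1)) {
    errors.push('sort must be one of: ' + SORT_ORDERS.join(', '));
  }
  if (dateFrom !== undefined && !isIsoDate(dateFrom)) {
    errors.push('dateFrom must use YYYY-MM-DD');
  }
  if (dateTo !== undefined && !isIsoDate(dateTo)) {
    errors.push('dateTo must use YYYY-MM-DD');
  }
  // ISO dates compare correctly as plain strings.
  if (isIsoDate(dateFrom) && isIsoDate(dateTo) && dateFrom > dateTo) {
    errors.push('dateFrom must not be after dateTo');
  }
  return errors;
}
```

Returning a list of messages rather than throwing lets a caller report every problem at once.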
  • Tool registration within registerBioRxivTools function. Registers the search_biorxiv tool with name, description, schema, and execute handler.
    registry.registerTool({
      name: 'search_biorxiv',
      description: 'Search bioRxiv for biology and life sciences preprints',
      category: 'academic',
      source: 'bioRxiv',
      inputSchema: {
        type: 'object',
        properties: {
          query: {
            type: 'string',
            description: 'Search query for bioRxiv papers (e.g., "CRISPR", "COVID-19", "neuroscience", "cancer research")'
          },
          maxResults: {
            type: 'number',
            description: 'Maximum number of papers to return (1-100)',
            default: 20,
            minimum: 1,
            maximum: 100
          },
          category: {
            type: 'string',
            description: 'bioRxiv category filter',
            enum: [
              'animal-behavior-and-cognition',
              'biochemistry',
              'bioengineering',
              'bioinformatics',
              'biophysics',
              'cancer-biology',
              'cell-biology',
              'clinical-trials',
              'developmental-biology',
              'ecology',
              'epidemiology',
              'evolutionary-biology',
              'genetics',
              'genomics',
              'immunology',
              'microbiology',
              'molecular-biology',
              'neuroscience',
              'paleontology',
              'pathology',
              'pharmacology-and-toxicology',
              'physiology',
              'plant-biology',
              'scientific-communication-and-education',
              'synthetic-biology',
              'systems-biology',
              'zoology'
            ]
          },
          dateFrom: {
            type: 'string',
            description: 'Start date for filtering (YYYY-MM-DD format)'
          },
          dateTo: {
            type: 'string',
            description: 'End date for filtering (YYYY-MM-DD format)'
          },
          sort: {
            type: 'string',
            description: 'Sort order: relevance, date, citations',
            default: 'relevance',
            enum: ['relevance', 'date', 'citations']
          }
        },
        required: ['query']
      },
      execute: async (args: any) => {
        const { 
          query, 
          maxResults = 20, 
          category, 
          dateFrom, 
          dateTo, 
          sort = 'relevance' 
        } = args;
    
        try {
          const startTime = Date.now();
          
          // The bioRxiv API may be restricted, so simulated data serves as a fallback
          let papers = [];
          let apiUsed = false;
          
          try {
            // Try the real API first
            const data = await client.searchPapers(query, {
              maxResults,
              category,
              dateFrom,
              dateTo,
              sort
            });
            
            papers = (data.papers || []).map((paper: any) => ({
              doi: paper.doi,
              title: paper.title,
              abstract: paper.abstract || 'No abstract available',
              authors: (paper.authors || []).map((author: any) => author.name).join(', '),
              category: paper.category || category || 'Unknown',
              date: paper.date,
              url: `https://www.biorxiv.org/content/${paper.doi}v1`,
              pdfUrl: `https://www.biorxiv.org/content/${paper.doi}v1.full.pdf`,
              citationCount: paper.citationCount || 0,
              version: paper.version || 1,
              server: 'bioRxiv'
            }));
            
            apiUsed = true;
          } catch (apiError) {
            // If the API call fails, use simulated data
            papers = Array.from({ length: Math.min(maxResults, 20) }, (_, i) => {
              const categories = [
                'molecular-biology', 'cell-biology', 'neuroscience', 'cancer-biology',
                'genetics', 'biochemistry', 'immunology', 'microbiology'
              ];
              
              const selectedCategory = category || categories[Math.floor(Math.random() * categories.length)];
              const currentDate = new Date();
              const randomDays = Math.floor(Math.random() * 365);
              const paperDate = new Date(currentDate.getTime() - randomDays * 24 * 60 * 60 * 1000);
              
              return {
                doi: `10.1101/2024.${String(Math.floor(Math.random() * 12) + 1).padStart(2, '0')}.${String(Math.floor(Math.random() * 28) + 1).padStart(2, '0')}.${Math.floor(Math.random() * 900000) + 100000}`,
                title: `${query}: Novel Insights and Biological Mechanisms ${i + 1}`,
                abstract: `Background: This study investigates ${query} and its biological significance. Methods: We employed advanced molecular techniques and computational analysis to examine ${query} in biological systems. Results: Our findings reveal important mechanisms underlying ${query} with potential therapeutic implications. Conclusions: This research advances our understanding of ${query} and provides new directions for future biological research.`,
                authors: [
                  `Smith, J.${i + 1}`,
                  `Johnson, M.${i + 1}`,
                  `Williams, R.${i + 1}`,
                  `Brown, L.${i + 1}`
                ].join(', '),
                category: selectedCategory,
                date: paperDate.toISOString().split('T')[0],
                url: `https://www.biorxiv.org/content/10.1101/2024.${String(Math.floor(Math.random() * 12) + 1).padStart(2, '0')}.${String(Math.floor(Math.random() * 28) + 1).padStart(2, '0')}.${Math.floor(Math.random() * 900000) + 100000}v1`,
                pdfUrl: `https://www.biorxiv.org/content/10.1101/2024.${String(Math.floor(Math.random() * 12) + 1).padStart(2, '0')}.${String(Math.floor(Math.random() * 28) + 1).padStart(2, '0')}.${Math.floor(Math.random() * 900000) + 100000}v1.full.pdf`,
                citationCount: Math.floor(Math.random() * 50),
                version: 1,
                server: 'bioRxiv'
              };
            });
          }
    
          const searchTime = Date.now() - startTime;
    
          return {
            success: true,
            data: {
              source: 'bioRxiv',
              query,
              category,
              dateFrom,
              dateTo,
              sort,
              totalResults: papers.length,
              papers,
              searchTime,
              timestamp: Date.now(),
              apiUsed,
              searchMetadata: {
                database: 'bioRxiv Preprint Server',
                searchStrategy: 'Full-text and metadata search',
                filters: {
                  category: category || null,
                  dateRange: dateFrom && dateTo ? `${dateFrom} to ${dateTo}` : null,
                  sort
                }
              }
            }
          };
        } catch (error) {
          return {
            success: false,
            error: `bioRxiv search failed: ${error instanceof Error ? error.message : String(error)}`,
            data: {
              source: 'bioRxiv',
              query,
              papers: [],
              totalResults: 0,
              apiUsed: false,
              suggestions: [
                'Check your internet connection',
                'Try simpler search terms',
                'Use specific biological keywords',
                'Try again in a few moments'
              ]
            }
          };
        }
      }
    });
  • BioRxivAPIClient class providing API methods for searching papers and getting details, used by the tool handler.
    class BioRxivAPIClient {
      private baseURL = 'https://api.biorxiv.org';
    
      async makeRequest(endpoint: string, params: Record<string, any> = {}) {
        try {
          const response = await axios.get(`${this.baseURL}${endpoint}`, {
            params,
            timeout: 15000,
            headers: {
              'User-Agent': 'Open-Search-MCP/2.0',
              'Accept': 'application/json'
            }
          });
    
          return response.data;
        } catch (error) {
          throw error;
        }
      }
    
      async searchPapers(query: string, options: any = {}) {
        // bioRxiv API search endpoint
        const params: any = {
          query,
          limit: Math.min(options.maxResults || 20, 100),
          sort: options.sort || 'relevance',
          order: options.order || 'desc'
        };
    
        // Add date filters
        if (options.dateFrom) {
          params.from = options.dateFrom;
        }
        if (options.dateTo) {
          params.to = options.dateTo;
        }
    
        // Add category filter
        if (options.category) {
          params.category = options.category;
        }
    
        return await this.makeRequest('/search', params);
      }
    
      async getPaperDetails(doi: string) {
        return await this.makeRequest(`/details/${doi}`);
      }
    }
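
To my knowledge, the public bioRxiv API is organized around date-interval and DOI lookups (for example `/details/biorxiv/<from>/<to>/<cursor>`) rather than keyword search, so the `/search` request above may fail and trigger the simulated fallback. This should be verified against the current API documentation. Under that assumption, a keyword search could be approximated by fetching an interval and filtering locally, as sketched below. `DetailsRecord`, `detailsUrl`, and `filterByQuery` are hypothetical names, and the record fields shown are assumptions about the response.

```typescript
// Assumed subset of one record in the details response.
interface DetailsRecord {
  doi: string;
  title: string;
  abstract: string;
}

// Builds the interval URL; the path layout is an assumption to verify.
function detailsUrl(from: string, to: string, cursor: number = 0): string {
  return `https://api.biorxiv.org/details/biorxiv/${from}/${to}/${cursor}`;
}

// Case-insensitive substring match over title and abstract.
function filterByQuery(records: DetailsRecord[], query: string): DetailsRecord[] {
  const needle = query.toLowerCase();
  return records.filter(function (record) {
    const haystack = (record.title + ' ' + record.abstract).toLowerCase();
    return haystack.indexOf(needle) !== -1;
  });
}
```

Local substring filtering is much cruder than relevance ranking, and it cannot honor the `citations` sort order at all, so this is a stopgap rather than a replacement for a real search backend.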
  • src/index.ts:233-233 (registration)
    Call to registerBioRxivTools during main server initialization, which registers search_biorxiv among other tools.
    registerBioRxivTools(this.toolRegistry);            // 3 tools: search_iacr, search_medrxiv, search_biorxiv
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden for behavioral disclosure. It mentions searching but doesn't describe what the search returns (e.g., paper metadata, abstracts, links), whether results are paginated, rate limits, authentication needs, or error conditions. For a search tool with 6 parameters and no output schema, this leaves significant gaps in understanding how the tool behaves.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose without unnecessary words. It directly states what the tool does ('Search bioRxiv for biology and life sciences preprints'), making it easy to parse and understand immediately.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, no annotations, no output schema), the description is insufficient. It doesn't explain what the search returns (e.g., list of papers with titles/authors/dates), how results are structured, or any limitations (e.g., only preprints, no peer-reviewed content). Without this context, an agent might struggle to use the tool effectively or interpret results.
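
The return shape that this assessment finds missing from the description can be read off the handler in the Implementation Reference. The sketch below writes it down as TypeScript types and adds a guard that discards simulated results. The field names follow the handler's return value, but the type names and `realPapers` are hypothetical, and the server does not publish an output schema.

```typescript
// One paper as mapped by the handler.
interface BiorxivPaper {
  doi: string;
  title: string;
  abstract: string;
  authors: string; // comma-separated names, not an array
  category: string;
  date: string;
  url: string;
  pdfUrl: string;
  citationCount: number;
  version: number;
  server: string;
}

// Subset of the handler's result that is common to success and failure.
interface SearchResult {
  success: boolean;
  error?: string;
  data: {
    source: string;
    query: string;
    papers: BiorxivPaper[];
    totalResults: number;
    apiUsed: boolean; // false when papers are simulated or the call failed
  };
}

// Hypothetical guard: returns papers only when they came from the real API.
function realPapers(result: SearchResult): BiorxivPaper[] {
  if (!result.success || !result.data.apiUsed) {
    return [];
  }
  return result.data.papers;
}
```

Because the handler reports `success: true` even when it falls back to simulated data, checking `apiUsed` is the only way for a caller to tell real preprints from placeholders.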

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with all parameters well-documented in the schema itself (e.g., query examples, date formats, enum values). The description adds no additional parameter semantics beyond what's in the schema, so it meets the baseline of 3 where the schema does the heavy lifting without compensating for any gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Search') and resource ('bioRxiv for biology and life sciences preprints'), making the purpose immediately understandable. However, it doesn't explicitly differentiate this tool from similar sibling tools like 'search_arxiv', 'search_medrxiv', or 'search_pubmed', which would require mentioning bioRxiv's specific focus on preprints versus other databases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. With multiple search-related sibling tools (e.g., search_arxiv, search_pubmed, search_medrxiv), there's no indication of bioRxiv's unique value (preprints in biology/life sciences) or when it might be preferred over other databases, leaving the agent to guess based on context alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/flyanima/open-search-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.