# API Documentation
## Table of Contents
- [arXiv Server API](#arxiv-server-api)
- [Semantic Scholar Server API](#semantic-scholar-server-api)
- [PubMed Server API](#pubmed-server-api)
- [Response Format](#response-format)
- [Error Handling](#error-handling)
- [Rate Limit Summary](#rate-limit-summary)
- [Additional Resources](#additional-resources)
---
## arXiv Server API
### Base Information
- **API Endpoint**: http://export.arxiv.org/api/query
- **Rate Limit**: 3 requests/second (built-in)
- **Authentication**: None required
### Tools
#### 1. search_arxiv
Search for academic papers on arXiv.
**Input Schema:**
```typescript
{
  query: string; // Required: Search keywords or phrases
  maxResults?: number; // Optional: Max results (default: 10, max: 100)
  startYear?: number; // Optional: Filter papers published after this year
  endYear?: number; // Optional: Filter papers published before this year
  author?: string; // Optional: Filter by author name
  sortBy?: string; // Optional: 'relevance' | 'lastUpdatedDate' | 'submittedDate' (default: 'relevance')
}
```
**Response Schema:**
```typescript
Array<{
  id: string; // arXiv ID (e.g., "2301.12345v1")
  title: string; // Paper title
  authors: string[]; // List of author names
  abstract: string; // Full abstract text
  published: string; // Publication date (ISO 8601)
  updated: string; // Last update date (ISO 8601)
  url: string; // Paper URL (https://arxiv.org/abs/...)
  pdfUrl: string; // PDF URL (https://arxiv.org/pdf/...)
  categories: string[]; // All categories (e.g., ["cs.LG", "stat.ML"])
  primaryCategory: string; // Primary category (e.g., "cs.LG")
}>
```
**Example Usage:**
```typescript
{
  query: "deep learning computer vision",
  maxResults: 5,
  startYear: 2023,
  sortBy: "relevance"
}
```
#### 2. get_arxiv_paper
Retrieve a specific paper by arXiv ID.
**Input Schema:**
```typescript
{
  arxivId: string; // Required: arXiv paper ID (e.g., '2301.12345' or 'arXiv:2301.12345')
}
```
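Because `arxivId` accepts both the bare and the `arXiv:`-prefixed form, a client that caches or compares IDs may want to normalize them first. The following is a minimal sketch under that assumption; `normalizeArxivId` and `stripVersion` are hypothetical helpers, not part of the server, and treating a trailing `v<digits>` as a version suffix is an assumption based on the `"2301.12345v1"` example.

```typescript
// Hypothetical helper: not part of the documented server API.
// Trims whitespace and removes an optional "arXiv:" prefix (case-insensitive).
function normalizeArxivId(input: string): string {
  return input.trim().replace(/^arxiv:/i, "");
}

// Assumption: a trailing "v<digits>" is a version suffix, as in "2301.12345v1".
function stripVersion(arxivId: string): string {
  return arxivId.replace(/v\d+$/, "");
}
```

Normalizing before lookup lets `'arXiv:2301.12345'` and `'2301.12345'` share one cache entry.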
**Response Schema:**
```typescript
{
  id: string;
  title: string;
  authors: string[];
  abstract: string;
  published: string;
  updated: string;
  url: string;
  pdfUrl: string;
  categories: string[];
  primaryCategory: string;
}
// Same structure as single paper from search_arxiv
```
#### 3. arxiv_to_bibtex
Convert an arXiv paper to BibTeX format.
**Input Schema:**
```typescript
{
  arxivId: string; // Required: arXiv paper ID
}
```
**Response Schema:**
```typescript
string // BibTeX formatted citation
```
**Example BibTeX Output:**
```bibtex
@article{author2023paper,
  title = {Paper Title},
  author = {Author One and Author Two},
  year = 2023,
  eprint = {2301.12345v1},
  archivePrefix = {arXiv},
  primaryClass = {cs.AI},
  url = {http://arxiv.org/abs/2301.12345v1},
  abstract = {Full abstract text...},
}
```
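To illustrate how the `search_arxiv` response fields map onto such an entry, here is a rough sketch of a formatter. It is not the server's implementation: `sketchArxivBibtex`, the citation-key scheme (first author's last word plus year), and the choice to wrap every value in braces are all assumptions made for the example.

```typescript
// Subset of the search_arxiv response fields used for the citation.
interface ArxivPaperLike {
  id: string;
  title: string;
  authors: string[];
  published: string; // ISO 8601
  primaryCategory: string;
}

// Hypothetical formatter: the server's actual key scheme and field set may differ.
function sketchArxivBibtex(paper: ArxivPaperLike): string {
  const year = paper.published.slice(0, 4);
  // Assumed key scheme: last word of the first author's name + year, lowercased.
  const firstAuthor = paper.authors[0] ?? "unknown";
  const lastName = firstAuthor.split(/\s+/).pop() ?? "unknown";
  const key = `${lastName.toLowerCase().replace(/[^a-z]/g, "")}${year}`;
  const fields: Array<[string, string]> = [
    ["title", paper.title],
    ["author", paper.authors.join(" and ")],
    ["year", year],
    ["eprint", paper.id],
    ["archivePrefix", "arXiv"],
    ["primaryClass", paper.primaryCategory],
  ];
  // Every value is wrapped in braces so that non-numeric values stay valid BibTeX.
  const body = fields.map(([name, value]) => `  ${name} = {${value}},`).join("\n");
  return `@article{${key},\n${body}\n}`;
}
```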
---
## Semantic Scholar Server API
### Base Information
- **API Endpoint**: https://api.semanticscholar.org/graph/v1
- **Rate Limit**: 1 request/second without API key, 10 requests/second with API key
- **Authentication**: Optional API key via `SEMANTIC_SCHOLAR_API_KEY` environment variable
### Tools
#### 1. search_semantic_scholar
Search for papers with rich citation metadata.
**Input Schema:**
```typescript
{
  query: string; // Required: Search keywords or phrases
  maxResults?: number; // Optional: Max results (default: 10, max: 100)
  startYear?: number; // Optional: Filter papers published after this year
  endYear?: number; // Optional: Filter papers published before this year
}
```
**Response Schema:**
```typescript
Array<{
  paperId: string; // Semantic Scholar paper ID
  title: string; // Paper title
  abstract: string | null; // Abstract (may be null)
  year: number | null; // Publication year (may be null)
  authors: Array<{
    authorId: string; // Author ID
    name: string; // Author name
  }>;
  citationCount: number; // Total citations
  referenceCount: number; // Total references
  influentialCitationCount: number; // Influential citations
  url: string; // Semantic Scholar URL
  venue: string | null; // Publication venue (may be null)
  publicationDate: string | null; // Publication date (may be null)
}>
```
**Example Usage:**
```typescript
{
  query: "graph neural networks",
  maxResults: 10,
  startYear: 2022
}
```
#### 2. get_semantic_scholar_paper
Get paper by Semantic Scholar paper ID or DOI.
**Input Schema:**
```typescript
{
  identifier: string; // Required: Semantic Scholar paper ID or DOI
}
```
**Response Schema:**
```typescript
{
  paperId: string;
  title: string;
  abstract: string | null;
  year: number | null;
  authors: Array<{ authorId: string; name: string }>;
  citationCount: number;
  referenceCount: number;
  influentialCitationCount: number;
  url: string;
  venue: string | null;
  publicationDate: string | null;
}
// Same structure as single paper from search
```
**Example with DOI:**
```typescript
{
  identifier: "10.1234/example.doi"
}
```
**Example with Paper ID:**
```typescript
{
  identifier: "649def34f8be52c8b66281af98ae884c09aef38b"
}
```
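Since `identifier` can hold either kind of value, a client may want to check which one it has before calling the tool. The sketch below is a heuristic only: `classifyIdentifier` is a hypothetical helper, the DOI pattern is a common approximation and not an exhaustive DOI grammar, and the 40-hex-character rule is inferred from the paper ID example above.

```typescript
type IdentifierKind = "doi" | "paperId" | "unknown";

// Hypothetical helper: guesses what kind of identifier a string is.
function classifyIdentifier(identifier: string): IdentifierKind {
  const value = identifier.trim();
  // DOIs start with "10.", a numeric registrant code, then "/" and a suffix.
  if (/^10\.\d{4,9}\/\S+$/.test(value)) return "doi";
  // Assumption: paper IDs are 40 hexadecimal characters, like the example above.
  if (/^[0-9a-f]{40}$/i.test(value)) return "paperId";
  return "unknown";
}
```

Returning `"unknown"` without throwing leaves it to the caller to decide whether to reject the value or pass it through to the server.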
#### 3. get_paper_citations
Get papers that cite a specific paper (citation analysis).
**Input Schema:**
```typescript
{
  paperId: string; // Required: Semantic Scholar paper ID
  maxResults?: number; // Optional: Max citing papers (default: 10, max: 100)
}
```
**Response Schema:**
```typescript
Array<{
  paperId: string;
  title: string;
  abstract: string | null;
  year: number | null;
  authors: Array<{ authorId: string; name: string }>;
  citationCount: number;
  referenceCount: number;
  influentialCitationCount: number;
  url: string;
  venue: string | null;
  publicationDate: string | null;
}>
// Array of papers citing the specified paper
```
#### 4. semantic_scholar_to_bibtex
Convert paper to BibTeX format.
**Input Schema:**
```typescript
{
  identifier: string; // Required: Semantic Scholar paper ID or DOI
}
```
**Response Schema:**
```typescript
string // BibTeX formatted citation
```
**Example BibTeX Output:**
```bibtex
@article{author2023paper,
  title = {Paper Title},
  author = {Author One and Author Two},
  year = 2023,
  url = {https://www.semanticscholar.org/paper/649def34f8be52c8b66281af98ae884c09aef38b},
  abstract = {Abstract text...},
  journal = {Conference/Journal Name},
}
```
---
## PubMed Server API
### Base Information
- **API Endpoint**: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/
- **Rate Limit**: 3 requests/second without API key, 10 requests/second with API key
- **Authentication**: Optional via `PUBMED_API_KEY` and `PUBMED_EMAIL` environment variables
### Tools
#### 1. search_pubmed
Search biomedical and life sciences papers.
**Input Schema:**
```typescript
{
  query: string; // Required: Search query (supports MeSH terms like "diabetes[MeSH]")
  maxResults?: number; // Optional: Max results (default: 10, max: 100)
  startYear?: number; // Optional: Filter papers published after this year
  endYear?: number; // Optional: Filter papers published before this year
}
```
**Response Schema:**
```typescript
Array<{
  pmid: string; // PubMed ID
  title: string; // Paper title
  abstract: string; // Full abstract text
  authors: string[]; // List of author names (e.g., ["Smith J", "Jones M"])
  journal: string; // Journal name
  year: string; // Publication year
  doi: string | null; // DOI (may be null)
  url: string; // PubMed URL
  publicationTypes: string[]; // Types (e.g., ["Journal Article", "Review"])
  meshTerms: string[]; // Medical Subject Headings
}>
```
**Example with MeSH Terms:**
```typescript
{
  query: "diabetes[MeSH] AND treatment",
  maxResults: 5,
  startYear: 2022
}
```
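Queries like the one above can also be assembled programmatically. This is a small sketch, assuming the caller wants every term combined with `AND`; `buildPubmedQuery` is a hypothetical helper and not part of the server.

```typescript
// Hypothetical helper: tags each MeSH heading with [MeSH] and joins all terms with AND.
// Multi-word headings are quoted so they are searched as a phrase.
function buildPubmedQuery(meshTerms: string[], freeText: string[] = []): string {
  const tagged = meshTerms.map((term) =>
    term.includes(" ") ? `"${term}"[MeSH]` : `${term}[MeSH]`,
  );
  return [...tagged, ...freeText].join(" AND ");
}

// Builds the same arguments as the example above.
const exampleArgs = {
  query: buildPubmedQuery(["diabetes"], ["treatment"]),
  maxResults: 5,
  startYear: 2022,
};
```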
#### 2. get_pubmed_paper
Get detailed information about a specific PubMed paper by PMID.
**Input Schema:**
```typescript
{
  pmid: string; // Required: PubMed ID (e.g., "12345678")
}
```
**Response Schema:**
```typescript
{
  pmid: string; // PubMed ID
  title: string; // Paper title
  abstract: string; // Full abstract text
  authors: string[]; // List of author names
  journal: string; // Journal name
  year: string; // Publication year
  doi: string | null; // DOI (may be null)
  url: string; // PubMed URL
  publicationTypes: string[]; // Publication types
  meshTerms: string[]; // Medical Subject Headings
}
```
#### 3. pubmed_to_bibtex
Convert PubMed paper metadata to BibTeX format.
**Input Schema:**
```typescript
{
  pmid: string; // Required: PubMed ID to convert
}
```
**Response Schema:**
```typescript
string // BibTeX formatted citation
```
**Example BibTeX Output:**
```bibtex
@article{pmid12345678,
  title = {Paper Title},
  author = {Smith, J. and Jones, M.},
  year = 2023,
  journal = {Nature Medicine},
  doi = {10.1234/nm.2023.1234},
  url = {https://pubmed.ncbi.nlm.nih.gov/12345678/},
  pmid = 12345678,
  abstract = {Full abstract text...},
}
```
---
## Response Format
All MCP tool responses follow the Model Context Protocol standard structure:
**Success Response:**
```typescript
{
  content: [
    {
      type: "text";
      text: string; // JSON stringified data or plain text
    }
  ]
}
```
For search and get operations, the `text` field contains JSON stringified paper data matching the TypeScript schemas documented above.
For BibTeX conversion operations, the `text` field contains the formatted BibTeX citation as a plain string.
**Error Response:**
```typescript
{
  content: [
    {
      type: "text";
      text: string; // JSON stringified error object
    }
  ];
  isError: true;
}
```
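On the client side, the two response shapes can be handled with one small helper. The following is a sketch, not part of the servers: `unwrapToolResult` is a hypothetical name, and falling back to the raw string when the text is not valid JSON is an assumption made to cover BibTeX output.

```typescript
// Shape of a tool result as documented above.
interface ToolResult {
  content: Array<{ type: "text"; text: string }>;
  isError?: boolean;
}

// Hypothetical client-side helper. Throws on error results, returns parsed JSON
// for search/get tools, and returns the raw string when the text is not JSON.
function unwrapToolResult(result: ToolResult): unknown {
  const text = result.content[0]?.text ?? "";
  if (result.isError) {
    throw new Error(`Tool call failed: ${text}`);
  }
  try {
    return JSON.parse(text);
  } catch {
    return text; // plain text, such as a BibTeX entry
  }
}
```

Callers still need to narrow the returned `unknown` value to the schema of the tool they invoked.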
---
## Error Handling
### Common Error Types
**Not Found (404)**
- Occurs when: Paper ID, DOI, or PMID doesn't exist in the database
- Response format: `{"error": "Paper not found"}`
- Example: Invalid arXiv ID, non-existent Semantic Scholar paper ID, or invalid PMID
**Rate Limit Exceeded (429)**
- Occurs when: Too many requests sent in a short time window
- Behavior: Server automatically throttles requests and waits before retrying
- Rate limits:
  - arXiv: 3 requests/second (no API key needed)
  - Semantic Scholar: 1 request/second (without key) or 10 requests/second (with key)
  - PubMed: 3 requests/second (without key) or 10 requests/second (with key)
**Invalid Parameters (400)**
- Occurs when: Missing required fields or invalid parameter values
- Response format: `{"error": "Invalid parameter: <details>"}`
- Common causes:
  - Missing required `query`, `pmid`, `arxivId`, or `identifier` field
  - Invalid `maxResults` (must be 1-100)
  - Invalid year filters (must be valid numbers)
- Example responses: `{"error": "No arguments provided"}` or a specific validation error
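The checks listed above can be mirrored on the client before a request is sent. This is a minimal sketch for the search tools: `validateSearchArgs` is a hypothetical helper, the messages are illustrative and do not reproduce the servers' actual error strings, and requiring `maxResults` to be an integer is an assumption.

```typescript
interface SearchArgs {
  query?: unknown;
  maxResults?: unknown;
  startYear?: unknown;
  endYear?: unknown;
}

// Returns a list of problems; an empty list means the arguments look valid.
function validateSearchArgs(args: SearchArgs): string[] {
  const problems: string[] = [];
  if (typeof args.query !== "string" || args.query.trim() === "") {
    problems.push("query is required");
  }
  const max = args.maxResults;
  if (max !== undefined) {
    // Assumption: maxResults must be a whole number in the documented 1-100 range.
    if (typeof max !== "number" || !Number.isInteger(max) || max < 1 || max > 100) {
      problems.push("maxResults must be an integer from 1 to 100");
    }
  }
  for (const key of ["startYear", "endYear"] as const) {
    const value = args[key];
    if (value !== undefined && (typeof value !== "number" || !Number.isFinite(value))) {
      problems.push(`${key} must be a valid number`);
    }
  }
  return problems;
}
```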
**API Errors (500/503)**
- Occurs when: External API unavailable or experiencing issues
- Behavior: Network timeout after 30 seconds
- Response format: `{"error": "API error: <original error message>"}`
- Common causes: API downtime, network issues, malformed API responses
**Parsing Errors**
- Occurs when: API returns unexpected data format
- Example: PubMed XML structure changes, missing required fields
- Response format: `{"error": "Failed to parse response: <details>"}`
### Best Practices
1. **Handle Rate Limits**: Built-in throttling ensures compliance, but consider caching frequently accessed papers
2. **Validate Input**: Check required parameters (query, pmid, arxivId, identifier) before making requests
3. **Parse Responses**: Search and get results arrive as JSON strings in the MCP `text` content field; BibTeX output is plain text
4. **Check for Errors**: Always check for the `isError: true` flag in the response object
5. **Retry Logic**: Implement exponential backoff for transient failures (503 errors, network timeouts)
6. **Use API Keys**: For Semantic Scholar and PubMed, obtain API keys to increase rate limits from 1-3/sec to 10/sec
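
The retry advice in item 5 can be sketched as follows. The names `backoffDelayMs` and `withRetry`, the default delays, and the attempt limit are all assumptions chosen for illustration; deciding which errors count as transient is left to the caller through `isTransient`.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Hypothetical wrapper: retries `call` while `isTransient` reports a retryable failure.
async function withRetry<T>(
  call: () => Promise<T>,
  isTransient: (error: unknown) => boolean,
  maxAttempts = 4,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (error) {
      // Give up on the last attempt or on errors that are not worth retrying.
      if (attempt + 1 >= maxAttempts || !isTransient(error)) throw error;
      await new Promise<void>((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
}
```

With the defaults, the waits between attempts are 500 ms, 1000 ms, and 2000 ms. Adding random jitter to each delay is a common refinement when many clients retry at once.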
---
## Rate Limit Summary
| Server | Without API Key | With API Key | Environment Variable | Recommendation |
|--------|-----------------|--------------|---------------------|----------------|
| arXiv | 3 req/sec | 3 req/sec | N/A | No key needed |
| Semantic Scholar | 1 req/sec | 10 req/sec | `SEMANTIC_SCHOLAR_API_KEY` | Highly recommended |
| PubMed | 3 req/sec | 10 req/sec | `PUBMED_API_KEY` + `PUBMED_EMAIL` | Recommended |
**Note**: Rate limiting is handled automatically by the servers using the `SimpleRateLimiter` class. Requests are queued and processed at the appropriate rate.
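
The source of `SimpleRateLimiter` is not reproduced in this document. To illustrate the queueing behavior the note describes, here is a minimal sketch of an interval-based limiter; the class name `IntervalRateLimiter` and its methods are hypothetical, and the real class may work differently.

```typescript
// Illustrative only: not the servers' SimpleRateLimiter.
class IntervalRateLimiter {
  private nextSlotMs = 0;
  private readonly intervalMs: number;

  constructor(requestsPerSecond: number) {
    this.intervalMs = 1000 / requestsPerSecond;
  }

  // Reserves the next free slot and returns how long the caller should wait (ms).
  reserve(nowMs: number): number {
    const slot = Math.max(nowMs, this.nextSlotMs);
    this.nextSlotMs = slot + this.intervalMs;
    return slot - nowMs;
  }

  // Waits for the reserved slot, then runs the task.
  async schedule<T>(task: () => Promise<T>): Promise<T> {
    const waitMs = this.reserve(Date.now());
    if (waitMs > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
    }
    return task();
  }
}
```

Passing the current time into `reserve` keeps the slot arithmetic deterministic, so it can be checked without real timers.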
---
## Additional Resources
- [arXiv API Documentation](https://arxiv.org/help/api)
- [Semantic Scholar API Docs](https://api.semanticscholar.org/)
- [PubMed E-utilities](https://www.ncbi.nlm.nih.gov/books/NBK25501/)
- [MCP Specification](https://modelcontextprotocol.io/)