Node.js MCP Server for Gallica/BnF
A high-quality TypeScript/Node.js MCP (Model Context Protocol) server for accessing the Gallica digital library of the Bibliothèque nationale de France (BnF). This server faithfully reproduces all functionality from the Python MCP server and extends it with additional features for IIIF images, OCR text, and item metadata.
Features
Core Search Tools (8 tools - matching Python)
search_by_title - Search documents by title with exact match option
search_by_author - Search documents by author with exact match option
search_by_subject - Search documents by subject with exact match option
search_by_date - Search documents by date (YYYY, YYYY-MM, or YYYY-MM-DD)
search_by_document_type - Search by document type (monographie, periodique, image, etc.)
advanced_search - Custom CQL query syntax for complex searches
natural_language_search - Natural language search across all fields
sequential_reporting - Multi-step sequential report generation with source management
Extended Tools (4 new tools)
get_item_details - Get full metadata for an item by ARK identifier
get_item_pages - Enumerate pages of a document with IIIF URLs
get_page_image - Generate IIIF image URLs for specific pages
get_page_text - Retrieve OCR/text content (ALTO, plain text)
Installation
Prerequisites
Node.js 18.0 or higher
npm or yarn
Steps
Clone or download this repository
Install dependencies:
cd node-mcp-bnf npm installBuild the project:
npm run build
Usage
Local Development
Option 1: HTTP Server (Recommended for Development)
Run the server on a local HTTP port for testing:
This starts the server on http://localhost:3000 (or the port specified in PORT environment variable).
The server will:
Listen on the specified port
Accept MCP requests via HTTP/SSE
Log all activity to the console
You can test it by connecting an MCP client to http://localhost:3000/message.
Option 2: STDIO Mode (For MCP Clients)
The server can be used locally with Claude Desktop or Cursor MCP via STDIO.
Run in STDIO mode:
Claude Desktop Configuration
Add to your Claude Desktop configuration file (usually ~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
Cursor MCP Configuration
Add to your Cursor MCP settings:
Development Scripts
npm run dev- Run in STDIO mode (for MCP clients like Cursor/Claude Desktop)npm run dev:http- Run HTTP server on local port (default: 3000)npm start- Run compiled STDIO versionnpm run start:http- Run compiled HTTP server version
Environment Variables for HTTP Server:
PORT- Port to listen on (default: 3000)LOG_LEVEL- Logging level (error, warn, info, debug)
Deployment on Vercel
Prerequisites
Vercel account
Vercel CLI installed (
npm i -g vercel)
Steps
Create
{ "functions": { "src/httpServer.ts": { "runtime": "nodejs18.x" } }, "routes": [ { "src": "/(.*)", "dest": "src/httpServer.ts" } ] }Set environment variables in Vercel dashboard:
GALLICA_BASE_URL(optional, default:https://gallica.bnf.fr)GALLICA_SRU_URL(optional, default:https://gallica.bnf.fr/SRU)LOG_LEVEL(optional, default:info)MCP_ICON_URL(optional, absolute URL to icon - defaults to/icon.svgrelative to server URL)
Deploy:
vercel deployConfigure Cursor/Claude Desktop to use HTTP endpoint:
{ "mcpServers": { "gallica-bnf": { "url": "https://your-vercel-app.vercel.app" } } }
API Documentation
Search Tools
search_by_title
Search for documents by title.
Parameters:
title(string, required): The title to search forexact_match(boolean, optional): If true, search for exact titlemax_results(number, optional, default: 10): Maximum results (1-50)start_record(number, optional, default: 1): Starting record for pagination
Example:
search_by_author
Search for documents by author.
Parameters:
author(string, required): The author nameexact_match(boolean, optional): If true, search for exact author namemax_results(number, optional, default: 10)start_record(number, optional, default: 1)
search_by_subject
Search for documents by subject/keywords.
Parameters:
subject(string, required): The subject to search forexact_match(boolean, optional)max_results(number, optional, default: 10)start_record(number, optional, default: 1)
search_by_date
Search for documents by publication date.
Parameters:
date(string, required): Date in format YYYY, YYYY-MM, or YYYY-MM-DDmax_results(number, optional, default: 10)start_record(number, optional, default: 1)
Example:
search_by_document_type
Search for documents by type.
Parameters:
doc_type(string, required): Document type (monographie, periodique, image, manuscrit, carte, musique, etc.)max_results(number, optional, default: 10)start_record(number, optional, default: 1)
advanced_search
Perform advanced search with custom CQL query.
Parameters:
query(string, required): CQL query stringmax_results(number, optional, default: 10)start_record(number, optional, default: 1)
Example:
natural_language_search
Natural language search across all fields.
Parameters:
query(string, required): Natural language search querymax_results(number, optional, default: 10)start_record(number, optional, default: 1)
Extended Tools
get_item_details
Get full metadata for an item.
Parameters:
ark(string, required): ARK identifier (e.g., "ark:/12148/bpt6k123456" or "bpt6k123456")
Returns:
Bibliographic data (title, creator, date, etc.)
Available formats (iiif, image, text, alto)
IIIF manifest URL
Gallica URL
get_item_pages
Enumerate pages of a document.
Parameters:
ark(string, required): ARK identifierpage(number, optional): Get specific page numberpage_size(number, optional): Get first N pagespage_range(array, optional): Get pages in range [start, end]
Returns:
Array of page info with:
Page number
Label
IIIF image URL
Text availability flag
Thumbnail URL
get_page_image
Get IIIF image URL for a specific page.
Parameters:
ark(string, required): ARK identifierpage(number, required): Page numbersize(string, optional): Image size (e.g., "full", "200,", "500,500")region(string, optional): Image region (e.g., "full", "x,y,w,h")
Returns:
IIIF image URL
Thumbnail URL
get_page_text
Retrieve OCR/text content for a page.
Parameters:
ark(string, required): ARK identifierpage(number, required): Page numberformat(string, optional): Text format ("plain", "alto", "tei")
Returns:
Text content (or null if not available)
Availability flag
Sequential Reporting Tool
sequential_reporting
Generate research reports in a sequential, step-by-step manner.
Workflow:
Initialize:
{ "topic": "Impressionnisme en France", "page_count": 4, "source_count": 10, "include_graphics": true }Search sources:
{ "search_sources": true }Create bibliography:
{ "section_number": 1, "total_sections": 8, "title": "Bibliography", "content": "...", "is_bibliography": true, "sources_used": [1, 2, 3], "next_section_needed": true }Write sections sequentially:
{ "section_number": 2, "total_sections": 8, "title": "Introduction", "content": "...", "sources_used": [1, 2], "next_section_needed": true }Complete report:
{ "section_number": 8, "total_sections": 8, "title": "Conclusion", "content": "...", "sources_used": [5, 6], "next_section_needed": false }
Configuration
Environment Variables
GALLICA_BASE_URL- Base URL for Gallica API (default:https://gallica.bnf.fr)GALLICA_SRU_URL- SRU search endpoint (default:https://gallica.bnf.fr/SRU)LOG_LEVEL- Logging level:error,warn,info,debug(default:info)HTTP_TIMEOUT- HTTP request timeout in milliseconds (default: 30000)HTTP_RETRIES- Maximum number of retry attempts (default: 3)DEFAULT_MAX_RECORDS- Default maximum search results (default: 10)DEFAULT_START_RECORD- Default starting record for pagination (default: 1)
Development
Project Structure
Building
Testing
Development Mode
Documentation
Python Architecture Analysis - Analysis of the original Python server
Gallica API Notes - Detailed API documentation
Node.js Architecture - Architecture and design of this server
License
MIT
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Acknowledgments
Based on the Python MCP server by Kryzo
Uses the Model Context Protocol SDK
Built for the Gallica digital library of the Bibliothèque nationale de France