Zotero MCP Server

A Model Context Protocol (MCP) server for Zotero that provides semantic search capabilities using PostgreSQL with pg-vector and OpenAI/Ollama embeddings.

This is a fork of the

THIS IS NOT THE OFFICIAL PROJECT AND MY MODIFICATIONS MAY HAVE BUGS. I just use this version for my personal research projects.

At the moment I use the version in this repository against my own OpenAI-compatible API gateway.

Features

  • Full Zotero Integration: Access your Zotero library through MCP tools

  • Semantic Search: AI-powered semantic search using PostgreSQL + pg-vector

  • Multiple Embedding Providers: Support for OpenAI and Ollama embeddings

  • Lightweight Architecture: Removed heavy ML dependencies (torch, transformers)

  • High Performance: PostgreSQL backend with optimized vector operations

  • Flexible Configuration: Support for local and remote database instances

Quick Start

Prerequisites

  • Python 3.10+

  • PostgreSQL 15+ with pg-vector extension

  • Zotero desktop application or Zotero Web API credentials

  • OpenAI API key or Ollama installation

Installation

pip install -e .

PostgreSQL Setup

If you have access to a PostgreSQL instance with pg-vector:

-- Connect to your PostgreSQL instance
CREATE DATABASE zotero_mcp;
CREATE USER zotero_user WITH PASSWORD 'your_password';
GRANT ALL PRIVILEGES ON DATABASE zotero_mcp TO zotero_user;

-- Enable pg-vector extension
\c zotero_mcp
CREATE EXTENSION vector;

Configuration

Run the interactive setup:

zotero-mcp setup

Usage with Claude Desktop

{
  "mcpServers": {
    "zotero": {
      "command": "/path/to/zotero-mcp",
      "env": {
        "ZOTERO_DB_HOST": "your_host",
        "ZOTERO_DB_NAME": "zotero_mcp",
        "ZOTERO_EMBEDDING_PROVIDER": "ollama",
        "OLLAMA_HOST": "your_ollama_host"
      }
    }
  }
}
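If you already have other MCP servers configured, the entry above has to be merged into the existing file rather than pasted over it. The following is a minimal sketch of that merge, assuming the target is Claude Desktop's claude_desktop_config.json; the helper names build_zotero_entry and merge_into_config are made up for this example and are not part of zotero-mcp.

```python
import json


def build_zotero_entry(command, db_host, db_name, provider, ollama_host):
    """Return the "mcpServers" fragment from this README as a Python dict."""
    return {
        "mcpServers": {
            "zotero": {
                "command": command,
                "env": {
                    "ZOTERO_DB_HOST": db_host,
                    "ZOTERO_DB_NAME": db_name,
                    "ZOTERO_EMBEDDING_PROVIDER": provider,
                    "OLLAMA_HOST": ollama_host,
                },
            }
        }
    }


def merge_into_config(existing, entry):
    """Add the entry to an existing config dict without dropping other servers."""
    merged = dict(existing)
    servers = dict(merged.get("mcpServers", {}))
    servers.update(entry["mcpServers"])
    merged["mcpServers"] = servers
    return merged


entry = build_zotero_entry(
    "/path/to/zotero-mcp", "your_host", "zotero_mcp", "ollama", "your_ollama_host"
)
# Pretend the user already has one other (hypothetical) server configured.
existing = {"mcpServers": {"other": {"command": "/path/to/other"}}}
print(json.dumps(merge_into_config(existing, entry), indent=2))
```

The merge copies the dictionaries it touches, so the config you loaded stays unchanged until you decide to write the result back.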

Configuration

Database Configuration

Create ~/.config/zotero-mcp/config.json:

{
  "database": {
    "host": "localhost",
    "port": 5432,
    "database": "zotero_mcp",
    "username": "zotero_user",
    "password": "your_password",
    "schema": "public",
    "pool_size": 5
  },
  "embedding": {
    "provider": "ollama",
    "openai": {
      "api_key": "sk-...",
      "model": "text-embedding-3-small",
      "batch_size": 100
    },
    "ollama": {
      "host": "192.168.1.189:8182",
      "model": "nomic-embed-text",
      "timeout": 60
    }
  },
  "chunking": {
    "chunk_size": 1000,
    "overlap": 100,
    "min_chunk_size": 100,
    "max_chunks_per_item": 10,
    "chunking_strategy": "sentences"
  },
  "semantic_search": {
    "similarity_threshold": 0.7,
    "max_results": 50,
    "update_config": {
      "auto_update": false,
      "update_frequency": "manual",
      "batch_size": 50,
      "parallel_workers": 4
    }
  }
}
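The "chunking" settings are easier to reason about with a concrete example. The sketch below shows one plausible reading of them: sentences are packed greedily into chunks, the tail of each chunk is carried over as overlap, short fragments are dropped, and the number of chunks per item is capped. It assumes that all sizes are counted in characters and is not taken from the project's code, which may chunk differently.

```python
import re


def chunk_sentences(text, chunk_size=1000, overlap=100,
                    min_chunk_size=100, max_chunks_per_item=10):
    """Greedy sentence packing with character-based overlap (illustrative only)."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > chunk_size:
            chunks.append(current)
            # Carry the last `overlap` characters into the next chunk.
            tail = current[-overlap:] if overlap > 0 else ""
            current = f"{tail} {sentence}".strip()
        else:
            # Note: a single sentence longer than chunk_size is kept whole.
            current = candidate
    if current:
        chunks.append(current)
    # Drop fragments that are too short, then cap the number of chunks.
    chunks = [c for c in chunks if len(c) >= min_chunk_size]
    return chunks[:max_chunks_per_item]


sample = "This is one short sentence. " * 200
for i, chunk in enumerate(chunk_sentences(sample)):
    print(i, len(chunk))
```

With these defaults a long full text yields at most 10 chunks, so max_chunks_per_item also bounds the number of embedding calls per item.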

Available Tools

Core Zotero Tools

  • zotero_search_items - Search items by text query

  • zotero_search_by_tag - Search items by tags

  • zotero_get_item_metadata - Get item details and metadata

  • zotero_get_item_fulltext - Extract full text from attachments

  • zotero_get_collections - List all collections

  • zotero_get_collection_items - Get items in a collection

  • zotero_get_recent - Get recently added items

  • zotero_get_tags - List all tags

  • zotero_batch_update_tags - Bulk update tags

Semantic Search Tools

  • zotero_semantic_search - AI-powered semantic search

  • zotero_update_search_database - Update embedding database

  • zotero_get_search_database_status - Check database status

Advanced Tools

  • zotero_get_annotations - Extract annotations from PDFs

  • zotero_get_notes - Retrieve notes

  • zotero_search_notes - Search through notes

  • zotero_create_note - Create new notes

  • zotero_advanced_search - Complex multi-criteria search
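MCP clients invoke tools such as the ones listed above through JSON-RPC 2.0 requests with the method tools/call. The sketch below builds such a request by hand to show the shape of the message. The argument name "query" is an assumption made for illustration; the actual argument names of zotero_semantic_search are defined by the server's tool schema and are not documented in this README.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call_request(tool_name, arguments):
    """Build a JSON-RPC 2.0 "tools/call" request for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "query" is a hypothetical argument name, used here only for illustration.
request = tool_call_request("zotero_semantic_search", {"query": "graph neural networks"})
print(json.dumps(request, indent=2))
```

In practice Claude Desktop or another MCP client builds these messages for you; the sketch is only meant to show what travels between client and server.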

The semantic search uses PostgreSQL with pg-vector for efficient vector similarity search:
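To make "vector similarity search" concrete: pg-vector's cosine operator (<=>) returns the cosine distance, which is 1 minus the cosine similarity. The pure-Python sketch below performs the same ranking in memory. It assumes that similarity_threshold from the configuration is compared against cosine similarity, which this README does not state explicitly; the function names are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, items, similarity_threshold=0.7, max_results=50):
    """Rank (item_key, vector) pairs and keep those at or above the threshold."""
    scored = [(key, cosine_similarity(query_vec, vec)) for key, vec in items]
    hits = [(key, score) for key, score in scored if score >= similarity_threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:max_results]


# Toy 2-dimensional "embeddings" with made-up item keys.
items = [("ITEM_A", [1.0, 0.0]), ("ITEM_B", [0.0, 1.0]), ("ITEM_C", [1.0, 1.0])]
print(search([1.0, 0.0], items))
```

The database does the same thing at scale, using an index instead of comparing the query against every row.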

Database Population

# Initial database population
zotero-mcp update-db --force-rebuild

# Incremental updates
zotero-mcp update-db

# Update with limit (for testing)
zotero-mcp update-db --limit 100

# Check status
zotero-mcp status

Embedding Providers

OpenAI

{
  "embedding": {
    "provider": "openai",
    "openai": {
      "api_key": "sk-...",
      "model": "text-embedding-3-small",
      "batch_size": 100,
      "rate_limit_rpm": 3000
    }
  }
}
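The batch_size and rate_limit_rpm settings interact: texts are presumably sent in groups of batch_size, and requests have to be spaced so that no more than rate_limit_rpm are issued per minute. The sketch below only plans such a schedule and sends nothing. It is an assumption about how the two values could be used, not the project's actual scheduler, and the function names are made up.

```python
def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def min_seconds_between_requests(rate_limit_rpm):
    """Smallest spacing that keeps the request rate at or below the limit."""
    return 60.0 / rate_limit_rpm


def plan_requests(texts, batch_size=100, rate_limit_rpm=3000):
    """Return (start_offset_in_seconds, batch) pairs for a list of texts."""
    interval = min_seconds_between_requests(rate_limit_rpm)
    return [(i * interval, batch) for i, batch in enumerate(batched(texts, batch_size))]


texts = [f"text {i}" for i in range(250)]
for offset, batch in plan_requests(texts):
    print(f"t+{offset:.2f}s: {len(batch)} texts")
```

The batching helper is written by hand because the README requires Python 3.10+, and itertools.batched only exists from Python 3.12 on.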

Models Available:

  • text-embedding-3-small (1536 dimensions) - Fast and efficient

  • text-embedding-3-large (3072 dimensions) - Higher quality

  • text-embedding-ada-002 (1536 dimensions) - Legacy model
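The dimensions matter because a pg-vector column declared as vector(1536), as in the Database Schema section of this README, only accepts vectors of exactly that length. The sketch below checks a model against a column size using the dimensions listed above. The helper name fits_column is made up for illustration, and the project may well create the column with a different size depending on the configured model.

```python
# Dimensions as listed in this README.
OPENAI_EMBEDDING_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def fits_column(model, column_dimensions=1536):
    """Return True if the model's vectors match a vector(column_dimensions) column."""
    if model not in OPENAI_EMBEDDING_DIMENSIONS:
        raise ValueError(f"unknown model: {model}")
    return OPENAI_EMBEDDING_DIMENSIONS[model] == column_dimensions


for name in OPENAI_EMBEDDING_DIMENSIONS:
    print(name, "fits vector(1536):", fits_column(name))
```

Switching to a model with a different dimension therefore usually means rebuilding the embeddings table, for example with zotero-mcp update-db --force-rebuild.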

Ollama (Local)

{
  "embedding": {
    "provider": "ollama",
    "ollama": {
      "host": "http://localhost:11434",
      "model": "nomic-embed-text",
      "timeout": 60
    }
  }
}

Popular Models:

  • nomic-embed-text - Good general purpose embeddings

  • all-minilm - Lightweight and fast

  • mxbai-embed-large - High quality embeddings

To install Ollama models:

ollama pull nomic-embed-text
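To check that the configured Ollama host and model actually return embeddings, you can call Ollama's /api/embeddings endpoint directly. The sketch below uses only the standard library. The host examples in this README appear both with and without a scheme, so the helper adds http:// when it is missing; whether zotero-mcp itself does this is not known, and the function names are made up for the example.

```python
import json
import urllib.request


def normalize_host(host):
    """Prefix http:// if the configured host has no scheme; drop a trailing slash."""
    if not host.startswith(("http://", "https://")):
        host = "http://" + host
    return host.rstrip("/")


def build_embedding_request(host, model, text):
    """Build a POST request for Ollama's /api/embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        normalize_host(host) + "/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(host, model, text, timeout=60):
    """Send the request and return the embedding (requires a running Ollama)."""
    request = build_embedding_request(host, model, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["embedding"]


req = build_embedding_request("localhost:11434", "nomic-embed-text", "hello world")
print(req.get_method(), req.full_url)
```

Calling embed("http://localhost:11434", "nomic-embed-text", "hello world") against a running instance returns a list of floats, and its length tells you the model's embedding dimension.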

Architecture

Component Overview

┌─────────────────┐    ┌──────────────────────┐
│ Claude MCP      │───▶│ FastMCP Server       │
│ Client          │    │ (server.py)          │
└─────────────────┘    └──────────────────────┘
                                  │
                                  ▼
                       ┌──────────────────────┐
                       │ Semantic Search      │
                       │ (semantic_search.py) │
                       └──────────────────────┘
                                  │
                       ┌──────────┴──────────┐
                       ▼                     ▼
              ┌─────────────────┐   ┌─────────────────┐
              │ Vector Client   │   │ Embedding       │
              │ (vector_client) │   │ Service         │
              └─────────────────┘   │ (embedding_     │
                       │            │ service.py)     │
                       ▼            └─────────────────┘
              ┌─────────────────┐            │
              │ PostgreSQL      │            ▼
              │ + pgvector      │   ┌─────────────────┐
              └─────────────────┘   │ OpenAI/Ollama   │
                                    │ APIs            │
                                    └─────────────────┘

Database Schema

-- Core embeddings table
CREATE TABLE zotero_embeddings (
    id SERIAL PRIMARY KEY,
    item_key VARCHAR(50) UNIQUE NOT NULL,
    item_type VARCHAR(50) NOT NULL,
    title TEXT,
    content TEXT NOT NULL,
    content_hash VARCHAR(64) NOT NULL,
    embedding vector(1536),
    embedding_model VARCHAR(100) NOT NULL,
    embedding_provider VARCHAR(50) NOT NULL,
    metadata JSONB NOT NULL DEFAULT '{}',
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

-- Optimized indexes
CREATE INDEX idx_zotero_embedding_cosine ON zotero_embeddings
    USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
CREATE INDEX idx_zotero_metadata_gin ON zotero_embeddings
    USING gin(metadata);
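The content_hash column is 64 characters wide, which matches the length of a hex-encoded SHA-256 digest. A likely use is skipping items whose text has not changed during incremental updates. The sketch below illustrates that idea under this assumption; the hash function actually used by the project is not stated in this README, and the helper names and the item key are made up.

```python
import hashlib


def content_hash(content):
    """Hex-encoded SHA-256 digest of the text (always 64 characters)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def needs_reembedding(item_key, content, stored_hashes):
    """Compare against the hash saved for this item (item_key -> content_hash)."""
    return stored_hashes.get(item_key) != content_hash(content)


# In-memory stand-in for the item_key/content_hash columns; the key is made up.
stored = {"ABCD1234": content_hash("old abstract text")}
print(needs_reembedding("ABCD1234", "old abstract text", stored))
print(needs_reembedding("ABCD1234", "revised abstract text", stored))
```

Only items for which the check returns True need a new embedding call, which keeps incremental updates cheap.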

License

MIT License - see LICENSE file for details.
