delete_document

Remove documents from Paperlib MCP by deleting database entries and optionally associated PDF files to manage your academic literature collection.

Instructions

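Delete the specified document.

Deletes the document from the database together with all of its associated data (chunks, embeddings, ingest records, and so on). Optionally also deletes the PDF file stored in MinIO.

Args:
    doc_id: Unique identifier of the document.
    also_delete_object: Whether to also delete the PDF file in MinIO. Defaults to False.

Returns:
    The deletion result, including the number of deleted records.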
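As an illustration of how a client might invoke this tool, the sketch below builds the JSON-RPC `tools/call` request that MCP clients send. The helper name `build_delete_request` and the example `doc_id` value are hypothetical; only the tool name and its two arguments come from the description above.

```python
import json
from typing import Any


def build_delete_request(
    doc_id: str,
    also_delete_object: bool = False,
    request_id: int = 1,
) -> dict[str, Any]:
    """Build an MCP `tools/call` request for delete_document (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "delete_document",
            "arguments": {
                "doc_id": doc_id,
                "also_delete_object": also_delete_object,
            },
        },
    }


# "example-doc-id" is a placeholder, not a real identifier.
request = build_delete_request("example-doc-id", also_delete_object=True)
print(json.dumps(request, indent=2))
```

Leaving `also_delete_object` at its default removes only the database rows and keeps the PDF in MinIO, which is the safer choice when the file may still be needed.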

Input Schema

Name                Required  Description                                      Default
doc_id              Yes       Unique identifier of the document
also_delete_object  No        Whether to also delete the PDF file in MinIO     False
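The parameters above map onto a small JSON Schema. The sketch below is an assumption reconstructed from the function signature, not the schema the server actually publishes, and `validate_arguments` is a hypothetical helper that checks only what the table states: one required string and one optional boolean.

```python
from typing import Any

# Assumed schema, derived from the signature
# delete_document(doc_id: str, also_delete_object: bool = False).
INPUT_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "doc_id": {"type": "string"},
        "also_delete_object": {"type": "boolean", "default": False},
    },
    "required": ["doc_id"],
}

# Mapping from JSON Schema type names to Python types (only those used here).
_PYTHON_TYPES = {"string": str, "boolean": bool}


def validate_arguments(args: dict[str, Any]) -> dict[str, Any]:
    """Check required keys and types, reject unknown keys, fill in defaults."""
    properties = INPUT_SCHEMA["properties"]
    for name in INPUT_SCHEMA["required"]:
        if name not in args:
            raise ValueError(f"missing required argument: {name}")
    unknown = set(args) - set(properties)
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    validated: dict[str, Any] = {}
    for name, spec in properties.items():
        if name in args:
            if not isinstance(args[name], _PYTHON_TYPES[spec["type"]]):
                raise TypeError(f"{name} must be of type {spec['type']}")
            validated[name] = args[name]
        elif "default" in spec:
            validated[name] = spec["default"]
    return validated


print(validate_arguments({"doc_id": "example-doc-id"}))
```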

Implementation Reference

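  • The core handler function for the 'delete_document' tool. It looks up the document, counts the rows that are about to be removed, then issues explicit deletes on ingest_jobs and documents; per the code comment, chunks and chunk_embeddings are removed by cascade. If also_delete_object is set and the document has a pdf_key, it also deletes the PDF from MinIO storage using delete_object. It returns a dict with a success flag and the deletion counts; any exception is caught and returned as an error dict.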
    @mcp.tool()
    def delete_document(
        doc_id: str,
        also_delete_object: bool = False,
    ) -> dict[str, Any]:
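        """Delete the specified document.

        Deletes the document from the database together with all of its
        associated data (chunks, embeddings, ingest records, and so on).
        Optionally also deletes the PDF file stored in MinIO.

        Args:
            doc_id: Unique identifier of the document.
            also_delete_object: Whether to also delete the PDF file in MinIO. Defaults to False.

        Returns:
            The deletion result, including the number of deleted records.
        """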
        try:
            # Look up the document first
            doc = query_one(
                "SELECT pdf_key FROM documents WHERE doc_id = %s",
                (doc_id,)
            )
            
            if not doc:
                return {
                    "success": False,
                    "error": f"Document not found: {doc_id}",
                    "doc_id": doc_id,
                }
            
            pdf_key = doc["pdf_key"]
            
            # Count the rows that are about to be deleted
            stats = query_one(
                """
                SELECT 
                    (SELECT COUNT(*) FROM chunks WHERE doc_id = %s) as chunk_count,
                    (SELECT COUNT(*) FROM chunk_embeddings ce 
                     JOIN chunks c ON ce.chunk_id = c.chunk_id 
                     WHERE c.doc_id = %s) as embedding_count,
                    (SELECT COUNT(*) FROM ingest_jobs WHERE doc_id = %s) as job_count
                """,
                (doc_id, doc_id, doc_id)
            )
            
            # Delete the ingest job records
            execute("DELETE FROM ingest_jobs WHERE doc_id = %s", (doc_id,))
            
            # Delete the document (chunks and embeddings are removed by cascade)
            execute("DELETE FROM documents WHERE doc_id = %s", (doc_id,))
            
            result = {
                "success": True,
                "doc_id": doc_id,
                "deleted_chunks": stats["chunk_count"] if stats else 0,
                "deleted_embeddings": stats["embedding_count"] if stats else 0,
                "deleted_jobs": stats["job_count"] if stats else 0,
                "object_deleted": False,
            }
            
            # Optionally delete the MinIO object
            if also_delete_object and pdf_key:
                delete_result = delete_object(pdf_key)
                result["object_deleted"] = delete_result.get("deleted", False)
                result["pdf_key"] = pdf_key
            
            return result
            
        except Exception as e:
            return {
                "success": False,
                "error": str(e),
                "doc_id": doc_id,
            }
  • Registration of the fetch tools, including delete_document, by calling register_fetch_tools(mcp) in the main MCP server setup.
    from paperlib_mcp.tools.fetch import register_fetch_tools
    from paperlib_mcp.tools.writing import register_writing_tools
    
    # M2 GraphRAG tools
    from paperlib_mcp.tools.graph_extract import register_graph_extract_tools
    from paperlib_mcp.tools.graph_canonicalize import register_graph_canonicalize_tools
    from paperlib_mcp.tools.graph_community import register_graph_community_tools
    from paperlib_mcp.tools.graph_summarize import register_graph_summarize_tools
    from paperlib_mcp.tools.graph_maintenance import register_graph_maintenance_tools
    
    # M3 Review tools
    from paperlib_mcp.tools.review import register_review_tools
    
    # M4 Canonicalization & Grouping tools
    from paperlib_mcp.tools.graph_relation_canonicalize import register_graph_relation_canonicalize_tools
    from paperlib_mcp.tools.graph_claim_grouping import register_graph_claim_grouping_tools
    from paperlib_mcp.tools.graph_v12 import register_graph_v12_tools
    
    register_health_tools(mcp)
    register_import_tools(mcp)
    register_search_tools(mcp)
    register_fetch_tools(mcp)
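The handler above counts rows before deleting because a cascading delete leaves nothing to count afterwards. The sketch below reproduces that ordering against an in-memory SQLite database so it can be run without the real server. The schema is a hypothetical simplification, SQLite stands in for the project's actual database, and wrapping both deletes in one transaction is an addition of this sketch: the source does not show how its `execute` helper handles transactions.

```python
import sqlite3
from typing import Any


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite only cascades when this is on
    conn.executescript(
        """
        CREATE TABLE documents (doc_id TEXT PRIMARY KEY, pdf_key TEXT);
        CREATE TABLE chunks (
            chunk_id INTEGER PRIMARY KEY,
            doc_id TEXT REFERENCES documents(doc_id) ON DELETE CASCADE
        );
        CREATE TABLE chunk_embeddings (
            chunk_id INTEGER REFERENCES chunks(chunk_id) ON DELETE CASCADE
        );
        CREATE TABLE ingest_jobs (job_id INTEGER PRIMARY KEY, doc_id TEXT);
        INSERT INTO documents VALUES ('d1', 'papers/d1.pdf');
        INSERT INTO chunks VALUES (1, 'd1'), (2, 'd1');
        INSERT INTO chunk_embeddings VALUES (1), (2);
        INSERT INTO ingest_jobs VALUES (1, 'd1');
        """
    )
    return conn


def count(conn: sqlite3.Connection, sql: str, doc_id: str) -> int:
    """Run a single-value COUNT query for one document."""
    return conn.execute(sql, (doc_id,)).fetchone()[0]


def delete_document_sketch(conn: sqlite3.Connection, doc_id: str) -> dict[str, Any]:
    """Mirror the handler's order: look up, count, then delete."""
    doc = conn.execute(
        "SELECT pdf_key FROM documents WHERE doc_id = ?", (doc_id,)
    ).fetchone()
    if doc is None:
        return {"success": False, "error": f"Document not found: {doc_id}", "doc_id": doc_id}

    # Count first: once the cascade has run, these rows no longer exist.
    chunks = count(conn, "SELECT COUNT(*) FROM chunks WHERE doc_id = ?", doc_id)
    embeddings = count(
        conn,
        "SELECT COUNT(*) FROM chunk_embeddings ce "
        "JOIN chunks c ON ce.chunk_id = c.chunk_id WHERE c.doc_id = ?",
        doc_id,
    )
    jobs = count(conn, "SELECT COUNT(*) FROM ingest_jobs WHERE doc_id = ?", doc_id)

    with conn:  # one transaction: both deletes commit or roll back together
        conn.execute("DELETE FROM ingest_jobs WHERE doc_id = ?", (doc_id,))
        conn.execute("DELETE FROM documents WHERE doc_id = ?", (doc_id,))

    return {
        "success": True,
        "doc_id": doc_id,
        "deleted_chunks": chunks,
        "deleted_embeddings": embeddings,
        "deleted_jobs": jobs,
    }


demo = make_demo_db()
print(delete_document_sketch(demo, "d1"))
```

Note that in the original handler the MinIO object is removed only after the database deletes, so a failed object deletion leaves `success` as True with `object_deleted` set to False; callers that pass `also_delete_object` may want to check both fields.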

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/h-lu/paperlib-mcp'
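The same request can be made from Python with the standard library. This is a minimal sketch: the helper names `server_url` and `fetch_server_info` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by the text above.

```python
import json
import urllib.request
from typing import Any
from urllib.parse import quote

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory URL for one server, escaping each path segment."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_server_info(owner: str, name: str, timeout: float = 10.0) -> Any:
    """GET the server entry and decode it, assuming the response is JSON."""
    request = urllib.request.Request(server_url(owner, name), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the URL needs no network access; call
# fetch_server_info("h-lu", "paperlib-mcp") to perform the actual request.
print(server_url("h-lu", "paperlib-mcp"))
```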

If you have feedback or need assistance with the MCP directory API, please join our Discord server.