get_chunk
Retrieve complete text content and metadata for a specific text chunk by providing its unique identifier. Get full text, page ranges, and document information to access segmented academic content.
Instructions
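Get the full content of a specified chunk.
Retrieves the complete information for a text chunk by chunk_id, including the full text, page numbers, and the document it belongs to.
Args:
- chunk_id: unique identifier of the chunk

Returns: the chunk's details, including:
- chunk_id: chunk ID
- doc_id: ID of the document the chunk belongs to
- text: full text
- page_start/page_end: page range
- has_embedding: whether the chunk has an embedding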
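As a rough illustration of consuming this return value, the sketch below summarizes a result dict. It assumes the two shapes produced by the handler listed under Implementation Reference: a detail dict on success, or a dict with an `error` key on failure. The helper name `describe_chunk` and both sample payloads are hypothetical and exist only for this example.

```python
from typing import Any


def describe_chunk(result: dict[str, Any]) -> str:
    """Summarize a get_chunk result, checking the error shape first."""
    # The handler reports failures as a dict with an "error" key
    # instead of raising, so callers have to check for it explicitly.
    if "error" in result:
        return f"chunk {result['chunk_id']}: failed ({result['error']})"
    pages = f"pp. {result['page_start']}-{result['page_end']}"
    embedded = "embedded" if result["has_embedding"] else "not embedded"
    return f"chunk {result['chunk_id']} of {result['doc_id']}, {pages}, {embedded}"


# Hypothetical payloads for illustration only; not real library data.
ok = {
    "chunk_id": 42,
    "doc_id": "doc-abc",
    "text": "Sample chunk text.",
    "page_start": 3,
    "page_end": 4,
    "has_embedding": True,
}
missing = {"error": "Chunk not found: 999", "chunk_id": 999}

print(describe_chunk(ok))
print(describe_chunk(missing))
```

Checking for `error` before reading any other field matters because the failure dict carries only `error` and `chunk_id`.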
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| chunk_id | Yes | | |
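To make the schema concrete, here is a minimal sketch of the JSON-RPC message an MCP client would send to call this tool. The `tools/call` method and the `name`/`arguments` parameters follow the MCP specification; the helper name `build_get_chunk_request` and the example ID `42` are assumptions for illustration. The integer check mirrors the `chunk_id: int` annotation on the handler.

```python
import json
from typing import Any


def build_get_chunk_request(chunk_id: int, request_id: int = 1) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 'tools/call' message for the get_chunk tool."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(chunk_id, bool) or not isinstance(chunk_id, int):
        raise TypeError("chunk_id must be an integer")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get_chunk", "arguments": {"chunk_id": chunk_id}},
    }


# Hypothetical chunk ID, chosen only for the example.
print(json.dumps(build_get_chunk_request(42), indent=2))
```

In practice an MCP client library normally builds this envelope for you; the sketch only shows where the single required argument ends up.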
Implementation Reference
- src/paperlib_mcp/tools/fetch.py:55-116 (handler) The handler function for the 'get_chunk' tool, decorated with @mcp.tool() for automatic registration. It retrieves the chunk's details from the database with a SQL query and returns a structured dictionary built from the ChunkDetail Pydantic model. Lookup misses and exceptions are returned as a dictionary with an "error" key rather than raised.

  ```python
  @mcp.tool()
  def get_chunk(chunk_id: int) -> dict[str, Any]:
      """Get the full content of a specified chunk.

      Retrieves the complete information for a text chunk by chunk_id,
      including the full text, page numbers, and the document it belongs to.

      Args:
          chunk_id: unique identifier of the chunk

      Returns:
          The chunk's details, including:
          - chunk_id: chunk ID
          - doc_id: ID of the document the chunk belongs to
          - text: full text
          - page_start/page_end: page range
          - has_embedding: whether the chunk has an embedding
      """
      try:
          # Look up the chunk; the LEFT JOIN tells us whether an embedding row exists
          chunk = query_one(
              """
              SELECT
                  c.chunk_id,
                  c.doc_id,
                  c.chunk_index,
                  c.section,
                  c.page_start,
                  c.page_end,
                  c.text,
                  c.token_count,
                  CASE WHEN ce.chunk_id IS NOT NULL THEN true ELSE false END as has_embedding
              FROM chunks c
              LEFT JOIN chunk_embeddings ce ON c.chunk_id = ce.chunk_id
              WHERE c.chunk_id = %s
              """,
              (chunk_id,)
          )

          if not chunk:
              return {
                  "error": f"Chunk not found: {chunk_id}",
                  "chunk_id": chunk_id,
              }

          return ChunkDetail(
              chunk_id=chunk["chunk_id"],
              doc_id=chunk["doc_id"],
              chunk_index=chunk["chunk_index"],
              section=chunk["section"],
              page_start=chunk["page_start"],
              page_end=chunk["page_end"],
              text=chunk["text"],
              token_count=chunk["token_count"],
              has_embedding=chunk["has_embedding"],
          ).model_dump()

      except Exception as e:
          return {
              "error": str(e),
              "chunk_id": chunk_id,
          }
  ```
- Pydantic BaseModel defining the output schema for the get_chunk tool response.

  ```python
  class ChunkDetail(BaseModel):
      """Chunk details"""
      chunk_id: int
      doc_id: str
      chunk_index: int
      section: str | None
      page_start: int
      page_end: int
      text: str
      token_count: int | None
      has_embedding: bool
  ```
- src/paperlib_mcp/server.py:36-36 (registration) Invocation of register_fetch_tools(mcp) in the main MCP server setup, which registers the get_chunk tool (and the other fetch tools).

  ```python
  register_fetch_tools(mcp)
  ```
- src/paperlib_mcp/tools/fetch.py:52-52 (registration) The registration function that defines and registers the get_chunk tool using the @mcp.tool() decorator.

  ```python
  def register_fetch_tools(mcp: FastMCP) -> None:
  ```
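The handler's query derives `has_embedding` from a LEFT JOIN: a chunk with no matching row in `chunk_embeddings` still comes back, with the flag set to false. The sketch below reproduces that pattern in a self-contained form. It is not the project's code. It swaps in Python's built-in `sqlite3` with an in-memory database, uses `?` placeholders instead of `%s` and `1`/`0` instead of `true`/`false`, and trims the tables to a few columns. The names `make_demo_db` and `get_chunk_demo` and all sample rows are hypothetical.

```python
import sqlite3
from typing import Any

# Same join shape as the handler's query, adapted to SQLite syntax.
QUERY = """
SELECT c.chunk_id, c.doc_id, c.page_start, c.page_end, c.text,
       CASE WHEN ce.chunk_id IS NOT NULL THEN 1 ELSE 0 END AS has_embedding
FROM chunks c
LEFT JOIN chunk_embeddings ce ON c.chunk_id = ce.chunk_id
WHERE c.chunk_id = ?
"""


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two chunks, one of them embedded."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # lets rows be converted to dicts by column name
    conn.executescript(
        """
        CREATE TABLE chunks (
            chunk_id INTEGER PRIMARY KEY, doc_id TEXT,
            page_start INTEGER, page_end INTEGER, text TEXT
        );
        CREATE TABLE chunk_embeddings (chunk_id INTEGER PRIMARY KEY);
        INSERT INTO chunks VALUES (1, 'doc-a', 1, 2, 'first chunk');
        INSERT INTO chunks VALUES (2, 'doc-a', 2, 3, 'second chunk');
        INSERT INTO chunk_embeddings VALUES (1);
        """
    )
    return conn


def get_chunk_demo(conn: sqlite3.Connection, chunk_id: int) -> dict[str, Any]:
    """Return chunk details, or an error dict when the ID is unknown."""
    row = conn.execute(QUERY, (chunk_id,)).fetchone()
    if row is None:
        return {"error": f"Chunk not found: {chunk_id}", "chunk_id": chunk_id}
    result = dict(row)
    # SQLite returns 1/0 here; convert to a real bool like the original field.
    result["has_embedding"] = bool(result["has_embedding"])
    return result


conn = make_demo_db()
print(get_chunk_demo(conn, 1))  # chunk with an embedding row
print(get_chunk_demo(conn, 2))  # chunk without an embedding row
print(get_chunk_demo(conn, 3))  # unknown ID, returns the error shape
```

An INNER JOIN here would silently drop chunks that have not been embedded yet, which is presumably why the handler uses a LEFT JOIN and reports the flag instead.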