asset-aware-mcp

by u9401066

Server Configuration

Describes the environment variables required to run the server.

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| DATA_DIR | No | Data directory for documents, assets, and cache. | ./data |
| LLM_BACKEND | No | LLM backend, e.g., 'openrouter'. | |
| UV_CACHE_DIR | No | uv cache directory, defaults to DATA_DIR/.uv-cache. | |
| ENABLE_LIGHTRAG | No | Enable LightRAG knowledge graph backend (boolean as string). | false |
| OPENROUTER_MODEL | No | Model identifier for OpenRouter, e.g., 'liquid/lfm-2.5-1.2b-instruct:free'. | |
| OPENROUTER_API_KEY | No | API key for OpenRouter if LLM_BACKEND is 'openrouter'. | |
| ASSET_AWARE_MCP_TOOL_SURFACE | No | Tool surface mode: 'compact', 'legacy', or default 'balanced'. | balanced |
| ASSET_AWARE_MCP_ENABLE_LEGACY_TOOLS | No | Enable legacy tools (boolean as string). | false |
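
The variables above could be read roughly as follows. This is a minimal sketch, not the server's actual loader: the `ServerSettings` class, the `load_settings` function, and the accepted spellings for "boolean as string" values are all assumptions made for illustration. Only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


def _as_bool(value: Optional[str], default: bool = False) -> bool:
    """Interpret a 'boolean as string' value; the accepted spellings are an assumption."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class ServerSettings:
    """Hypothetical container for the documented environment variables."""
    data_dir: Path
    uv_cache_dir: Path
    llm_backend: Optional[str]
    enable_lightrag: bool
    tool_surface: str
    enable_legacy_tools: bool


def load_settings(env: Mapping[str, str] = os.environ) -> ServerSettings:
    """Apply the documented defaults to whatever is set in the environment."""
    data_dir = Path(env.get("DATA_DIR", "./data"))
    uv_cache = env.get("UV_CACHE_DIR")
    return ServerSettings(
        data_dir=data_dir,
        # UV_CACHE_DIR falls back to DATA_DIR/.uv-cache, as the table states.
        uv_cache_dir=Path(uv_cache) if uv_cache else data_dir / ".uv-cache",
        llm_backend=env.get("LLM_BACKEND"),
        enable_lightrag=_as_bool(env.get("ENABLE_LIGHTRAG")),
        tool_surface=env.get("ASSET_AWARE_MCP_TOOL_SURFACE", "balanced"),
        enable_legacy_tools=_as_bool(env.get("ASSET_AWARE_MCP_ENABLE_LEGACY_TOOLS")),
    )
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check without touching the real environment.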

Capabilities

Features and capabilities supported by this server

| Capability | Details |
| --- | --- |
| tools | {"listChanged": false} |
| prompts | {"listChanged": false} |
| resources | {"subscribe": false, "listChanged": false} |
| experimental | {} |

Tools

Functions exposed to the LLM to take actions

Name | Description

parse_pdf_structure

Create a background Marker parse job for a PDF.

find_evidence_spans

Search citation-ready evidence spans with exact locator metadata.

Returns span-level AssetRef JSON that can be passed to table_cite.

verify_citation_ref

Verify a span-level AssetRef against the persisted citation index.

citation_bundle

Export citation-ready evidence spans as a verified bundle.

Each entry carries AssetRef, exact quote/hash, locator metadata, context, conservative CRAAP scaffold, optional verification status, and Foam anchor metadata. Use output_format="foam" for a Foam-compatible evidence pack. Pass wiki_root to write the pack and optionally update an index note.

ingest_documents

Process PDF files and create Document Manifests.

ETL Pipeline:

  1. Extract text (to markdown) and images

  2. Generate structured Document Manifest

  3. Index in LightRAG (if enabled)

Args:
  file_paths: List of absolute paths to PDF files
  async_mode: Kept for backwards compatibility. PDF ingestion is routed to a background job from the MCP tool layer to keep stdio clients responsive.
  use_marker: If True, use Marker for structured parsing (slower but more accurate). Produces blocks.json with bbox/coordinates for precise source tracking. Default False uses PyMuPDF (faster but less structured).
  marker_max_pages_per_chunk: When using Marker, split PDFs into fixed-size page chunks. Set 0 to use the safe automatic strategy.
  extract_figures: When using Marker, control whether image crops are extracted and saved. Disable this first for image-heavy textbooks to reduce memory pressure.
  page_ranges: 1-indexed inclusive page ranges applied to every input file, e.g. ["1-50", "120-160"].

Returns: Job ID for tracking progress with get_job_status.

Example:
  # Async (recommended for large files):
  ingest_documents(["/papers/study1.pdf"])
  # Then check status:
  get_job_status("job_xxx")

  # With Marker for precise source tracking:
  ingest_documents(["/papers/textbook.pdf"], use_marker=True)
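
The page_ranges format described for ingest_documents ("1-indexed inclusive") can be illustrated with a small parser. This is only a sketch of how such strings might be interpreted; the function names are hypothetical, and treating a bare number such as "7" as a single page is an assumption the tool description does not confirm.

```python
from typing import List, Tuple


def parse_page_ranges(page_ranges: List[str]) -> List[Tuple[int, int]]:
    """Parse strings like "1-50" into (first, last) pairs, 1-indexed and inclusive."""
    parsed = []
    for item in page_ranges:
        start_text, sep, end_text = item.partition("-")
        start = int(start_text)
        # Assumption: a bare "7" means the single page 7.
        end = int(end_text) if sep else start
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {item!r}")
        parsed.append((start, end))
    return parsed


def to_zero_based_pages(page_ranges: List[str]) -> List[int]:
    """Expand the ranges into sorted, de-duplicated 0-based page indices."""
    pages = set()
    for start, end in parse_page_ranges(page_ranges):
        # Page N (1-indexed) is index N - 1; the inclusive end maps to an exclusive stop.
        pages.update(range(start - 1, end))
    return sorted(pages)
```

Converting to 0-based indices is shown because PDF libraries commonly index pages from zero, while the tool's argument counts from one.
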
list_documents

List all processed documents with summaries.

Returns: List of documents with doc_id, title, and asset counts

fetch_document_asset

Fetch specific content from a document with precision.

Asset Types:

  • "table": Returns table as markdown (with page number)

  • "figure": Returns image as base64 with page number for verification

  • "section": Returns section text content

  • "full_text": Returns entire document as markdown

Args:
  doc_id: Document identifier
  asset_type: One of "table", "figure", "section", "full_text"
  asset_id: Asset ID from manifest (e.g., "tab_1", "fig_1_1", "sec_methods"). Use "full" for full_text type.
  max_size: Maximum image dimension (longest edge) for figures.
    - None (default): Use default 1024px
    - 0: Return original size (no resize)
    - N: Resize to Npx longest edge (e.g., 512, 768, 2048)

Returns:
  For figures: ImageContent that vision AI can directly analyze
  For others: TextContent in markdown format

Example:
  # Get Table 1 from document
  fetch_document_asset("abc123", "table", "tab_1")

  # Get figure with default resize (1024px)
  fetch_document_asset("abc123", "figure", "fig_2_1")

  # Get figure at specific size (512px for smaller context)
  fetch_document_asset("abc123", "figure", "fig_2_1", max_size=512)

  # Get original image (no resize)
  fetch_document_asset("abc123", "figure", "fig_2_1", max_size=0)
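
The three max_size cases (None, 0, N) can be sketched as a pure function that computes the output dimensions. The `target_size` helper is hypothetical, and the choice not to upscale images that are already within the limit is an assumption; the tool description does not say what happens in that case.

```python
from typing import Optional, Tuple

DEFAULT_MAX_SIZE = 1024  # default longest edge, as stated in the tool description


def target_size(width: int, height: int, max_size: Optional[int] = None) -> Tuple[int, int]:
    """Return the (width, height) an image would have under the max_size rules."""
    if max_size == 0:
        # 0 means "original size, no resize".
        return width, height
    limit = DEFAULT_MAX_SIZE if max_size is None else max_size
    longest = max(width, height)
    # Assumption: images already within the limit are left untouched (no upscaling).
    if longest <= limit:
        return width, height
    scale = limit / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

Scaling both edges by the same factor keeps the aspect ratio while bringing the longest edge down to the limit.
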
document

Consolidated PDF document entrypoint.

Existing document tools stay registered and keep their original contracts.

document_asset

Consolidated document asset and section entrypoint.

evidence

Consolidated citation evidence entrypoint.

convert_document

Consolidated document conversion entrypoint.

Dispatches to the existing conversion tools so each source family keeps its established output-path containment policy.

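ingest_docx

Ingest a .docx / .doc file and convert it to DFM (Docx-Flavored Markdown) format.

Parses the docx into an intermediate representation (IR), then converts it to DFM, which can be edited in VS Code. Supports complex elements: tables with merged cells, charts, headers and footers, macros, tables of contents, and more. Supports the legacy .doc format (converted to .docx automatically via LibreOffice).

Output directory structure:

data/{doc_id}/
├── content.dfm     # Editable Markdown + YAML annotations
├── ir.json          # IR snapshot (used for write-back)
├── original.docx    # Backup of the original file
├── parts/           # Preserved XML parts
└── assets/          # Images and binary assets

Args: file_path: Absolute path to the .docx or .doc file

Returns: Summary of the ingestion result (doc_id, block count, etc.)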

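get_docx_content

Get the editable DFM content of a docx document.

If block_id is given, only that block's content is returned; otherwise the full DFM is returned.

Args:
  doc_id: Document ID (produced by ingest_docx)
  block_id: Optional ID of a specific block (e.g. p001, t001, h001)

Returns: DFM content, or information about the specific block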

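save_docx

Save edited content back to a .docx file.

Two modes are supported:

  • DFM mode (default): pass dfm_content (the full text in .dfm format)

  • MD mode (from_md=True): read content.md + format.yaml from disk

Write-back flow:

  1. Parse the DFM/MD → extract the modifications

  2. Load the original IR

  3. Merge the modifications (format-merge strategy)

  4. Rebuild the .docx

Safety mechanism: if the content shrinks by more than 50%, output is refused by default (suspected data loss). Use force=True to force output.

If track_changes=True, text modifications in the DFM are written back as real Word Track Changes (w:del/w:ins) so the user can review them one by one in Word.

Args:
  doc_id: Document ID
  dfm_content: Full edited DFM text (may be omitted when from_md=True)
  output_path: Output path (defaults to data/{doc_id}/output.docx)
  from_md: If True, read content.md + format.yaml from disk instead of using dfm_content
  force: If True, force output even when severe content shrinkage is detected
  track_changes: If True, write the text diff as Word Track Changes
  revision_author: Author name to use for the generated tracked revisions

Returns: Save result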
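
The save_docx safety check (refuse output when content shrinks by more than 50% unless force=True) might look something like the sketch below. The function names are hypothetical, and measuring shrinkage by character count is an assumption: the description does not say which unit the server actually compares.

```python
def shrink_ratio(original_text: str, edited_text: str) -> float:
    """Fraction of the original length that was lost (0.0 when nothing shrank)."""
    if not original_text:
        return 0.0
    lost = len(original_text) - len(edited_text)
    return max(0.0, lost / len(original_text))


def check_shrinkage(
    original_text: str,
    edited_text: str,
    force: bool = False,
    threshold: float = 0.5,  # the 50% limit from the tool description
) -> None:
    """Raise if the edit looks like data loss, unless the caller forces the write."""
    ratio = shrink_ratio(original_text, edited_text)
    # Assumption: shrinkage is measured in characters.
    if ratio > threshold and not force:
        raise ValueError(
            f"content shrank by {ratio:.0%}; pass force=True to write anyway"
        )
```

A guard like this is cheap insurance against an agent accidentally sending back a truncated document.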

docx

Consolidated DOCX/DFM entrypoint.

The legacy DOCX tools remain registered and keep their original parameters and output formats. This wrapper only adds an operation-based facade.

docx_table_edit_plan

Plan a DOCX table write-back before applying structural changes.

The current DFM bridge is safest for same-shape cell text updates. This plan separates safe cell updates from row/column/header structural changes so the caller can review risk before docx_table_from_context.

docx_table

Consolidated DOCX table bridge entrypoint.

Existing docx_table_* tools remain available for clients that rely on their names or generated allow-lists.

get_job_status

Get the status of an ETL job.

Use this to check progress of document ingestion started with ingest_documents.

Args: job_id: Job ID returned from ingest_documents

Returns: Job status including progress, phase, and result (if completed)

Example: get_job_status("job_20251226_143000_abc12345")
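
Because ingestion runs as a background job, a client typically polls get_job_status until the job finishes. The sketch below shows one way to do that. It assumes a caller-supplied `get_status` callable that wraps the MCP tool call, and it assumes the returned dict has a "status" key whose terminal values include "completed", "failed", and "cancelled"; none of these field names or values are confirmed by the tool description.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    get_status: Callable[[str], Dict[str, Any]],  # hypothetical wrapper around get_job_status
    job_id: str,
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a job until it reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        # Assumption: the "status" key and these terminal values are illustrative only.
        if status.get("status") in {"completed", "failed", "cancelled"}:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout} seconds")
        sleep(interval)
```

Injecting `get_status` and `sleep` keeps the loop independent of any particular MCP client library and easy to exercise with a fake.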

list_jobs

List ETL jobs.

Args: active_only: If True, only show pending/processing jobs

Returns: List of jobs with status and progress

job

Consolidated job entrypoint over get/list/cancel.

Existing job tools stay registered for backwards compatibility.

knowledge

Consolidated knowledge-graph entrypoint.

Existing consult/export tools remain registered and keep their separate contracts for clients that prefer explicit tool names.

etl_profile

Consolidated ETL profile entrypoint.

Existing profile tools stay registered for backwards compatibility.

section

Consolidated section navigation entrypoint.

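plan_table

📋 Table planning tool: schema design, template lookup, and creating tables from templates.

Operations:

  • schema: Automatically plan a table structure from a question

  • templates: List the built-in table templates

  • from_template: Quickly create a table from a template

Args:
  operation: Operation type
  question: [schema] User question
  doc_ids: [schema] Related document IDs
  hints: [schema] Structure hints
  template_name: [from_template] Template name
  title_override: [from_template] Custom title

Examples:
  plan_table("schema", question="Compare the side effects of three drugs")
  plan_table("templates")
  plan_table("from_template", template_name="drug_comparison")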

table_manage

📊 Table management tool: create, delete, list, preview, render, and schema evolution.

Operations:

  • create: Create a new table

  • delete: Delete a table

  • list: List all tables

  • preview: Markdown preview

  • resume: Resume work (token-efficient)

  • render: Render to Excel/Markdown

  • add_column: Add a column

  • remove_column: Remove a column

  • rename_column: Rename a column

Args:
  operation: Operation type
  intent: [create] comparison / citation / summary
  title: [create] Table title
  columns: [create] Column list [{"name":"Drug","type":"text"}]
  source_description: [create] Data source
  table_id: [most operations] Table ID
  limit: [preview] Number of rows to preview
  format: [render] Output format
  filename: [render] File name
  column_name: [add/remove/rename_column] Column name
  column_type: [add_column] Column type
  required: [add_column] Whether the column is required
  default_value: [add_column] Default value
  enum_values: [add_column] Allowed enum values
  new_name: [rename_column] New column name

Examples:
  table_manage("create", intent="comparison", title="Drug Compare", columns=[{"name":"Drug","type":"text"}])
  table_manage("list")
  table_manage("preview", table_id="tbl_xxx")
  table_manage("add_column", table_id="tbl_xxx", column_name="Route", column_type="enum", enum_values=["IV","IM"])

table_data

📝 Table data operations: add, get, update, and delete rows and cells.

Operations:

  • add_rows: Add rows in batch

  • get_row: Get a single row (with citations)

  • update_row: Update an entire row

  • delete_row: Delete a row

  • get_cell: Get a single cell (with citations)

  • update_cell: Update a single cell

  • clear_cell: Clear a single cell

Args:
  operation: Operation type
  table_id: Table ID
  rows: [add_rows] List of rows
  row: [update_row] New row data
  row_index: [get/update/delete_row, cell ops] Row index (0-based)
  column_name: [cell ops] Column name
  value: [update_cell] New value

Examples:
  table_data("add_rows", "tbl_xxx", rows=[{"Drug":"A","Dose":1}])
  table_data("get_row", "tbl_xxx", row_index=0)
  table_data("update_cell", "tbl_xxx", row_index=0, column_name="Drug", value="B")

table_cite

📎 Table citation management: attach, query, and remove source citations for cells.

Citations are a "parallel overlay layer" and do not change the table's data structure. Each citation uses an AssetRef to point to a specific source (PDF section, URL, user input, etc.).

Operations:

  • add: Add citations to a cell

  • get: Query citations (cell / row / table)

  • remove: Remove citations

  • cell_history: View a cell's change history

Args:
  operation: Operation type
  table_id: Table ID
  row_index: Row index (0-based); may be omitted with get to fetch citations for the whole table
  column_name: Column name; may be omitted with get to fetch citations for the whole row
  refs: [add] List of citations, each an AssetRef dict
    [{"source_type":"section","doc_id":"doc_xxx","asset_id":"sec_01","excerpt":"..."}]
    [{"source_type":"external","url":"https://doi.org/...","label":"Smith 2024"}]
    [{"source_type":"user_input","excerpt":"Patient reported"}]
  confidence: [add] Agent confidence, 0.0 to 1.0
  notes: [add] Notes
  ref_index: [remove] Index of the specific citation to remove (if omitted, all of the cell's citations are removed)

Examples:
  table_cite("add", "tbl_xxx", row_index=0, column_name="Drug", refs=[{"source_type":"section","doc_id":"doc_a","asset_id":"sec_01", "excerpt":"dose was 5mg"}], confidence=0.9)
  table_cite("get", "tbl_xxx")  # citations for the whole table
  table_cite("get", "tbl_xxx", row_index=0)  # citations for the whole row
  table_cite("cell_history", "tbl_xxx", row_index=0, column_name="Drug")
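
Before sending refs to table_cite, a client could sanity-check each AssetRef dict. The sketch below is illustrative only: the `validate_asset_ref` helper is hypothetical, and the required keys per source_type are inferred solely from the three example dicts in the tool description. The server may accept other source types or fields.

```python
from typing import Any, Dict, List

# Assumption: required keys per source_type, inferred only from the documented examples.
REQUIRED_KEYS = {
    "section": ("doc_id", "asset_id"),
    "external": ("url",),
    "user_input": ("excerpt",),
}


def validate_asset_ref(ref: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the ref looks well-formed."""
    source_type = ref.get("source_type")
    if source_type not in REQUIRED_KEYS:
        return [f"unknown source_type: {source_type!r}"]
    # Report every required key that is absent or empty.
    return [
        f"missing key: {key}"
        for key in REQUIRED_KEYS[source_type]
        if not ref.get(key)
    ]
```

Returning a list of problems rather than raising lets a caller collect issues across a whole batch of refs before deciding whether to call the tool.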

table_history

📜 Table history and statistics: change log and token estimation.

Operations:

  • changes: View the table's change history

  • tokens: Estimate token consumption

Args:
  operation: Operation type
  table_id: Table ID [changes, tokens]
  limit: [changes] Most recent N entries
  draft_id: [tokens] Draft ID (optional)
  text: [tokens] Arbitrary text (optional)

Examples:
  table_history("changes", "tbl_xxx")
  table_history("tokens", "tbl_xxx")

table_draft

📝 Draft workflow: create, update, add data, resume, and commit.

Drafts are saved automatically and can be resumed even if the conversation is interrupted. They are well suited to long-running table-building workflows.

Operations:

  • create: Create a draft

  • update: Update a draft

  • add_rows: Add rows to a draft in batch

  • resume: Resume a draft (token-efficient)

  • commit: Convert a draft into a final table

  • list: List drafts

  • delete: Delete a draft

Args:
  operation: Operation type
  draft_id: [most operations] Draft ID
  title: [create/update] Title
  intent: [create/update] comparison / citation / summary
  proposed_columns: [create/update] Column definitions
  extraction_plan: [create/update] Extraction plan
  source_doc_ids: [create/update] Source document IDs
  source_sections: [create/update] Source section IDs
  notes: [create/update] Working notes
  rows: [add_rows] Rows

Examples:
  table_draft("create", title="Drug Comparison", intent="comparison", proposed_columns=[{"name":"Drug","type":"text"}])
  table_draft("add_rows", draft_id="draft_xxx", rows=[{"Drug":"A"}])
  table_draft("commit", draft_id="draft_xxx")

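discover_sources

🔍 Data source discovery: search across documents for data sources that can be used in tables.

Integrates multiple stores (Section, Figure, Table, and Knowledge Graph) and returns results in a unified AssetRef format that can be passed directly to table_cite.

Args:
  query: Search keywords
  doc_ids: Documents to restrict the search to (all documents are searched if omitted)
  include_kg: Whether to include knowledge graph search
  limit: Maximum number of results per source type

Returns: Discovered data sources (in AssetRef format)

Example:
  discover_sources("remimazolam dosing")
  discover_sources("drug comparison", doc_ids=["doc_abc"])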

Prompts

Interactive templates invoked by user choice

Name | Description

No prompts

Resources

Contextual data attached and managed by the client

| Name | Description |
| --- | --- |
| resource_document_list | Dynamic resource listing all processed documents. |
| resource_knowledge_graph_summary | Dynamic resource showing knowledge graph statistics. Provides an overview of the indexed knowledge, including total nodes and edges, entity type distribution, and sample entities and relationships. |
| resource_table_list | Dynamic resource listing all A2T tables. |
| resource_draft_list | Dynamic resource listing all A2T drafts. |

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/u9401066/asset-aware-mcp'
