MCP Docling Server

An MCP server that provides document processing capabilities using the Docling library.

Installation

You can install the package with pip:

    pip install mcp-server-lls

Usage

Start the server using either the stdio (default) or the SSE transport:

    # Using stdio transport (default)
    mcp-server-lls

    # Using SSE transport on custom port
    mcp-server-lls --transport sse --port 8000

If you use uv, you can run the server directly without installing it:

    # Using stdio transport (default)
    uv run mcp-server-lls

    # Using SSE transport on custom port
    uv run mcp-server-lls --transport sse --port 8000
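
The launch options above can also be assembled programmatically, for example when a script starts the server as a subprocess. The following is a minimal sketch, not part of the package: the helper names `server_command` and `sse_endpoint` are hypothetical, and the `/sse` path is an assumption taken from the endpoint used in the Llama Stack example in this README.

```python
from typing import List, Optional


def server_command(
    transport: str = "stdio",
    port: Optional[int] = None,
    use_uv: bool = False,
) -> List[str]:
    """Build the argv list for launching mcp-server-lls (hypothetical helper)."""
    if transport not in ("stdio", "sse"):
        raise ValueError(f"unsupported transport: {transport}")
    # With uv the server runs without a prior install.
    argv = ["uv", "run", "mcp-server-lls"] if use_uv else ["mcp-server-lls"]
    if transport == "sse":
        argv += ["--transport", "sse"]
        if port is not None:
            argv += ["--port", str(port)]
    return argv


def sse_endpoint(host: str, port: int) -> str:
    """Return the SSE URL a client would connect to (assumes the /sse path)."""
    return f"http://{host}:{port}/sse"


if __name__ == "__main__":
    print(" ".join(server_command("sse", port=8000, use_uv=True)))
    print(sse_endpoint("localhost", 8000))
```

The returned list can be passed directly to `subprocess.Popen`, which avoids shell quoting issues.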

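Available Tools

The server exposes the following tools:

convert_document: Converts a document from a URL or local path to Markdown.
- source: URL or local file path of the document (required)
- enable_ocr: Whether to enable OCR for scanned documents (optional, default: false)
- ocr_language: List of language codes for OCR, e.g. ["en", "fr"] (optional)
convert_document_with_images: Converts a document and extracts its embedded images.
- source: URL or local file path of the document (required)
- enable_ocr: Whether to enable OCR for scanned documents (optional, default: false)
- ocr_language: List of language codes for OCR (optional)
extract_tables: Extracts tables from a document as structured data.
- source: URL or local file path of the document (required)
convert_batch: Processes multiple documents in batch mode.
- sources: List of document URLs or file paths (required)
- enable_ocr: Whether to enable OCR for scanned documents (optional, default: false)
- ocr_language: List of language codes for OCR (optional)
qna_from_document: Creates a Q&A document in YAML format from a URL or local path.
- source: URL or local file path of the document (required)
- no_of_qnas: Expected number of Q&A pairs (optional, default: 5)
- Note: This tool requires IBM Watson X credentials to be set as environment variables:
  - WATSONX_PROJECT_ID: Your Watson X project ID
  - WATSONX_APIKEY: Your IBM Cloud API key
  - WATSONX_URL: The Watson X API URL (default: https://us-south.ml.cloud.ibm.com)
get_system_info: Returns information about the system configuration and acceleration status.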
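
MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request whose `arguments` object carries the parameters listed above. As a rough sketch, the payload for `convert_document` could be built as follows. `build_tool_call` is a hypothetical helper, and in practice an MCP client library would construct and send this message for you.

```python
import json
from typing import Any, Dict


def build_tool_call(name: str, arguments: Dict[str, Any], request_id: int = 1) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_tool_call(
        "convert_document",
        {
            "source": "https://arxiv.org/pdf/2004.07606",
            "enable_ocr": True,
            "ocr_language": ["en", "fr"],
        },
    )
    # Pretty-print the message a client would send to the server.
    print(json.dumps(request, indent=2))
```

Only `source` is required for `convert_document`; the OCR fields can be omitted to fall back to the defaults described above.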

Example with Llama Stack

https://github.com/user-attachments/assets/8ad34e50-cbf7-4ec8-aedd-71c42a5de0a1

You can use this server with Llama Stack to give your LLM applications document processing capabilities. Make sure a Llama Stack server is running, then set the INFERENCE_MODEL environment variable.

    import os

    from llama_stack_client import LlamaStackClient
    from llama_stack_client.lib.agents.agent import Agent
    from llama_stack_client.lib.agents.event_logger import EventLogger
    from llama_stack_client.types.agent_create_params import AgentConfig
    from llama_stack_client.types.shared_params.url import URL

    # Set your model ID
    model_id = os.environ["INFERENCE_MODEL"]
    client = LlamaStackClient(
        base_url=f"http://localhost:{os.environ.get('LLAMA_STACK_PORT', '8080')}"
    )

    # Register MCP tools (the MCP server must be reachable over SSE at this endpoint)
    client.toolgroups.register(
        toolgroup_id="mcp::docling",
        provider_id="model-context-protocol",
        mcp_endpoint=URL(uri="http://0.0.0.0:8000/sse"),
    )

    # Define an agent with MCP toolgroup
    agent_config = AgentConfig(
        model=model_id,
        instructions="""You are a helpful assistant with access to tools to manipulate documents.
    Always use the appropriate tool when asked to process documents.""",
        toolgroups=["mcp::docling"],
        tool_choice="auto",
        max_tool_calls=3,
    )

    # Create the agent
    agent = Agent(client, agent_config)

    # Create a session
    session_id = agent.create_session("test-session")


    def _run_turn(prompt):
        # Create a turn
        response = agent.create_turn(
            messages=[
                {
                    "role": "user",
                    "content": prompt,
                }
            ],
            session_id=session_id,
        )

        # Log the response
        for log in EventLogger().log(response):
            log.print()


    def _summary_and_qna(source: str):
        # Ask for a summary first, then for a Q&A document
        _run_turn(f"Please convert the document at {source} to markdown and summarize its content.")
        _run_turn(f"Please generate a Q&A document with 3 items for source at {source} and display it in YAML format.")


    _summary_and_qna('https://arxiv.org/pdf/2004.07606')
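
The example above reads INFERENCE_MODEL from the environment, and the qna_from_document tool additionally needs the Watson X variables listed earlier, presumably in whichever environment runs the MCP server. A small preflight check can surface missing settings before the agent starts. This is only a sketch: `missing_settings` and `watsonx_url` are hypothetical helpers, not part of the server or of Llama Stack.

```python
import os
from typing import List, Mapping, Sequence

# Variables named in this README that have no default value.
REQUIRED = ("INFERENCE_MODEL", "WATSONX_PROJECT_ID", "WATSONX_APIKEY")
# Default documented for WATSONX_URL.
DEFAULT_WATSONX_URL = "https://us-south.ml.cloud.ibm.com"


def missing_settings(env: Mapping[str, str], required: Sequence[str] = REQUIRED) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not env.get(name)]


def watsonx_url(env: Mapping[str, str]) -> str:
    """Return WATSONX_URL, falling back to the documented default."""
    return env.get("WATSONX_URL") or DEFAULT_WATSONX_URL


if __name__ == "__main__":
    missing = missing_settings(os.environ)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    print("Watson X URL: " + watsonx_url(os.environ))
```

Taking the environment as a mapping argument, rather than reading `os.environ` inside the helpers, keeps the check easy to test with a plain dictionary.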