Glama

lint_review_v1

Validates an academic literature review for citation compliance by checking its markdown against a whitelist of evidence packs, flagging citations that do not exist or fall outside the allowed packs.

Instructions

Validate full-text compliance

Checks whether the complete review complies with all citation rules.

Args: pack_ids: list of allowed evidence pack IDs (the whitelist); markdown: the complete review markdown

Returns: passed, issues[], stats
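
The return fields are only named here, not documented in detail. The dictionary below is a hypothetical sketch of what a passing result might look like; the field names follow the documented return shape (passed, issues[], stats), while every numeric value is invented for illustration:

```python
# Hypothetical result for a review whose citations all pass.
# Field names mirror the documented return (passed, issues[], stats);
# the counts are made up for illustration only.
sample_result = {
    "passed": True,
    "issues": [],
    "stats": {
        "total_citations": 3,
        "unique_chunks_cited": 2,
        "valid_citations": 3,
        "invalid_citations": 0,
        "pack_count": 1,
        "total_allowed_chunks": 10,
        "citation_coverage_pct": 20.0,  # 2 of 10 allowed chunks were cited
    },
}

print(sample_result["passed"])  # True
```

When any citation fails validation, `passed` is False and each entry in `issues` carries a `severity`, a `rule` code, the offending `chunk_id`, and a human-readable `message`.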

Input Schema

Name      Required  Description  Default
pack_ids  Yes
markdown  Yes

Output Schema

No output fields documented.

Implementation Reference

  • The primary handler function for the 'lint_review_v1' tool. It validates citations in the full review markdown by checking if cited chunks exist in the database and are from the whitelisted evidence packs provided via pack_ids. Returns pass/fail status, issues list, and statistics.
    @mcp.tool()
    def lint_review_v1(
        pack_ids: list[int],
        markdown: str,
    ) -> dict[str, Any]:
        """Validate full-text compliance.

        Checks whether the complete review complies with all citation rules.

        Args:
            pack_ids: list of allowed evidence pack IDs (the whitelist)
            markdown: the complete review markdown

        Returns:
            passed, issues[], stats
        """
        try:
            # Collect all allowed chunk_ids across the whitelisted packs
            all_valid_chunk_ids: set[int] = set()
            pack_chunk_counts: dict[int, int] = {}
    
            for pack_id in pack_ids:
                pack_chunks = query_all(
                    "SELECT chunk_id FROM evidence_pack_items WHERE pack_id = %s",
                    (pack_id,),
                )
                chunk_ids = {row["chunk_id"] for row in pack_chunks}
                all_valid_chunk_ids.update(chunk_ids)
                pack_chunk_counts[pack_id] = len(chunk_ids)
    
            if not all_valid_chunk_ids:
                return {"error": "No valid chunks in provided pack_ids"}
    
            # Parse citation markers of the form [[chunk:N]]
            citation_pattern = r"\[\[chunk:(\d+)\]\]"
            citations = re.findall(citation_pattern, markdown)
            cited_chunk_ids = [int(c) for c in citations]
    
            issues = []
            valid_citations = 0
            invalid_citations = 0
    
            # Check each citation
            for chunk_id in cited_chunk_ids:
                # Verify the chunk exists
                exists = query_one(
                    "SELECT chunk_id FROM chunks WHERE chunk_id = %s",
                    (chunk_id,),
                )
                if not exists:
                    issues.append({
                        "severity": "error",
                        "rule": "CHUNK_NOT_FOUND",
                        "chunk_id": chunk_id,
                        "message": f"Chunk {chunk_id} does not exist",
                    })
                    invalid_citations += 1
                    continue
    
                # Verify the chunk is in the whitelisted packs
                if chunk_id not in all_valid_chunk_ids:
                    issues.append({
                        "severity": "error",
                        "rule": "CHUNK_OUT_OF_PACK",
                        "chunk_id": chunk_id,
                        "message": f"Chunk {chunk_id} is not in whitelisted packs",
                    })
                    invalid_citations += 1
                    continue
    
                valid_citations += 1
    
            # Pass/fail decision: any error-severity issue fails the review
            has_errors = any(issue["severity"] == "error" for issue in issues)
    
            return {
                "passed": not has_errors,
                "issues": issues,
                "stats": {
                    "total_citations": len(cited_chunk_ids),
                    "unique_chunks_cited": len(set(cited_chunk_ids)),
                    "valid_citations": valid_citations,
                    "invalid_citations": invalid_citations,
                    "pack_count": len(pack_ids),
                    "total_allowed_chunks": len(all_valid_chunk_ids),
                    "citation_coverage_pct": (
                        len(set(cited_chunk_ids) & all_valid_chunk_ids) / len(all_valid_chunk_ids) * 100
                        if all_valid_chunk_ids else 0
                    ),
                },
            }
    
        except Exception as e:
            return {"error": str(e), "passed": False}
  • The registration call to register_review_tools(mcp), which defines and registers the lint_review_v1 tool (and other review tools) with the FastMCP instance.
    # Register the M3 review tools
    register_review_tools(mcp)
  • Import of the register_review_tools function from the review tools module, prerequisite for registering the lint_review_v1 tool.
    from paperlib_mcp.tools.review import register_review_tools
  • Docstring providing the tool description, argument descriptions, and return format, which serves as the schema for input/output in FastMCP.
    """Validate full-text compliance.

    Checks whether the complete review complies with all citation rules.

    Args:
        pack_ids: list of allowed evidence pack IDs (the whitelist)
        markdown: the complete review markdown

    Returns:
        passed, issues[], stats
    """
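
The citation-parsing step in the handler above can be exercised on its own, without a database. This minimal sketch reuses the same `[[chunk:N]]` regex; the sample markdown and the `allowed` set are invented for illustration, standing in for chunk IDs that would normally be collected from the whitelisted packs:

```python
import re

# Same citation marker pattern used by lint_review_v1
CITATION_PATTERN = r"\[\[chunk:(\d+)\]\]"

markdown = (
    "Prior work established X [[chunk:12]]. "
    "Later studies refined the estimate [[chunk:12]][[chunk:34]]."
)

# Parse all citation markers (duplicates preserved, as in the handler)
cited = [int(m) for m in re.findall(CITATION_PATTERN, markdown)]

# Stand-in for the chunk IDs gathered from whitelisted evidence packs
allowed = {12, 34, 56}

invalid = [c for c in cited if c not in allowed]
coverage_pct = len(set(cited) & allowed) / len(allowed) * 100

print(cited)    # [12, 12, 34]
print(invalid)  # []
```

Note that `total_citations` counts duplicates while `unique_chunks_cited` and the coverage percentage are computed over the deduplicated set, so citing the same chunk twice raises the former but not the latter.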
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions what the tool does (compliance checking) but lacks details on permissions needed, rate limits, whether it's read-only or mutative, error handling, or what '合规' (compliance) entails beyond citation rules. This is inadequate for a tool with no annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is structured with purpose, parameters, and returns, but could be more front-loaded. The first two lines state the purpose clearly, but the parameter and return explanations are brief yet functional. It avoids redundancy but lacks polish in flow.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 2 parameters with 0% schema coverage and an output schema present, the description compensates well for parameters but is light on behavioral context. It explains inputs and outputs (passed, issues[], stats), but without annotations, more detail on compliance rules, error cases, or usage scenarios would improve completeness for this validation tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description includes an 'Args' section that explains both parameters: 'pack_ids' as a whitelist of allowed evidence pack IDs and 'markdown' as the complete review markdown. With 0% schema description coverage, this adds significant value beyond the bare schema, though it doesn't detail format constraints (e.g., markdown syntax) or pack ID validation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: '验证全文合规' (verify full-text compliance) and '检查完整综述是否符合所有引用规则' (check if the complete review complies with all citation rules). It specifies the action (verify/check) and resource (full review/comprehensive review), though it doesn't explicitly differentiate from sibling tools like 'lint_section_v1' which might check sections rather than full reviews.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'lint_section_v1' for section-level checks or other validation tools, nor does it specify prerequisites or appropriate contexts for usage beyond the basic function.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
