
NCCN Guidelines MCP Server

by gscfwid

get_index

Retrieve the raw YAML index file containing structured NCCN clinical cancer guidelines for direct access to authoritative treatment protocols.

Instructions

Get the raw contents of the NCCN guidelines index YAML file.

Returns:
    String containing the raw YAML content of the guidelines index

Input Schema

No arguments.

Output Schema

result (required): string containing the raw YAML content of the guidelines index
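Since the tool returns the index as a raw YAML string rather than structured data, a caller is expected to parse it. The snippet below is a hypothetical sketch of what that looks like; the sample document's shape (a top-level nccn_guidelines list of categories, each with a guidelines list whose items carry a guideline_link) is inferred from the server code shown further down, and the URL is a placeholder.

```python
import yaml  # PyYAML, used here because the index is a YAML document

# Hypothetical excerpt of what get_index might return; field names are
# inferred from the server code below, and the link is illustrative only.
sample = """\
nccn_guidelines:
  - category: Breast Cancer
    guidelines:
      - name: Breast Cancer
        guideline_link: https://example.org/breast.pdf
"""

data = yaml.safe_load(sample)
categories = data["nccn_guidelines"]
print(len(categories))                                  # 1
print(categories[0]["guidelines"][0]["name"])           # Breast Cancer
```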

Implementation Reference

  • The MCP tool handler for 'get_index'. This function reads and returns the raw YAML content of the NCCN guidelines index file generated by the scraper.
    @mcp.tool()
    async def get_index() -> str:
        """
        Get the raw contents of the NCCN guidelines index YAML file.
        
        Returns:
            String containing the raw YAML content of the guidelines index
        """
        try:
            index_path = current_dir / GUIDELINES_INDEX_FILE
            with open(index_path, 'r', encoding='utf-8') as f:
                content = f.read()
                logger.info(f"Successfully loaded guidelines index from {index_path}")
                return content
        except FileNotFoundError:
            logger.error(f"Guidelines index file not found: {index_path}")
            return "Error: Guidelines index file not found"
        except Exception as e:
            logger.error(f"Error reading guidelines index: {str(e)}")
            return f"Error reading guidelines index: {str(e)}"
  • Helper function that scrapes the NCCN website to generate and maintain the guidelines index YAML file used by the get_index tool. Called during server initialization.
    async def ensure_nccn_index(output_file: str = DEFAULT_OUTPUT_FILE, max_age_days: int = CACHE_MAX_AGE_DAYS) -> dict:
        """
        Ensure NCCN guideline index exists and is valid
        This is the main interface for MCP Server calls
        
        Args:
            output_file: Output file path
            max_age_days: Maximum cache file validity period (days)
        
        Returns:
            Parsed guideline index data
        """
        import time
        
        # Check cache file
        cache_info = check_cache_file(output_file, max_age_days)
        
        # Determine if re-scraping is needed
        should_scrape = not cache_info['exists'] or not cache_info['is_valid']
        
        if cache_info['exists']:
            if cache_info['is_valid']:
                logger.info(f"Using valid cache file: {output_file} (created at {cache_info['created_time'].strftime('%Y-%m-%d %H:%M:%S')}, {cache_info['age_days']} days ago)")
            else:
                logger.info(f"Cache file expired ({cache_info['age_days']} days > {max_age_days} days) or corrupted, starting re-scraping...")
        else:
            logger.info("Cache file not found, starting NCCN guideline index scraping...")
        
        if should_scrape:
            start_time = time.time()
            
            try:
                # Scrape all category data
                categories_data = await scrape_all_categories()
                
                if not categories_data:
                    logger.error("Scraping failed, no data retrieved")
                    # If scraping fails but old cache exists, try using old cache
                    if cache_info['exists']:
                        logger.info("Scraping failed, attempting to use existing cache file")
                        return load_cached_data(output_file)
                    return {}
                
                # Generate YAML document
                yaml_content = generate_yaml(categories_data)
                
                # Save to file
                with open(output_file, 'w', encoding='utf-8') as f:
                    f.write(yaml_content)
                
                # Calculate statistics
                total_guidelines = sum(len(cat.get('items', [])) for cat in categories_data)
                successful_guidelines = sum(
                    len([item for item in cat.get('items', []) if item.get('guideline_link')])
                    for cat in categories_data
                )
                
                elapsed_time = time.time() - start_time
                
                logger.info(f"Scraping completed! Index saved to {output_file}")
                logger.info(f"Processed {len(categories_data)} categories, found {successful_guidelines}/{total_guidelines} valid guideline links")
                logger.info(f"Scraping time: {elapsed_time:.2f} seconds")
                
            except Exception as e:
                logger.error(f"Error during scraping process: {e}")
                # If scraping fails but cache exists, use cache
                if cache_info['exists']:
                    logger.info("Scraping failed, using existing cache file")
                    return load_cached_data(output_file)
                return {}
        
        # Load and return data
        cached_data = load_cached_data(output_file)
        if cached_data and 'nccn_guidelines' in cached_data:
            total_categories = len(cached_data['nccn_guidelines'])
            total_guidelines = sum(len(cat.get('guidelines', [])) for cat in cached_data['nccn_guidelines'])
            logger.info(f"NCCN guideline index ready: {total_categories} categories, {total_guidelines} total guidelines")
        else:
            logger.warning("Guideline index file has an unexpected format (missing 'nccn_guidelines' key)")
        
        return cached_data
  • server.py:146-146 (registration)
    The @mcp.tool() decorator registers the get_index function as an MCP tool.
    @mcp.tool()
  • Server initialization call to ensure_nccn_index, which populates the index file before the tool is used.
    guidelines_data = await ensure_nccn_index(
        output_file=str(current_dir / GUIDELINES_INDEX_FILE),
        max_age_days=7  # Refresh index every 7 days
    )
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses the return type (raw YAML content as a string) and that it fetches from a specific file, which is useful context. However, it doesn't mention behavioral traits like error handling, performance, or any constraints (e.g., file size limits, authentication needs). The description adds some value but lacks comprehensive behavioral details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by a clear return statement. Both sentences earn their place by providing essential information without redundancy. It's appropriately sized and structured for a simple tool with no parameters.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (0 parameters, no annotations, but has an output schema), the description is mostly complete. It explains what the tool does and the return value, which is sufficient since the output schema likely covers return details. However, it could benefit from more behavioral context (e.g., any limitations or dependencies) to be fully comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has 0 parameters, and schema description coverage is 100%, so there are no parameters to document. The description doesn't need to add parameter semantics, and it appropriately focuses on the tool's purpose and output. A baseline of 4 is applied as per the rules for zero parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get the raw contents') and resource ('NCCN guidelines index YAML file'), distinguishing it from sibling tools like download_pdf and extract_content which handle different operations on different resources. It precisely communicates what the tool does without ambiguity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by specifying the resource (NCCN guidelines index YAML file), suggesting it's for accessing this particular data. However, it lacks explicit guidance on when to use this tool versus alternatives like extract_content, which might process the content, or download_pdf for PDF files. No exclusions or prerequisites are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
