read_documentation

Retrieve and convert AWS documentation pages into markdown format, preserving structure, code blocks, and lists. Handle long documents by fetching content in chunks with precise start indexes.

Instructions

Fetch and convert an AWS documentation page to markdown format.

Usage

This tool retrieves the content of an AWS documentation page and converts it to markdown format. For long documents, you can make multiple calls with different start_index values to retrieve the entire content in chunks.

URL Requirements

Must be from the docs.aws.amazon.com domain
Must end with .html
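
The two requirements can be checked before calling the tool. The following is a minimal sketch, not the server's own code: the function name `is_valid_docs_url` is hypothetical, and the domain pattern is taken from the global handler listed under Implementation Reference.

```python
import re

# Domain pattern as it appears in the global handler's validation step.
_DOCS_URL = re.compile(r'^https?://docs\.aws\.amazon\.com/')


def is_valid_docs_url(url: str) -> bool:
    """Return True when url meets both URL requirements (hypothetical helper)."""
    return bool(_DOCS_URL.match(url)) and url.endswith('.html')
```

A URL from another domain, or one that ends in a directory path instead of .html, would be rejected by this check.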

Example URLs

https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html
https://docs.aws.amazon.com/lambda/latest/dg/lambda-invocation.html

Output Format

The output is formatted as markdown text with:

Preserved headings and structure
Code blocks for examples
Lists and tables converted to markdown format

Handling Long Documents

If the response indicates the document was truncated, you have two options:

Continue Reading: Make another call with start_index set to the end of the previous response
Stop Early: For very long documents (>30,000 characters), if you've already found the specific information needed, you can stop reading
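
The Continue Reading option amounts to a loop that advances start_index by the amount already received. The sketch below assumes a hypothetical `fetch(start_index, max_length)` callable standing in for a read_documentation call, and assumes it returns only the raw chunk text. The real tool response may wrap the chunk in extra text, in which case a caller would rely on the truncation notice instead of chunk length.

```python
from typing import Callable, List


def read_all(fetch: Callable[[int, int], str], max_length: int = 5000) -> str:
    """Collect a whole document by requesting consecutive chunks.

    `fetch` is a hypothetical stand-in for read_documentation. It is assumed
    to return at most max_length characters, and an empty string once
    start_index is past the end of the document.
    """
    parts: List[str] = []
    start_index = 0
    while True:
        chunk = fetch(start_index, max_length)
        if not chunk:
            break  # nothing left to read
        parts.append(chunk)
        start_index += len(chunk)  # next call resumes where this one ended
        if len(chunk) < max_length:
            break  # a short chunk means the end was reached
    return ''.join(parts)
```

To stop early, a caller could check each chunk for the information it needs and leave the loop as soon as it is found.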

Args:
ctx: MCP context for logging and error handling
url: URL of the AWS documentation page to read
max_length: Maximum number of characters to return
start_index: Return output starting at this character index

Returns: Markdown content of the AWS documentation

Input Schema


Name	Required	Description
`max_length`	No	Maximum number of characters to return.
`start_index`	No	Return output starting at this character index, useful if a previous fetch was truncated and more content is required.
`url`	Yes	URL of the AWS documentation page to read
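
As an illustration of the schema, the sketch below builds an arguments dictionary and applies the same bounds that the handlers under Implementation Reference declare: max_length greater than 0 and less than 1000000 with a default of 5000, and start_index at least 0 with a default of 0. The function name `build_arguments` is hypothetical and is not part of the server.

```python
def build_arguments(url: str, max_length: int = 5000, start_index: int = 0) -> dict:
    """Build a read_documentation argument dict (hypothetical helper)."""
    # Bounds mirror the Field constraints: gt=0, lt=1000000 for max_length.
    if not 0 < max_length < 1_000_000:
        raise ValueError('max_length must be greater than 0 and less than 1000000')
    # start_index uses ge=0, so zero is allowed.
    if start_index < 0:
        raise ValueError('start_index must be 0 or greater')
    return {'url': url, 'max_length': max_length, 'start_index': start_index}
```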

Implementation Reference

awslabs/aws_documentation_mcp_server/server_aws_cn.py:66-135 (handler)
Handler for read_documentation tool in AWS China server, includes registration via @mcp.tool(), input schema via Pydantic Fields, validation, and delegation to impl.
@mcp.tool()
async def read_documentation(
    ctx: Context,
    url: Union[AnyUrl, str] = Field(description='URL of the AWS China documentation page to read'),
    max_length: int = Field(
        default=5000,
        description='Maximum number of characters to return.',
        gt=0,
        lt=1000000,
    ),
    start_index: int = Field(
        default=0,
        description='On return output starting at this character index, useful if a previous fetch was truncated and more content is required.',
        ge=0,
    ),
) -> str:
    """Fetch and convert an AWS China documentation page to markdown format.

    ## Usage

    This tool retrieves the content of an AWS China documentation page and converts it to markdown format.
    For long documents, you can make multiple calls with different start_index values to retrieve
    the entire content in chunks.

    ## URL Requirements

    - Must be from the docs.amazonaws.cn domain
    - Must end with .html

    ## Example URLs

    - https://docs.amazonaws.cn/en_us/AmazonS3/latest/userguide/bucketnamingrules.html
    - https://docs.amazonaws.cn/en_us/lambda/latest/dg/lambda-invocation.html

    ## Output Format

    The output is formatted as markdown text with:
    - Preserved headings and structure
    - Code blocks for examples
    - Lists and tables converted to markdown format

    ## Handling Long Documents

    If the response indicates the document was truncated, you have several options:

    1. **Continue Reading**: Make another call with start_index set to the end of the previous response
    2. **Stop Early**: For very long documents (>30,000 characters), if you've already found the specific information needed, you can stop reading

    Args:
        ctx: MCP context for logging and error handling
        url: URL of the AWS China documentation page to read
        max_length: Maximum number of characters to return
        start_index: On return output starting at this character index

    Returns:
        Markdown content of the AWS China documentation
    """
    # Validate that URL is from docs.amazonaws.cn and ends with .html
    url_str = str(url)
    if not re.match(r'^https?://docs\.amazonaws\.cn/', url_str):
        error_msg = f'Invalid URL: {url_str}. URL must be from the docs.amazonaws.cn domain'
        await ctx.error(error_msg)
        return error_msg
    if not url_str.endswith('.html'):
        error_msg = f'Invalid URL: {url_str}. URL must end with .html'
        await ctx.error(error_msg)
        return error_msg

    return await read_documentation_impl(ctx, url_str, max_length, start_index, SESSION_UUID)
awslabs/aws_documentation_mcp_server/server_aws.py:77-144 (handler)
Handler for read_documentation tool in global AWS server, includes registration via @mcp.tool(), input schema via Pydantic Fields, validation, and delegation to impl.
@mcp.tool()
async def read_documentation(
    ctx: Context,
    url: str = Field(description='URL of the AWS documentation page to read'),
    max_length: int = Field(
        default=5000,
        description='Maximum number of characters to return.',
        gt=0,
        lt=1000000,
    ),
    start_index: int = Field(
        default=0,
        description='On return output starting at this character index, useful if a previous fetch was truncated and more content is required.',
        ge=0,
    ),
) -> str:
    """Fetch and convert an AWS documentation page to markdown format.

    ## Usage

    This tool retrieves the content of an AWS documentation page and converts it to markdown format.
    For long documents, you can make multiple calls with different start_index values to retrieve
    the entire content in chunks.

    ## URL Requirements

    - Must be from the docs.aws.amazon.com domain
    - Must end with .html

    ## Example URLs

    - https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html
    - https://docs.aws.amazon.com/lambda/latest/dg/lambda-invocation.html

    ## Output Format

    The output is formatted as markdown text with:
    - Preserved headings and structure
    - Code blocks for examples
    - Lists and tables converted to markdown format

    ## Handling Long Documents

    If the response indicates the document was truncated, you have several options:

    1. **Continue Reading**: Make another call with start_index set to the end of the previous response
    2. **Stop Early**: For very long documents (>30,000 characters), if you've already found the specific information needed, you can stop reading

    Args:
        ctx: MCP context for logging and error handling
        url: URL of the AWS documentation page to read
        max_length: Maximum number of characters to return
        start_index: On return output starting at this character index

    Returns:
        Markdown content of the AWS documentation
    """
    # Validate that URL is from docs.aws.amazon.com and ends with .html
    url_str = str(url)
    if not re.match(r'^https?://docs\.aws\.amazon\.com/', url_str):
        await ctx.error(f'Invalid URL: {url_str}. URL must be from the docs.aws.amazon.com domain')
        raise ValueError('URL must be from the docs.aws.amazon.com domain')
    if not url_str.endswith('.html'):
        await ctx.error(f'Invalid URL: {url_str}. URL must end with .html')
        raise ValueError('URL must end with .html')

    return await read_documentation_impl(ctx, url_str, max_length, start_index, SESSION_UUID)
awslabs/aws_documentation_mcp_server/server_utils.py:33-85 (helper)
Shared helper function containing the core logic for fetching, extracting, and formatting documentation content.
async def read_documentation_impl(
    ctx: Context,
    url_str: str,
    max_length: int,
    start_index: int,
    session_uuid: str,
) -> str:
    """The implementation of the read_documentation tool."""
    logger.debug(f'Fetching documentation from {url_str}')

    url_with_session = f'{url_str}?session={session_uuid}'

    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(
                url_with_session,
                follow_redirects=True,
                headers={
                    'User-Agent': DEFAULT_USER_AGENT,
                    'X-MCP-Session-Id': session_uuid,
                },
                timeout=30,
            )
        except httpx.HTTPError as e:
            error_msg = f'Failed to fetch {url_str}: {str(e)}'
            logger.error(error_msg)
            await ctx.error(error_msg)
            return error_msg

        if response.status_code >= 400:
            error_msg = f'Failed to fetch {url_str} - status code {response.status_code}'
            logger.error(error_msg)
            await ctx.error(error_msg)
            return error_msg

        page_raw = response.text
        content_type = response.headers.get('content-type', '')

    if is_html_content(page_raw, content_type):
        content = extract_content_from_html(page_raw)
    else:
        content = page_raw

    result = format_documentation_result(url_str, content, start_index, max_length)

    # Log if content was truncated
    if len(content) > start_index + max_length:
        logger.debug(
            f'Content truncated at {start_index + max_length} of {len(content)} characters'
        )

    return result
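
The body of format_documentation_result is not shown in this listing. As a rough sketch only, the windowing it needs to perform and the truncation test used by the helper can be written as follows. The name `slice_content` is hypothetical, and the actual function may add headers or notices around the chunk that this sketch omits.

```python
from typing import Tuple


def slice_content(content: str, start_index: int, max_length: int) -> Tuple[str, bool]:
    """Return the requested window and whether more content remains.

    Hypothetical illustration; the truncation test is the same comparison
    the helper uses before logging.
    """
    end = start_index + max_length
    chunk = content[start_index:end]
    truncated = len(content) > end
    return chunk, truncated
```

When the second value is true, a follow-up call with start_index set to start_index + max_length would pick up the next window.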

AWS Documentation MCP Server