read_documentation
Retrieve AWS documentation pages and convert them to markdown format for easier reading and integration. Supports chunked reading for long documents.
Instructions
Fetch and convert an AWS documentation page to markdown format.
Usage
This tool retrieves the content of an AWS documentation page and converts it to markdown format. For long documents, you can make multiple calls with different start_index values to retrieve the entire content in chunks.
URL Requirements
- Must be from the docs.aws.amazon.com domain
- Must end with .html
Example URLs
- https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html
- https://docs.aws.amazon.com/lambda/latest/dg/lambda-invocation.html
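The two URL requirements can be checked on the client side before calling the tool. The following is a minimal sketch, assuming the same domain pattern that appears in the registration code for the global server; the helper name `is_valid_docs_url` is hypothetical and is not part of the server.

```python
import re

# Domain prefix copied from the check in the global server's registration code.
_AWS_DOCS_PREFIX = re.compile(r'^https?://docs\.aws\.amazon\.com/')


def is_valid_docs_url(url: str) -> bool:
    """Return True if the URL is on docs.aws.amazon.com and ends with .html.

    Hypothetical helper for illustration only.
    """
    return bool(_AWS_DOCS_PREFIX.match(url)) and url.endswith('.html')


# Accepted: correct domain and .html suffix.
print(is_valid_docs_url('https://docs.aws.amazon.com/lambda/latest/dg/lambda-invocation.html'))
# Rejected: wrong domain.
print(is_valid_docs_url('https://example.com/lambda-invocation.html'))
```

Checking locally avoids a round trip for inputs the server would reject anyway. The AWS China server applies the same idea with the docs.amazonaws.cn domain.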
Output Format
The output is formatted as markdown text with:
- Preserved headings and structure
- Code blocks for examples
- Lists and tables converted to markdown format
Handling Long Documents
If the response indicates the document was truncated, you have two options:
1. Continue Reading: Make another call with start_index set to the end of the previous response
2. Stop Early: For very long documents (>30,000 characters), if you've already found the specific information needed, you can stop reading
Args:
- ctx: MCP context for logging and error handling
- url: URL of the AWS documentation page to read
- max_length: Maximum number of characters to return
- start_index: Return output starting at this character index
Returns: Markdown content of the AWS documentation
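The "Continue Reading" option amounts to a loop that advances start_index by the length of each chunk. The sketch below illustrates that loop under stated assumptions: `fetch_chunk` is a hypothetical stand-in for a call to the tool that returns only the raw slice of text, and `make_fake_fetcher` simulates it with an in-memory string. The real tool's response format and truncation notice are not shown here, so a real client would need to adapt the stop condition.

```python
from typing import Callable


def read_all(
    fetch_chunk: Callable[[str, int, int], str],
    url: str,
    max_length: int = 5000,
) -> str:
    """Collect a whole document by advancing start_index chunk by chunk.

    fetch_chunk(url, start_index, max_length) is a hypothetical callable
    standing in for one read_documentation call.
    """
    parts = []
    start_index = 0
    while True:
        chunk = fetch_chunk(url, start_index, max_length)
        parts.append(chunk)
        # A chunk shorter than max_length means nothing is left to read.
        if len(chunk) < max_length:
            break
        start_index += len(chunk)
    return ''.join(parts)


def make_fake_fetcher(document: str) -> Callable[[str, int, int], str]:
    """Build an in-memory fetcher that slices a fixed string (illustration only)."""
    def fetch(url: str, start_index: int, max_length: int) -> str:
        return document[start_index:start_index + max_length]
    return fetch


# Simulate a 12,345-character page read in 5,000-character chunks.
page = 'x' * 12345
text = read_all(make_fake_fetcher(page), 'https://docs.aws.amazon.com/a.html')
print(len(text) == len(page))
```

For the "Stop Early" option, the same loop could break as soon as the accumulated text contains the information being looked for.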
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL of the AWS documentation page to read | |
| max_length | No | Maximum number of characters to return. Must be greater than 0 and less than 1,000,000. | 5000 |
| start_index | No | Return output starting at this character index; useful if a previous fetch was truncated and more content is required. Must be 0 or greater. | 0 |
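The registration code in the Implementation Reference declares a default of 5000 for max_length (greater than 0, less than 1000000) and a default of 0 for start_index (0 or greater). A caller can mirror those bounds before sending a request. This is a minimal sketch; `build_arguments` is a hypothetical helper, and the server still performs its own validation.

```python
def build_arguments(url: str, max_length: int = 5000, start_index: int = 0) -> dict:
    """Assemble the tool arguments, rejecting values outside the documented bounds.

    Hypothetical helper; the bounds mirror the Field constraints in the
    registration code (gt=0, lt=1000000 for max_length; ge=0 for start_index).
    """
    if not 0 < max_length < 1_000_000:
        raise ValueError('max_length must be greater than 0 and less than 1000000')
    if start_index < 0:
        raise ValueError('start_index must be greater than or equal to 0')
    return {'url': url, 'max_length': max_length, 'start_index': start_index}


# Request the second 5,000-character chunk of a page.
args = build_arguments(
    'https://docs.aws.amazon.com/lambda/latest/dg/lambda-invocation.html',
    start_index=5000,
)
print(args)
```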
Implementation Reference
- Core handler implementation that fetches the documentation page, handles errors, extracts content from HTML if needed, formats the result with pagination support, and returns markdown content.

      async def read_documentation_impl(
          ctx: Context,
          url_str: str,
          max_length: int,
          start_index: int,
          session_uuid: str,
      ) -> str:
          """The implementation of the read_documentation tool."""
          logger.debug(f'Fetching documentation from {url_str}')

          url_with_session = f'{url_str}?session={session_uuid}'

          async with httpx.AsyncClient() as client:
              try:
                  response = await client.get(
                      url_with_session,
                      follow_redirects=True,
                      headers={
                          'User-Agent': DEFAULT_USER_AGENT,
                          'X-MCP-Session-Id': session_uuid,
                      },
                      timeout=30,
                  )
              except httpx.HTTPError as e:
                  error_msg = f'Failed to fetch {url_str}: {str(e)}'
                  logger.error(error_msg)
                  await ctx.error(error_msg)
                  return error_msg

              if response.status_code >= 400:
                  error_msg = f'Failed to fetch {url_str} - status code {response.status_code}'
                  logger.error(error_msg)
                  await ctx.error(error_msg)
                  return error_msg

              page_raw = response.text
              content_type = response.headers.get('content-type', '')

          if is_html_content(page_raw, content_type):
              content = extract_content_from_html(page_raw)
          else:
              content = page_raw

          result = format_documentation_result(url_str, content, start_index, max_length)

          # Log if content was truncated
          if len(content) > start_index + max_length:
              logger.debug(
                  f'Content truncated at {start_index + max_length} of {len(content)} characters'
              )

          return result
- awslabs/aws_documentation_mcp_server/server_aws.py:77-144 (registration) Registration and handler for the read_documentation tool in the global AWS documentation server. Includes schema validation for URL (docs.aws.amazon.com, .html), parameters with Pydantic Fields, and delegation to the core impl.

      @mcp.tool()
      async def read_documentation(
          ctx: Context,
          url: str = Field(description='URL of the AWS documentation page to read'),
          max_length: int = Field(
              default=5000,
              description='Maximum number of characters to return.',
              gt=0,
              lt=1000000,
          ),
          start_index: int = Field(
              default=0,
              description='On return output starting at this character index, useful if a previous fetch was truncated and more content is required.',
              ge=0,
          ),
      ) -> str:
          """Fetch and convert an AWS documentation page to markdown format.

          ## Usage

          This tool retrieves the content of an AWS documentation page and converts it to markdown format.
          For long documents, you can make multiple calls with different start_index values to retrieve
          the entire content in chunks.

          ## URL Requirements

          - Must be from the docs.aws.amazon.com domain
          - Must end with .html

          ## Example URLs

          - https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html
          - https://docs.aws.amazon.com/lambda/latest/dg/lambda-invocation.html

          ## Output Format

          The output is formatted as markdown text with:
          - Preserved headings and structure
          - Code blocks for examples
          - Lists and tables converted to markdown format

          ## Handling Long Documents

          If the response indicates the document was truncated, you have several options:

          1. **Continue Reading**: Make another call with start_index set to the end of the previous response
          2. **Stop Early**: For very long documents (>30,000 characters), if you've already found the specific information needed, you can stop reading

          Args:
              ctx: MCP context for logging and error handling
              url: URL of the AWS documentation page to read
              max_length: Maximum number of characters to return
              start_index: On return output starting at this character index

          Returns:
              Markdown content of the AWS documentation
          """
          # Validate that URL is from docs.aws.amazon.com and ends with .html
          url_str = str(url)
          if not re.match(r'^https?://docs\.aws\.amazon\.com/', url_str):
              await ctx.error(f'Invalid URL: {url_str}. URL must be from the docs.aws.amazon.com domain')
              raise ValueError('URL must be from the docs.aws.amazon.com domain')
          if not url_str.endswith('.html'):
              await ctx.error(f'Invalid URL: {url_str}. URL must end with .html')
              raise ValueError('URL must end with .html')

          return await read_documentation_impl(ctx, url_str, max_length, start_index, SESSION_UUID)
- awslabs/aws_documentation_mcp_server/server_aws_cn.py:66-135 (registration) Registration and handler for the read_documentation tool in the AWS China documentation server. Includes schema validation for URL (docs.amazonaws.cn, .html), parameters with Pydantic Fields, and delegation to the core impl.

      @mcp.tool()
      async def read_documentation(
          ctx: Context,
          url: Union[AnyUrl, str] = Field(description='URL of the AWS China documentation page to read'),
          max_length: int = Field(
              default=5000,
              description='Maximum number of characters to return.',
              gt=0,
              lt=1000000,
          ),
          start_index: int = Field(
              default=0,
              description='On return output starting at this character index, useful if a previous fetch was truncated and more content is required.',
              ge=0,
          ),
      ) -> str:
          """Fetch and convert an AWS China documentation page to markdown format.

          ## Usage

          This tool retrieves the content of an AWS China documentation page and converts it to markdown format.
          For long documents, you can make multiple calls with different start_index values to retrieve
          the entire content in chunks.

          ## URL Requirements

          - Must be from the docs.amazonaws.cn domain
          - Must end with .html

          ## Example URLs

          - https://docs.amazonaws.cn/en_us/AmazonS3/latest/userguide/bucketnamingrules.html
          - https://docs.amazonaws.cn/en_us/lambda/latest/dg/lambda-invocation.html

          ## Output Format

          The output is formatted as markdown text with:
          - Preserved headings and structure
          - Code blocks for examples
          - Lists and tables converted to markdown format

          ## Handling Long Documents

          If the response indicates the document was truncated, you have several options:

          1. **Continue Reading**: Make another call with start_index set to the end of the previous response
          2. **Stop Early**: For very long documents (>30,000 characters), if you've already found the specific information needed, you can stop reading

          Args:
              ctx: MCP context for logging and error handling
              url: URL of the AWS China documentation page to read
              max_length: Maximum number of characters to return
              start_index: On return output starting at this character index

          Returns:
              Markdown content of the AWS China documentation
          """
          # Validate that URL is from docs.amazonaws.cn and ends with .html
          url_str = str(url)
          if not re.match(r'^https?://docs\.amazonaws\.cn/', url_str):
              error_msg = f'Invalid URL: {url_str}. URL must be from the docs.amazonaws.cn domain'
              await ctx.error(error_msg)
              return error_msg
          if not url_str.endswith('.html'):
              error_msg = f'Invalid URL: {url_str}. URL must end with .html'
              await ctx.error(error_msg)
              return error_msg

          return await read_documentation_impl(ctx, url_str, max_length, start_index, SESSION_UUID)
- Pydantic schema definitions for input parameters of read_documentation tool using Field for validation and descriptions.

          ctx: Context,
          url: str = Field(description='URL of the AWS documentation page to read'),
          max_length: int = Field(
              default=5000,
              description='Maximum number of characters to return.',
              gt=0,
              lt=1000000,
          ),
          start_index: int = Field(
              default=0,
              description='On return output starting at this character index, useful if a previous fetch was truncated and more content is required.',
              ge=0,
          ),
      ) -> str: