read_documentation
Retrieve and convert AWS documentation pages into markdown format, preserving structure, code blocks, and lists. Handle long documents by fetching content in chunks with precise start indexes.
Instructions
Fetch and convert an AWS documentation page to markdown format.
Usage
This tool retrieves the content of an AWS documentation page and converts it to markdown format. For long documents, you can make multiple calls with different start_index values to retrieve the entire content in chunks.
URL Requirements
Must be from the docs.aws.amazon.com domain
Must end with .html
Example URLs
https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html
https://docs.aws.amazon.com/lambda/latest/dg/lambda-invocation.html
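A client can check both URL requirements before calling the tool. The following is a minimal sketch, assuming the same domain pattern the global handler uses; the helper name `is_valid_docs_url` is hypothetical and not part of the tool.

```python
import re

# Domain rule: the URL must be served from docs.aws.amazon.com.
AWS_DOCS_PATTERN = re.compile(r'^https?://docs\.aws\.amazon\.com/')


def is_valid_docs_url(url: str) -> bool:
    """Return True if the URL meets both documented requirements.

    Hypothetical helper for illustration; not part of the tool itself.
    """
    return bool(AWS_DOCS_PATTERN.match(url)) and url.endswith('.html')


print(is_valid_docs_url('https://docs.aws.amazon.com/lambda/latest/dg/lambda-invocation.html'))
print(is_valid_docs_url('https://example.com/page.html'))
```

Checking on the client side avoids a round trip that the server would reject anyway.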
Output Format
The output is formatted as markdown text with:
Preserved headings and structure
Code blocks for examples
Lists and tables converted to markdown format
Handling Long Documents
If the response indicates the document was truncated, you have two options:
Continue Reading: Make another call with start_index set to the character index at which the previous response ended
Stop Early: For very long documents (>30,000 characters), if you've already found the specific information needed, you can stop reading
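The two options above can be sketched as a loop. This is only an illustration under stated assumptions: `fetch_chunk` is a hypothetical stand-in for a call to the tool that returns just the document characters for the requested window, and `fake_fetch` simulates it with an in-memory string. The real tool's response may carry extra framing text, so a real client would need to adapt how it detects truncation.

```python
from typing import Callable, Optional


def read_in_chunks(
    fetch_chunk: Callable[[int, int], str],
    max_length: int = 5000,
    stop_when: Optional[Callable[[str], bool]] = None,
) -> str:
    """Collect a document window by window, optionally stopping early."""
    parts = []
    start_index = 0
    while True:
        chunk = fetch_chunk(start_index, max_length)
        parts.append(chunk)
        # Stop Early: the caller has already found what it needs.
        if stop_when is not None and stop_when(chunk):
            break
        # A short (or empty) chunk means nothing was left to truncate.
        if len(chunk) < max_length:
            break
        # Continue Reading: resume where the previous window ended.
        start_index += len(chunk)
    return ''.join(parts)


# Hypothetical stand-in for the tool call, backed by an in-memory document.
DOCUMENT = 'x' * 12000 + 'TARGET' + 'y' * 3000


def fake_fetch(start_index: int, max_length: int) -> str:
    return DOCUMENT[start_index:start_index + max_length]


full = read_in_chunks(fake_fetch)
partial = read_in_chunks(fake_fetch, stop_when=lambda chunk: 'TARGET' in chunk)
print(len(full), len(partial))
```

Note that a search term split across a window boundary would be missed by a per-chunk check like the one in `stop_when`; checking the accumulated text instead avoids that.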
Args:
ctx: MCP context for logging and error handling
url: URL of the AWS documentation page to read
max_length: Maximum number of characters to return
start_index: Return output starting at this character index
Returns: Markdown content of the AWS documentation
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| max_length | No | Maximum number of characters to return. Must be greater than 0 and less than 1,000,000. | 5000 |
| start_index | No | Return output starting at this character index; useful if a previous fetch was truncated and more content is required. Must be 0 or greater. | 0 |
| url | Yes | URL of the AWS documentation page to read. | |
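To make the defaults and bounds concrete, here is a small sketch of the same constraints using only the standard library. The class name `ReadDocumentationArgs` is hypothetical; the tool itself declares its schema with Pydantic `Field` definitions, as shown in the Implementation Reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadDocumentationArgs:
    """Hypothetical container mirroring the documented input schema."""

    url: str
    max_length: int = 5000
    start_index: int = 0

    def __post_init__(self) -> None:
        # max_length is bounded on both sides (gt=0, lt=1000000).
        if not 0 < self.max_length < 1_000_000:
            raise ValueError('max_length must be greater than 0 and less than 1000000')
        # start_index may be zero but not negative (ge=0).
        if self.start_index < 0:
            raise ValueError('start_index must be 0 or greater')


args = ReadDocumentationArgs(
    url='https://docs.aws.amazon.com/lambda/latest/dg/lambda-invocation.html'
)
print(args.max_length, args.start_index)
```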
Implementation Reference
- Handler for read_documentation tool in AWS China server; includes registration via @mcp.tool(), input schema via Pydantic Fields, validation, and delegation to impl.

```python
@mcp.tool()
async def read_documentation(
    ctx: Context,
    url: Union[AnyUrl, str] = Field(description='URL of the AWS China documentation page to read'),
    max_length: int = Field(
        default=5000,
        description='Maximum number of characters to return.',
        gt=0,
        lt=1000000,
    ),
    start_index: int = Field(
        default=0,
        description='On return output starting at this character index, useful if a previous fetch was truncated and more content is required.',
        ge=0,
    ),
) -> str:
    """Fetch and convert an AWS China documentation page to markdown format.

    ## Usage

    This tool retrieves the content of an AWS China documentation page and converts it to markdown format.
    For long documents, you can make multiple calls with different start_index values to retrieve
    the entire content in chunks.

    ## URL Requirements

    - Must be from the docs.amazonaws.cn domain
    - Must end with .html

    ## Example URLs

    - https://docs.amazonaws.cn/en_us/AmazonS3/latest/userguide/bucketnamingrules.html
    - https://docs.amazonaws.cn/en_us/lambda/latest/dg/lambda-invocation.html

    ## Output Format

    The output is formatted as markdown text with:
    - Preserved headings and structure
    - Code blocks for examples
    - Lists and tables converted to markdown format

    ## Handling Long Documents

    If the response indicates the document was truncated, you have several options:

    1. **Continue Reading**: Make another call with start_index set to the end of the previous response
    2. **Stop Early**: For very long documents (>30,000 characters), if you've already found the specific information needed, you can stop reading

    Args:
        ctx: MCP context for logging and error handling
        url: URL of the AWS China documentation page to read
        max_length: Maximum number of characters to return
        start_index: On return output starting at this character index

    Returns:
        Markdown content of the AWS China documentation
    """
    # Validate that URL is from docs.amazonaws.cn and ends with .html
    url_str = str(url)
    if not re.match(r'^https?://docs\.amazonaws\.cn/', url_str):
        error_msg = f'Invalid URL: {url_str}. URL must be from the docs.amazonaws.cn domain'
        await ctx.error(error_msg)
        return error_msg
    if not url_str.endswith('.html'):
        error_msg = f'Invalid URL: {url_str}. URL must end with .html'
        await ctx.error(error_msg)
        return error_msg

    return await read_documentation_impl(ctx, url_str, max_length, start_index, SESSION_UUID)
```
- Handler for read_documentation tool in global AWS server; includes registration via @mcp.tool(), input schema via Pydantic Fields, validation, and delegation to impl.

```python
@mcp.tool()
async def read_documentation(
    ctx: Context,
    url: str = Field(description='URL of the AWS documentation page to read'),
    max_length: int = Field(
        default=5000,
        description='Maximum number of characters to return.',
        gt=0,
        lt=1000000,
    ),
    start_index: int = Field(
        default=0,
        description='On return output starting at this character index, useful if a previous fetch was truncated and more content is required.',
        ge=0,
    ),
) -> str:
    """Fetch and convert an AWS documentation page to markdown format.

    ## Usage

    This tool retrieves the content of an AWS documentation page and converts it to markdown format.
    For long documents, you can make multiple calls with different start_index values to retrieve
    the entire content in chunks.

    ## URL Requirements

    - Must be from the docs.aws.amazon.com domain
    - Must end with .html

    ## Example URLs

    - https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html
    - https://docs.aws.amazon.com/lambda/latest/dg/lambda-invocation.html

    ## Output Format

    The output is formatted as markdown text with:
    - Preserved headings and structure
    - Code blocks for examples
    - Lists and tables converted to markdown format

    ## Handling Long Documents

    If the response indicates the document was truncated, you have several options:

    1. **Continue Reading**: Make another call with start_index set to the end of the previous response
    2. **Stop Early**: For very long documents (>30,000 characters), if you've already found the specific information needed, you can stop reading

    Args:
        ctx: MCP context for logging and error handling
        url: URL of the AWS documentation page to read
        max_length: Maximum number of characters to return
        start_index: On return output starting at this character index

    Returns:
        Markdown content of the AWS documentation
    """
    # Validate that URL is from docs.aws.amazon.com and ends with .html
    url_str = str(url)
    if not re.match(r'^https?://docs\.aws\.amazon\.com/', url_str):
        await ctx.error(f'Invalid URL: {url_str}. URL must be from the docs.aws.amazon.com domain')
        raise ValueError('URL must be from the docs.aws.amazon.com domain')
    if not url_str.endswith('.html'):
        await ctx.error(f'Invalid URL: {url_str}. URL must end with .html')
        raise ValueError('URL must end with .html')

    return await read_documentation_impl(ctx, url_str, max_length, start_index, SESSION_UUID)
```
- Shared helper function containing the core logic for fetching, extracting, and formatting documentation content.

```python
async def read_documentation_impl(
    ctx: Context,
    url_str: str,
    max_length: int,
    start_index: int,
    session_uuid: str,
) -> str:
    """The implementation of the read_documentation tool."""
    logger.debug(f'Fetching documentation from {url_str}')

    url_with_session = f'{url_str}?session={session_uuid}'

    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(
                url_with_session,
                follow_redirects=True,
                headers={
                    'User-Agent': DEFAULT_USER_AGENT,
                    'X-MCP-Session-Id': session_uuid,
                },
                timeout=30,
            )
        except httpx.HTTPError as e:
            error_msg = f'Failed to fetch {url_str}: {str(e)}'
            logger.error(error_msg)
            await ctx.error(error_msg)
            return error_msg

        if response.status_code >= 400:
            error_msg = f'Failed to fetch {url_str} - status code {response.status_code}'
            logger.error(error_msg)
            await ctx.error(error_msg)
            return error_msg

        page_raw = response.text
        content_type = response.headers.get('content-type', '')

    if is_html_content(page_raw, content_type):
        content = extract_content_from_html(page_raw)
    else:
        content = page_raw

    result = format_documentation_result(url_str, content, start_index, max_length)

    # Log if content was truncated
    if len(content) > start_index + max_length:
        logger.debug(
            f'Content truncated at {start_index + max_length} of {len(content)} characters'
        )

    return result
```
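The body of `format_documentation_result` is not shown here. As a rough sketch of what such a windowing step could look like, assuming it slices the content by character index and appends a notice when more remains: the function name `format_window`, the header text, and the notice wording below are all made up for illustration and should not be read as the actual helper.

```python
def format_window(url: str, content: str, start_index: int, max_length: int) -> str:
    """Return one window of content, with a notice when more remains.

    Hypothetical illustration only; not the real format_documentation_result.
    """
    window = content[start_index:start_index + max_length]
    result = f'Documentation from {url}:\n\n{window}'
    next_start = start_index + max_length
    # Same condition read_documentation_impl uses to log truncation.
    if len(content) > next_start:
        result += f'\n\n[Truncated: call again with start_index={next_start}]'
    return result


first = format_window('https://docs.aws.amazon.com/x.html', 'abcdefghij', 0, 4)
last = format_window('https://docs.aws.amazon.com/x.html', 'abcdefghij', 8, 4)
print(first)
print(last)
```

Reporting the next `start_index` in the response itself lets a caller continue reading without tracking offsets on its own.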