Skip to main content
Glama
CONTENT-SIZE-LIMITS.md6.54 kB
# Content Size Limits ## Overview To ensure optimal performance and prevent token overflow in LLM interactions, the server implements intelligent content size limits for SAP Help Portal and Community content retrieval. ## Configuration ### Maximum Content Length **Default: 75,000 characters** (~18,750 tokens) This limit is configurable in `/src/lib/config.ts`: ```typescript export const CONFIG = { // Maximum content length for SAP Help and Community full content retrieval // Limits help prevent token overflow and keep responses manageable (~18,750 tokens) MAX_CONTENT_LENGTH: 75000, // 75,000 characters }; ``` ## Affected Tools The content size limit applies to the following MCP tools: ### 1. `sap_help_get` Retrieves full SAP Help Portal pages. If content exceeds 75,000 characters, it is intelligently truncated while preserving: - Beginning section (introduction and main content) - End section (conclusions and examples) - A clear truncation notice showing what was omitted ### 2. `sap_community_search` Returns full content of top 3 SAP Community posts. Each post is truncated if needed using the same intelligent algorithm. ### 3. Community Post Retrieval Individual community posts fetched via `fetch` tool with `community-*` IDs are also subject to truncation. ## Intelligent Truncation Algorithm When content exceeds the maximum length, the truncation algorithm: ### Preservation Strategy - **60%** from the beginning (introduction, overview, main content) - **20%** from the end (conclusions, examples, summaries) - **20%** reserved for truncation notice and natural break padding ### Natural Boundaries The algorithm attempts to break content at natural points rather than mid-sentence: 1. Paragraph breaks (`\n\n`) 2. Markdown headings (`# Heading`) 3. Code block boundaries (` ```\n`) 4. Horizontal rules (`---`) 5. Sentence endings (`. `) ### Truncation Notice A clear notice is inserted showing: - Original content length in characters - Approximate original token count (chars ÷ 4) - Number of omitted characters - Percentage of content omitted - Explanation that beginning and end are preserved Example truncation notice: ```markdown --- ⚠️ **Content Truncated** The full content was 425,000 characters (approximately 106,250 tokens). For readability and performance, 350,000 characters (82%) have been omitted from the middle section. The beginning and end of the document are preserved above and below this notice. --- ``` ## Rationale ### Why 75,000 Characters? 1. **LLM Context Windows**: Fits comfortably in most modern LLM context windows: - Claude 3.5 Sonnet: 200k tokens (can handle ~800k chars) - GPT-4 Turbo: 128k tokens (can handle ~512k chars) - Leaves room for conversation history and system prompts 2. **Performance**: Reduces response time and API costs while maintaining comprehensive coverage 3. **Readability**: Very long documents (>100k chars) are often better consumed in multiple focused queries 4. **Practical Coverage**: 75k characters is sufficient for most documentation pages while preventing extreme cases ### Alternative Approaches Considered | Approach | Characters | Tokens (approx) | Trade-off | |----------|-----------|-----------------|-----------| | Conservative | 50,000 | ~12,500 | Too restrictive for comprehensive docs | | **Current** | **75,000** | **~18,750** | **Balanced - recommended** | | Generous | 100,000 | ~25,000 | Risk of slow responses | | Maximum | 150,000 | ~37,500 | Only for edge cases | ## Implementation Details ### Source Files - **Configuration**: `/src/lib/config.ts` - MAX_CONTENT_LENGTH constant - **Truncation Logic**: `/src/lib/truncate.ts` - Intelligent truncation implementation - **SAP Help**: `/src/lib/sapHelp.ts` - Applied in `getSapHelpContent()` - **Community**: `/src/lib/communityBestMatch.ts` - Applied in post retrieval functions ### Functions #### `truncateContent(content: string, maxLength?: number): TruncationResult` Main truncation function with beginning/end preservation. **Returns:** ```typescript { content: string; // Truncated content wasTruncated: boolean; // Whether truncation occurred originalLength: number; // Original character count truncatedLength: number; // Final character count } ``` #### `truncateContentSimple(content: string, maxLength?: number): TruncationResult` Alternative truncation function that only preserves beginning with end notice. ## Monitoring and Adjustment ### When to Increase Limit Consider increasing if: - Users frequently encounter truncated content - Average document sizes are near the limit - LLM context windows have increased significantly ### When to Decrease Limit Consider decreasing if: - Response times are too slow - Token costs are concerning - Most content doesn't use the available space ### Override for Specific Cases To override the limit for specific use cases, modify the `truncateContent()` call: ```typescript // Custom limit of 100,000 characters const truncationResult = truncateContent(fullContent, 100000); ``` ## User Experience ### Transparent Communication When content is truncated, users see: - Clear visual indicator (⚠️ warning emoji) - Exact statistics (original length, omitted amount, percentage) - Explanation of what's preserved - No disruption to markdown formatting ### Best Practices for Users 1. **Specific Queries**: Ask focused questions to get relevant sections 2. **Multiple Requests**: Break very long documents into targeted fetches 3. **Search First**: Use `sap_help_search` to find specific sections before fetching 4. **Check URLs**: Visit the provided URLs for complete untruncated content ## Future Enhancements Potential improvements to consider: 1. **Dynamic Limits**: Adjust based on LLM context window 2. **Sectioned Retrieval**: Fetch specific document sections 3. **Summary Generation**: Auto-summarize omitted middle sections 4. **User Preferences**: Allow users to specify their preferred limits 5. **Compression**: Apply content compression for technical reference material ## Testing Content size limits are tested in: - Unit tests for truncation functions - Integration tests for SAP Help and Community tools - Manual validation with known large documents ## Related Documentation - **Architecture**: `/docs/ARCHITECTURE.md` - System overview - **Tool Descriptions**: `/docs/CURSOR-SETUP.md` - MCP tool documentation - **Community Search**: `/docs/COMMUNITY-SEARCH-IMPLEMENTATION.md` - Community integration details

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/marianfoo/mcp-sap-docs'

If you have feedback or need assistance with the MCP directory API, please join our Discord server