DynamicEndpoints

Document Extractor MCP Server

extract_document

Extract content from Microsoft Learn or GitHub URLs and store it in PocketBase for organized retrieval and full-text search.

Instructions

Extract document content from Microsoft Learn or GitHub URL and store in PocketBase

Input Schema

Name | Required | Description | Default
url | Yes | Microsoft Learn or GitHub URL to extract content from | (none)
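
For illustration, a client call to this tool could be sketched as a JSON-RPC 2.0 `tools/call` request, the message MCP uses to invoke a tool. Only the tool name and the `url` argument come from the schema above; the helper name `buildExtractRequest` and the example URL are assumptions, and the `new URL()` check is only a rough stand-in for the server's own validation.

```javascript
// Hypothetical helper (not part of the server): builds a JSON-RPC 2.0
// "tools/call" request that invokes extract_document with one argument.
function buildExtractRequest(id, url) {
  // Rough stand-in for the schema's URL check: new URL() throws a
  // TypeError when the string is not an absolute URL.
  new URL(url);
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'extract_document', arguments: { url } }
  };
}

// Example URL chosen for illustration only.
const request = buildExtractRequest(1, 'https://learn.microsoft.com/en-us/azure/');
console.log(JSON.stringify(request, null, 2));
```

How the request is sent (stdio, HTTP, or another transport) depends on the client and is not shown here.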

Implementation Reference

  • The server.tool call below registers 'extract_document', defines its input schema with zod, and supplies the handler that extracts content from a Microsoft Learn or GitHub URL and stores it in PocketBase.
    server.tool(
      'extract_document',
      'Extract document content from Microsoft Learn or GitHub URL and store in PocketBase',
      {
        url: z.string().url('Invalid URL format').describe('Microsoft Learn or GitHub URL to extract content from')
      },
      async ({ url }) => {
        try {
          await authenticateWhenNeeded();
          
          let docData;
          if (url.includes('learn.microsoft.com')) {
            docData = await extractFromMicrosoftLearn(url);
          } else if (url.includes('github.com') || url.includes('raw.githubusercontent.com')) {
            docData = await extractFromGitHub(url);
          } else {
            throw new Error('Unsupported URL. Only Microsoft Learn and GitHub URLs are supported.');
          }
          
          const record = await storeDocument(docData);
          
          return {
            content: [{
              type: 'text',
              text: `${record.isUpdate ? '🔄 Document updated' : '✅ Document extracted and stored'} successfully!\n\n` +
                    `**Title:** ${record.title}\n` +
                    `**ID:** ${record.id}\n` +
                    `**Source:** ${docData.metadata.source}\n` +
                    `**URL:** ${docData.metadata.url}\n` +
                    `**Word Count:** ${docData.metadata.wordCount}\n` +
                    `**Content Preview:** ${docData.content.substring(0, 200)}...`
            }]
          };
        } catch (error) {
          return toolErrorHandler(error);
        }
      }
    );
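
The handler above picks an extractor with substring checks (`url.includes(...)`), which also match when a supported domain appears somewhere other than the host, for example in a query string. A stricter variant could compare the parsed hostname instead. The sketch below is an assumption for illustration, not the server's code; `classifySource` and its return labels are hypothetical names.

```javascript
// Hypothetical helper (not part of the server): choose a source type by
// exact hostname instead of by substring match on the whole URL.
const GITHUB_HOSTS = new Set([
  'github.com',
  'www.github.com',
  'raw.githubusercontent.com'
]);

function classifySource(url) {
  // new URL() throws a TypeError on malformed input and lowercases the host.
  const { hostname } = new URL(url);
  if (hostname === 'learn.microsoft.com') return 'microsoft-learn';
  if (GITHUB_HOSTS.has(hostname)) return 'github';
  return null; // unsupported source
}

console.log(classifySource('https://github.com/owner/repo'));         // github
console.log(classifySource('https://example.com/?next=github.com'));  // null
```

With the substring check, the second URL would be routed to the GitHub extractor because the string contains 'github.com'; the hostname comparison rejects it.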

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/DynamicEndpoints/documentation-mcp-using-pocketbase'
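
The same request can be sketched in JavaScript. The URL pattern is taken from the curl command above; the helper names `serverUrl` and `fetchServerInfo` are hypothetical, and the sketch assumes the endpoint returns JSON and that a global `fetch` is available (Node.js 18 or later, or a browser). No response fields are assumed.

```javascript
// Base path copied from the curl example above.
const API_BASE = 'https://glama.ai/api/mcp/v1/servers';

// Hypothetical helper: builds the per-server URL from owner and slug.
function serverUrl(owner, slug) {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(slug)}`;
}

// Hypothetical helper: performs the GET request and parses the body,
// assuming the response is JSON.
async function fetchServerInfo(owner, slug) {
  const response = await fetch(serverUrl(owner, slug));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

console.log(serverUrl('DynamicEndpoints', 'documentation-mcp-using-pocketbase'));
```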
