SharePoint MCP Server

by Sofias-ai

Get_Document_Content

Extract content from specific documents in SharePoint by specifying folder and file names, enabling direct interaction with stored data.

Instructions

Get content of a document in SharePoint

Input Schema

Name         Required  Description  Default
file_name    Yes
folder_name  Yes
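
Both parameters are required strings. As a rough sketch, an MCP client would pass them as the `arguments` of a JSON-RPC `tools/call` request. The helper name `build_call_request` and the folder and file values below are illustrative assumptions, not part of the server; a real MCP client library normally builds this message for you.

```python
import json


def build_call_request(folder_name: str, file_name: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` payload for Get_Document_Content.

    Hypothetical helper for illustration only.
    """
    # Both inputs are marked as required in the schema above.
    if not folder_name or not file_name:
        raise ValueError("folder_name and file_name are both required")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "Get_Document_Content",
            "arguments": {"folder_name": folder_name, "file_name": file_name},
        },
    }


# Example values are made up.
request = build_call_request("Reports/2024", "summary.pdf")
print(json.dumps(request, indent=2))
```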

Implementation Reference

  • Core implementation of get_document_content: downloads the file from SharePoint, extracts text for supported formats (PDF, Excel, Word, plain text), and falls back to a base64-encoded payload for binary files or when extraction fails.
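    import base64
    import io

    # sp_context, logger, FILE_TYPES, _get_sp_path and the extract_text_from_*
    # helpers are not part of this excerpt; they are defined elsewhere in the server.
    def get_document_content(folder_name: str, file_name: str) -> dict:
        """Retrieve document content; extracts text from PDF, Excel, Word, and plain-text files"""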
        file_path = _get_sp_path(f"{folder_name}/{file_name}")
        file = sp_context.web.get_file_by_server_relative_url(file_path)
        sp_context.load(file, ["Exists", "Length", "Name"])
        sp_context.execute_query()
        logger.info(f"File exists: {file.exists}, size: {file.length}")
    
        content = io.BytesIO()
        file.download(content)
        sp_context.execute_query()
        content_bytes = content.getvalue()
        
        # Determine file type and process accordingly
        lower_name = file_name.lower()
        file_type = next((t for t, exts in FILE_TYPES.items() if any(lower_name.endswith(ext) for ext in exts)), 'binary')
        
        if file_type == 'pdf':
            try:
                text, pages = extract_text_from_pdf(content_bytes)
                return {"name": file_name, "content_type": "text", "content": text, "original_type": "pdf", "page_count": pages, "size": len(content_bytes)}
            except Exception as e:
                logger.warning(f"PDF processing failed: {e}")
                return {"name": file_name, "content_type": "binary", "content_base64": base64.b64encode(content_bytes).decode(), "original_type": "pdf", "size": len(content_bytes)}
        
        if file_type == 'excel':
            try:
                text, sheets = extract_text_from_excel(content_bytes)
                return {"name": file_name, "content_type": "text", "content": text, "original_type": "excel", "sheet_count": sheets, "size": len(content_bytes)}
            except Exception as e:
                logger.warning(f"Excel processing failed: {e}")
                return {"name": file_name, "content_type": "binary", "content_base64": base64.b64encode(content_bytes).decode(), "original_type": "excel", "size": len(content_bytes)}
        
        if file_type == 'word':
            try:
                text, paragraphs = extract_text_from_word(content_bytes)
                return {"name": file_name, "content_type": "text", "content": text, "original_type": "word", "paragraph_count": paragraphs, "size": len(content_bytes)}
            except Exception as e:
                logger.warning(f"Word processing failed: {e}")
                return {"name": file_name, "content_type": "binary", "content_base64": base64.b64encode(content_bytes).decode(), "original_type": "word", "size": len(content_bytes)}
        
        if file_type == 'text':
            try:
                return {"name": file_name, "content_type": "text", "content": content_bytes.decode('utf-8'), "size": len(content_bytes)}
            except UnicodeDecodeError:
                pass
        
        return {"name": file_name, "content_type": "binary", "content_base64": base64.b64encode(content_bytes).decode(), "size": len(content_bytes)}
  • MCP tool registration using the @mcp.tool decorator. The handler function delegates to resources.get_document_content.
    @mcp.tool(name="Get_Document_Content", description="Get content of a document in SharePoint")
    async def get_document_content_tool(folder_name: str, file_name: str):
        """Get content of a document in SharePoint"""
        return get_document_content(folder_name, file_name)
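
The extension-based type detection and the two result shapes returned above can be sketched in isolation. The `FILE_TYPES` mapping below is an assumed example (the server's actual mapping is not shown in this excerpt), and `detect_file_type` and `read_result` are hypothetical helper names used only for illustration.

```python
import base64

# Assumed mapping for illustration; the server's real FILE_TYPES may differ.
FILE_TYPES = {
    "pdf": [".pdf"],
    "excel": [".xlsx", ".xls"],
    "word": [".docx", ".doc"],
    "text": [".txt", ".csv", ".json", ".md"],
}


def detect_file_type(file_name: str) -> str:
    """Return the first type whose extension matches, else 'binary'."""
    lower_name = file_name.lower()
    return next(
        (t for t, exts in FILE_TYPES.items()
         if any(lower_name.endswith(ext) for ext in exts)),
        "binary",
    )


def read_result(result: dict):
    """Return a str for text results and raw bytes for binary results."""
    if result["content_type"] == "text":
        return result["content"]
    # Binary results carry the payload as base64 text under "content_base64".
    return base64.b64decode(result["content_base64"])


# Hand-built results mirroring the dictionaries returned by the tool.
text_result = {"name": "notes.txt", "content_type": "text", "content": "hello", "size": 5}
binary_result = {
    "name": "logo.png",
    "content_type": "binary",
    "content_base64": base64.b64encode(b"\x89PNG").decode(),
    "size": 4,
}

print(detect_file_type("Report.PDF"), detect_file_type("logo.png"))
print(read_result(text_result), read_result(binary_result))
```

A caller should check `content_type` rather than `original_type`: a PDF, Excel, or Word file whose extraction failed still reports its `original_type` but arrives as a binary result.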
MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Sofias-ai/mcp-sharepoint'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.