https://github.com/jkingsman/qanon-mcp-server

word_cloud_by_post_ids

Analyze word frequency in QAnon posts by ID range to identify common terms and themes for sociological research.

Instructions

Generate a word cloud analysis showing the most common words used in posts within a specified ID range.

Args:
    start_id: Starting post ID
    end_id: Ending post ID
    min_word_length: Minimum length of words to include (default: 3)
    max_words: Maximum number of words to return (default: 100)

Input Schema

Name             Required  Description                          Default
start_id         Yes       Starting post ID                     —
end_id           Yes       Ending post ID                       —
min_word_length  No        Minimum length of words to include   3
max_words        No        Maximum number of words to return    100
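
For illustration, a hypothetical set of argument values matching this schema might look like the following (the values are invented for the example; only start_id and end_id are required):

    arguments = {
        "start_id": 100,       # first post ID in the range (required)
        "end_id": 200,         # last post ID in the range (required)
        "min_word_length": 4,  # optional, defaults to 3
        "max_words": 25,       # optional, defaults to 100
    }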

Implementation Reference

  • The primary handler function for the 'word_cloud_by_post_ids' tool. It selects posts within the specified ID range, extracts their text, generates a word cloud using the helper function, and returns a formatted analysis including date range and post count. The @mcp.tool() decorator registers this function as an MCP tool.
    @mcp.tool()
    def word_cloud_by_post_ids(
        start_id: int, end_id: int, min_word_length: int = 3, max_words: int = 100
    ) -> str:
        """
        Generate a word cloud analysis showing the most common words used in posts within a specified ID range.
    
        Args:
            start_id: Starting post ID
            end_id: Ending post ID
            min_word_length: Minimum length of words to include (default: 3)
            max_words: Maximum number of words to return (default: 100)
        """
        if start_id > end_id:
            return "Error: start_id must be less than or equal to end_id."
    
        # Collect posts within the ID range
        selected_posts = []
        for post in posts:
            post_id = post.get("post_metadata", {}).get("id", 0)
            if start_id <= post_id <= end_id:
                selected_posts.append(post)
    
        if not selected_posts:
            return f"No posts found with IDs between {start_id} and {end_id}."
    
        # Extract post texts
        post_texts = [post.get("text", "") for post in selected_posts]
    
        # Generate word cloud
        cloud = generate_word_cloud(post_texts, min_word_length, max_words)
    
        # Add additional information
        earliest_id = min(
            post.get("post_metadata", {}).get("id", 0) for post in selected_posts
        )
        latest_id = max(
            post.get("post_metadata", {}).get("id", 0) for post in selected_posts
        )
    
        earliest_date = min(
            post.get("post_metadata", {}).get("time", 0) for post in selected_posts
        )
        latest_date = max(
            post.get("post_metadata", {}).get("time", 0) for post in selected_posts
        )
    
        earliest_date_str = (
            datetime.fromtimestamp(earliest_date).strftime("%Y-%m-%d")
            if earliest_date
            else "Unknown"
        )
        latest_date_str = (
            datetime.fromtimestamp(latest_date).strftime("%Y-%m-%d")
            if latest_date
            else "Unknown"
        )
    
        result = f"Word Cloud Analysis for Post IDs {earliest_id} to {latest_id}\n"
        result += f"Date Range: {earliest_date_str} to {latest_date_str}\n"
        result += f"Total Posts Analyzed: {len(selected_posts)}\n\n"
        result += cloud
    
        return result
  • Helper utility function that processes a list of post texts into word-frequency statistics, excluding stopwords and words shorter than the minimum length, and formats the result as a textual word cloud with frequency bars and percentages. A brief usage sketch follows the code below.
    def generate_word_cloud(
        post_texts: List[str], min_word_length: int = 3, max_words: int = 100
    ) -> str:
        """
        Generate a word cloud analysis from a list of post texts.
    
        Args:
            post_texts: List of text content from posts
            min_word_length: Minimum length of words to include (default: 3)
            max_words: Maximum number of words to return (default: 100)
    
        Returns:
            Formatted string with word frequency analysis
        """
        # Common words to exclude (stopwords)
        stopwords = {
            "the",
            "and",
            "a",
            "to",
            "of",
            "in",
            "is",
            "that",
            "for",
            "on",
            "with",
            "as",
            "by",
            "at",
            "from",
            "be",
            "this",
            "was",
            "are",
            "an",
            "it",
            "not",
            "or",
            "have",
            "has",
            "had",
            "but",
            "what",
            "all",
            "were",
            "when",
            "there",
            "can",
            "been",
            "one",
            "do",
            "did",
            "who",
            "you",
            "your",
            "they",
            "their",
            "them",
            "will",
            "would",
            "could",
            "should",
            "which",
            "his",
            "her",
            "she",
            "he",
            "we",
            "our",
            "us",
            "i",
            "me",
            "my",
            "im",
            "ive",
            "myself",
            "its",
            "it's",
            "about",
            "some",
            "then",
            "than",
            "into",
        }
    
        # Combine all texts, replacing literal "\n" escape sequences with spaces
        combined_text = " ".join([text.replace("\\n", " ") for text in post_texts if text])
    
        # Remove URLs
        combined_text = re.sub(r"https?://\S+", "", combined_text)
    
        # Remove special characters and convert to lowercase
        combined_text = re.sub(r"[^\w\s]", " ", combined_text.lower())
    
        # Split into words and count frequencies
        words = combined_text.split()
    
        # Filter out stopwords and short words
        filtered_words = [
            word for word in words if word not in stopwords and len(word) >= min_word_length
        ]
    
        # Count word frequencies
        word_counts = Counter(filtered_words)
    
        # Get the most common words
        most_common = word_counts.most_common(max_words)
    
        # Format the result
        if not most_common:
            return "No significant words found in the selected posts."
    
        total_words = sum(count for _, count in most_common)
    
        result = f"Word Cloud Analysis (top {len(most_common)} words from {total_words} total filtered words):\n\n"
    
        # Calculate the maximum frequency for scaling
        max_freq = most_common[0][1]
    
        # Create a visual representation of word frequencies
        for word, count in most_common:
            # Calculate percentage of total
            percentage = (count / total_words) * 100
            # Scale the bar length
            bar_length = int((count / max_freq) * 30)
            bar = "█" * bar_length
            result += f"{word}: {count} ({percentage:.1f}%) {bar}\n"
    
        return result
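
As a rough usage sketch (not part of the server source), the helper can be exercised directly, and the tool can be called like a plain function once the module-level context the listings rely on is in place (the re, collections.Counter, typing.List, and datetime imports, the FastMCP instance mcp, and the posts list loaded elsewhere in the server):

    # Minimal sketch: feed a couple of sample texts to the helper. The helper
    # strips literal "\n" sequences, URLs, and punctuation before counting words.
    sample_texts = [
        "Example post text one, repeated words appear here.",
        "Example post text two\\nwith repeated words here.",
    ]
    print(generate_word_cloud(sample_texts, min_word_length=4, max_words=10))

    # The tool itself reads the module-level posts list; a hypothetical entry:
    # posts = [{"post_metadata": {"id": 1, "time": 1509683000}, "text": "..."}, ...]
    # print(word_cloud_by_post_ids(start_id=1, end_id=10))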

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jkingsman/qanon-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.