
https://github.com/jkingsman/qanon-mcp-server

word_cloud_by_date_range

Analyze QAnon posts by generating word clouds that visualize common terms used within specific date ranges for research purposes.

Instructions

Generate a word cloud analysis showing the most common words used in posts within a specified date range.

Args:
    start_date: Start date in YYYY-MM-DD format
    end_date: End date in YYYY-MM-DD format
    min_word_length: Minimum length of words to include (default: 3)
    max_words: Maximum number of words to return (default: 100)

Input Schema

Name             Required  Description                          Default
---------------  --------  -----------------------------------  -------
start_date       Yes       Start date in YYYY-MM-DD format      —
end_date         Yes       End date in YYYY-MM-DD format        —
min_word_length  No        Minimum length of words to include   3
max_words        No        Maximum number of words to return    100
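As an illustration (the argument names below come from the schema above; the actual call mechanics depend on your MCP client), a request might carry arguments like:

```python
from datetime import datetime

# Example arguments for the word_cloud_by_date_range tool.
# Only start_date and end_date are required; the others fall
# back to the schema defaults (3 and 100).
arguments = {
    "start_date": "2018-01-01",
    "end_date": "2018-01-31",
    "min_word_length": 4,
    "max_words": 50,
}

# The server validates dates with strptime, so malformed input
# (e.g. "01/01/2018") is rejected before any posts are scanned.
for key in ("start_date", "end_date"):
    datetime.strptime(arguments[key], "%Y-%m-%d")
```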

Implementation Reference

  • The main handler for the 'word_cloud_by_date_range' tool. It filters posts to the requested date range, extracts their text, calls the generate_word_cloud helper, and formats the output. Registered via the @mcp.tool() decorator.
    @mcp.tool()
    def word_cloud_by_date_range(
        start_date: str, end_date: str, min_word_length: int = 3, max_words: int = 100
    ) -> str:
        """
        Generate a word cloud analysis showing the most common words used in posts within a specified date range.
    
        Args:
            start_date: Start date in YYYY-MM-DD format
            end_date: End date in YYYY-MM-DD format
            min_word_length: Minimum length of words to include (default: 3)
            max_words: Maximum number of words to return (default: 100)
        """
        try:
            # Validate date format
            start_timestamp = int(datetime.strptime(start_date, "%Y-%m-%d").timestamp())
            end_timestamp = (
                int(datetime.strptime(end_date, "%Y-%m-%d").timestamp()) + 86400
            )  # Add a day in seconds
        except ValueError:
            return "Invalid date format. Please use YYYY-MM-DD format."
    
        # Collect posts within the date range
        selected_posts = []
        for post in posts:
            post_time = post.get("post_metadata", {}).get("time", 0)
            if start_timestamp <= post_time <= end_timestamp:
                selected_posts.append(post)
    
        if not selected_posts:
            return f"No posts found between {start_date} and {end_date}."
    
        # Extract post texts
        post_texts = [post.get("text", "") for post in selected_posts]
    
        # Generate word cloud
        cloud = generate_word_cloud(post_texts, min_word_length, max_words)
    
        # Get post ID range
        earliest_id = min(
            post.get("post_metadata", {}).get("id", 0) for post in selected_posts
        )
        latest_id = max(
            post.get("post_metadata", {}).get("id", 0) for post in selected_posts
        )
    
        result = f"Word Cloud Analysis for Date Range: {start_date} to {end_date}\n"
        result += f"Post ID Range: {earliest_id} to {latest_id}\n"
        result += f"Total Posts Analyzed: {len(selected_posts)}\n\n"
        result += cloud
    
        return result
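Note that the handler builds its window from naive datetimes, so the exact epoch values depend on the server's local timezone; what is stable is that the window spans the full end day. A minimal sketch of the computation (date_window is a hypothetical helper mirroring the handler's logic):

```python
from datetime import datetime

def date_window(start_date: str, end_date: str) -> tuple:
    """Return (start, end) epoch seconds for an inclusive date range.

    Mirrors the handler: midnight (local time) on start_date through
    midnight on the day after end_date, so every post timestamped on
    end_date itself falls inside the window.
    """
    start = int(datetime.strptime(start_date, "%Y-%m-%d").timestamp())
    end = int(datetime.strptime(end_date, "%Y-%m-%d").timestamp()) + 86400
    return start, end

# A single-day range covers exactly 86400 seconds.
start, end = date_window("2018-01-01", "2018-01-01")
```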
  • Core helper function that processes post texts to generate word frequency counts, filters stopwords, and creates a visual word cloud representation. Called by the main handler.
    def generate_word_cloud(
        post_texts: List[str], min_word_length: int = 3, max_words: int = 100
    ) -> str:
        """
        Generate a word cloud analysis from a list of post texts.
    
        Args:
            post_texts: List of text content from posts
            min_word_length: Minimum length of words to include (default: 3)
            max_words: Maximum number of words to return (default: 100)
    
        Returns:
            Formatted string with word frequency analysis
        """
        # Common words to exclude (stopwords)
        stopwords = {
            "the", "and", "a", "to", "of", "in", "is", "that", "for", "on",
            "with", "as", "by", "at", "from", "be", "this", "was", "are",
            "an", "it", "not", "or", "have", "has", "had", "but", "what",
            "all", "were", "when", "there", "can", "been", "one", "do",
            "did", "who", "you", "your", "they", "their", "them", "will",
            "would", "could", "should", "which", "his", "her", "she", "he",
            "we", "our", "us", "i", "me", "my", "im", "ive", "myself",
            "its", "it's", "about", "some", "then", "than", "into",
        }
    
        # Combine all texts, replacing literal "\n" sequences with spaces
        combined_text = " ".join([text.replace("\\n", " ") for text in post_texts if text])
    
        # Remove URLs
        combined_text = re.sub(r"https?://\S+", "", combined_text)
    
        # Remove special characters and convert to lowercase
        combined_text = re.sub(r"[^\w\s]", " ", combined_text.lower())
    
        # Split into words and count frequencies
        words = combined_text.split()
    
        # Filter out stopwords and short words
        filtered_words = [
            word for word in words if word not in stopwords and len(word) >= min_word_length
        ]
    
        # Count word frequencies
        word_counts = Counter(filtered_words)
    
        # Get the most common words
        most_common = word_counts.most_common(max_words)
    
        # Format the result
        if not most_common:
            return "No significant words found in the selected posts."
    
        total_words = sum(count for _, count in most_common)
    
        result = f"Word Cloud Analysis (top {len(most_common)} words from {total_words} total filtered words):\n\n"
    
        # Calculate the maximum frequency for scaling
        max_freq = most_common[0][1]
    
        # Create a visual representation of word frequencies
        for word, count in most_common:
            # Calculate percentage of total
            percentage = (count / total_words) * 100
            # Scale the bar length
            bar_length = int((count / max_freq) * 30)
            bar = "█" * bar_length
            result += f"{word}: {count} ({percentage:.1f}%) {bar}\n"
    
        return result
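The helper's core pipeline (strip URLs, lowercase, drop punctuation, filter stopwords, count) reduces to a few lines. The sketch below uses a trimmed stopword set for brevity; top_words is a hypothetical condensed version, not the server's function:

```python
import re
from collections import Counter

STOPWORDS = {"the", "and", "a", "to", "of", "in", "is", "that"}  # trimmed set

def top_words(texts, min_len=3, max_words=100):
    """Return (word, count) pairs, most frequent first."""
    combined = " ".join(t.replace("\\n", " ") for t in texts if t)
    combined = re.sub(r"https?://\S+", "", combined)      # drop URLs
    combined = re.sub(r"[^\w\s]", " ", combined.lower())  # drop punctuation
    words = [w for w in combined.split()
             if w not in STOPWORDS and len(w) >= min_len]
    return Counter(words).most_common(max_words)

top = top_words(["The storm is coming https://example.com", "The STORM arrives"])
# → [("storm", 2), ("coming", 1), ("arrives", 1)]
```

The bar rendering in the full helper then scales each count against the top frequency, so the most common word always gets a 30-character bar.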
  • Standalone helper that filters posts by date range. Note that the handler above duplicates this logic inline rather than calling it.
    def get_posts_by_date_range(start_date: str, end_date: str) -> List[Dict]:
        """Get posts within a date range (YYYY-MM-DD format)."""
        try:
            start_timestamp = int(datetime.strptime(start_date, "%Y-%m-%d").timestamp())
            end_timestamp = (
                int(datetime.strptime(end_date, "%Y-%m-%d").timestamp()) + 86400
            )  # Add a day in seconds
    
            results = []
            for post in posts:
                post_time = post.get("post_metadata", {}).get("time", 0)
                if start_timestamp <= post_time <= end_timestamp:
                    results.append(post)
            return results
        except ValueError:
            return []
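A usage sketch of the same filter with hypothetical sample data (the real `posts` list is loaded from the Q post archive, but entries share this `post_metadata` shape):

```python
from datetime import datetime

# Hypothetical posts in the archive's shape.
posts = [
    {"post_metadata": {"id": 1, "time": int(datetime(2018, 1, 2).timestamp())}},
    {"post_metadata": {"id": 2, "time": int(datetime(2018, 2, 2).timestamp())}},
]

start = int(datetime.strptime("2018-01-01", "%Y-%m-%d").timestamp())
end = int(datetime.strptime("2018-01-31", "%Y-%m-%d").timestamp()) + 86400

january = [p for p in posts
           if start <= p["post_metadata"].get("time", 0) <= end]
# Only post id 1 falls inside January 2018.
```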
